Hi,
I'm working on a plant genome and I'm trying to assemble it with NECAT, but the final assembly I obtain is not at all consistent with the expected genome size.
The expected genome size is 1.2 Gbp and I'm working with Oxford Nanopore reads. The starting data for the assembly are reported in the following table:
| Metric | Value |
| --- | --- |
| Number of reads | 1,341,399 |
| Number of bases (bp) | 33,136,270,559 |
| Average read length (bp) | 24,703 |
| Reads N50 (bp) | 40,677 |
| Expected fold-coverage | 28x |
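As a quick sanity check on the table above, here is a minimal Python sketch that recomputes the average read length and the expected fold-coverage from the read and base counts. The variable names are my own, and the genome size is the expected 1.2 Gbp estimate, not a measured value.

```python
# Values copied from the read statistics table.
num_reads = 1_341_399
total_bases = 33_136_270_559

# Assumption: expected genome size of 1.2 Gbp, as stated in the post.
genome_size = 1_200_000_000

mean_read_length = total_bases / num_reads
fold_coverage = total_bases / genome_size

print(f"Average read length: {mean_read_length:,.0f} bp")
print(f"Expected fold-coverage: {fold_coverage:.1f}x")
```

Both values agree with the table once rounded (the coverage comes out slightly below 28x).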
The obtained results are the following:
| | NECAT v.0.0.1 |
| --- | --- |
| Total assembly size (bp) | 604,869 |
| Num. contigs | 12 |
| Contigs average length (bp) | 50,406 |
| N50 (bp) | 153,041 |
| N90 (bp) | 17,942 |
| Longest contig (bp) | 154,607 |
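To put these numbers in perspective, the following rough Python sketch compares the total assembly size with the expected genome size. It only uses figures already given in this post; the variable names are illustrative.

```python
# Values copied from the assembly statistics table.
assembly_size = 604_869
num_contigs = 12

# Assumption: expected genome size of 1.2 Gbp, as stated in the post.
genome_size = 1_200_000_000

mean_contig_length = assembly_size / num_contigs
assembled_fraction = assembly_size / genome_size

print(f"Average contig length: {mean_contig_length:,.0f} bp")
print(f"Fraction of expected genome assembled: {assembled_fraction:.4%}")
```

In other words, the assembly covers well under 0.1% of the expected genome.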
The command I ran was
/opt/NECAT/Linux-amd64/bin/necat.pl assemble config.txt
and the config file was set up as follows:
PROJECT=Plant_genome
ONT_READ_LIST=read_list.txt
GENOME_SIZE=1200000000
THREADS=15
MIN_READ_LENGTH=3000
PREP_OUTPUT_COVERAGE=28
OVLP_FAST_OPTIONS=-n 500 -z 20 -b 2000 -e 0.5 -j 0 -u 1 -a 1000
OVLP_SENSITIVE_OPTIONS=-n 500 -z 10 -e 0.5 -j 0 -u 1 -a 1000
CNS_FAST_OPTIONS=-a 2000 -x 4 -y 12 -l 1000 -e 0.5 -p 0.8 -u 0
CNS_SENSITIVE_OPTIONS=-a 2000 -x 4 -y 12 -l 1000 -e 0.5 -p 0.8 -u 0
TRIM_OVLP_OPTIONS=-n 100 -z 10 -b 2000 -e 0.5 -j 1 -u 1 -a 400
ASM_OVLP_OPTIONS=-n 100 -z 10 -b 2000 -e 0.5 -j 1 -u 0 -a 400
NUM_ITER=1
CNS_OUTPUT_COVERAGE=28
CLEANUP=1
USE_GRID=true
GRID_NODE=8
GRID_OPTIONS=
SMALL_MEMORY=0
FSA_OL_FILTER_OPTIONS=
FSA_ASSEMBLE_OPTIONS=
FSA_CTG_BRIDGE_OPTIONS=
POLISH_CONTIGS=true
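For anyone who wants to cross-check the settings above against the read statistics, here is a small Python sketch that parses `KEY=VALUE` lines like the ones in my config and compares the requested coverage values with the coverage actually available. `parse_config` is a hypothetical helper written for this post, not part of NECAT, only a subset of the config is embedded for brevity, and the sketch makes no claim about how NECAT itself interprets these parameters.

```python
def parse_config(text):
    """Parse KEY=VALUE lines into a dict, skipping blank or malformed lines."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        # Split on the first '=' only, so option strings stay intact.
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


# Subset of the config shown above.
CONFIG_TEXT = """\
PROJECT=Plant_genome
GENOME_SIZE=1200000000
THREADS=15
MIN_READ_LENGTH=3000
PREP_OUTPUT_COVERAGE=28
OVLP_FAST_OPTIONS=-n 500 -z 20 -b 2000 -e 0.5 -j 0 -u 1 -a 1000
CNS_OUTPUT_COVERAGE=28
GRID_OPTIONS=
"""

cfg = parse_config(CONFIG_TEXT)

# Total bases from the read statistics table.
total_bases = 33_136_270_559
available_coverage = total_bases / int(cfg["GENOME_SIZE"])

for key in ("PREP_OUTPUT_COVERAGE", "CNS_OUTPUT_COVERAGE"):
    print(f"{key}={cfg[key]} (available before length filtering: {available_coverage:.1f}x)")
```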
I would like to understand why the resulting assembly is so poor and how I can improve it. Could the parameters I used be inadequate for this dataset?