Issue assembling plant genome with NECAT

Hi,
I'm working on a plant genome and trying to assemble it with NECAT, but the final assembly I obtain is completely inconsistent with the expected genome size.
The expected genome size is 1.2 Gbp and I'm working with Oxford Nanopore reads. The input data for the assembly are summarized in the following table:

Metric | Value
-- | --
Number of reads | 1,341,399
Number of bases (bp) | 33,136,270,559
Average read length (bp) | 24,703
Reads N50 (bp) | 40,677
Expected fold-coverage | 28x
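
As a quick sanity check on the table above, here is a minimal Python sketch that recomputes the fold-coverage and the average read length from the raw counts. The variable names are only illustrative; the numbers are copied from the table and from the expected genome size of 1.2 Gbp.

```python
# Values copied from the read statistics table.
total_bases = 33_136_270_559
num_reads = 1_341_399
genome_size = 1_200_000_000  # expected genome size (1.2 Gbp)

# Fold-coverage is total sequenced bases divided by genome size.
coverage = total_bases / genome_size

# Average read length is total bases divided by the number of reads.
mean_read_length = total_bases / num_reads

print(f"Fold-coverage: {coverage:.1f}x")
print(f"Average read length: {mean_read_length:,.0f} bp")
```

Both values agree with the table once rounded, so the input statistics look internally consistent.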

The results I obtained are the following:


Metric | NECAT v.0.0.1
-- | --
Total assembly size (bp) | 604,869
Num. Contigs | 12
Contigs average length (bp) | 50,406
N50 (bp) | 153,041
N90 (bp) | 17,942
Longest contig (bp) | 154,607
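
To put the result in perspective, the following small Python sketch computes which fraction of the expected genome ended up in the assembly. The variable names are illustrative; the numbers come from the results table and the expected genome size.

```python
# Values copied from the assembly results table.
assembly_size = 604_869
num_contigs = 12
genome_size = 1_200_000_000  # expected genome size (1.2 Gbp)

# Fraction of the expected genome that is represented in the contigs.
assembled_fraction = assembly_size / genome_size

# Cross-check of the reported average contig length.
mean_contig_length = assembly_size / num_contigs

print(f"Assembled fraction of expected genome: {assembled_fraction:.4%}")
print(f"Average contig length: {mean_contig_length:,.0f} bp")
```

The assembly covers well under 0.1% of the expected 1.2 Gbp, so nearly all of the genome is missing from the output.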

The command I ran was
`/opt/NECAT/Linux-amd64/bin/necat.pl assemble config.txt`
and the config file was set up as follows:

> PROJECT=Plant_genome
ONT_READ_LIST=read_list.txt
GENOME_SIZE=1200000000
THREADS=15
MIN_READ_LENGTH=3000
PREP_OUTPUT_COVERAGE=28
OVLP_FAST_OPTIONS=-n 500 -z 20 -b 2000 -e 0.5 -j 0 -u 1 -a 1000
OVLP_SENSITIVE_OPTIONS=-n 500 -z 10 -e 0.5 -j 0 -u 1 -a 1000
CNS_FAST_OPTIONS=-a 2000 -x 4 -y 12 -l 1000 -e 0.5 -p 0.8 -u 0
CNS_SENSITIVE_OPTIONS=-a 2000 -x 4 -y 12 -l 1000 -e 0.5 -p 0.8 -u 0
TRIM_OVLP_OPTIONS=-n 100 -z 10 -b 2000 -e 0.5 -j 1 -u 1 -a 400
ASM_OVLP_OPTIONS=-n 100 -z 10 -b 2000 -e 0.5 -j 1 -u 0 -a 400
NUM_ITER=1
CNS_OUTPUT_COVERAGE=28
CLEANUP=1
USE_GRID=true
GRID_NODE=8
GRID_OPTIONS=
SMALL_MEMORY=0
FSA_OL_FILTER_OPTIONS=
FSA_ASSEMBLE_OPTIONS=
FSA_CTG_BRIDGE_OPTIONS=
POLISH_CONTIGS=true
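
To rule out a simple input problem before tuning parameters, a minimal Python sketch like the one below could parse the `KEY=VALUE` config and check the read list. It is not part of NECAT: `parse_config` and `missing_read_files` are hypothetical helpers, and the sketch assumes that `ONT_READ_LIST` points to a text file with one read-file path per line. Only a few lines of the config are embedded, to keep the example short.

```python
from pathlib import Path


def parse_config(text):
    """Parse KEY=VALUE lines into a dict, splitting on the first '=' only."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        config[key.strip()] = value.strip()
    return config


def missing_read_files(read_list_path):
    """Return the paths in the read list that do not exist on disk.

    Assumes one path per line (an assumption about the read list format).
    """
    lines = Path(read_list_path).read_text().splitlines()
    paths = [line.strip() for line in lines if line.strip()]
    return [p for p in paths if not Path(p).is_file()]


# A few lines taken from the config above.
example = """\
PROJECT=Plant_genome
ONT_READ_LIST=read_list.txt
GENOME_SIZE=1200000000
PREP_OUTPUT_COVERAGE=28
OVLP_FAST_OPTIONS=-n 500 -z 20 -b 2000 -e 0.5 -j 0 -u 1 -a 1000
GRID_OPTIONS=
"""

config = parse_config(example)

# Bases implied by the requested coverage, to compare with the bases available.
requested_bases = int(config["GENOME_SIZE"]) * int(config["PREP_OUTPUT_COVERAGE"])
print(f"Bases implied by PREP_OUTPUT_COVERAGE: {requested_bases:,}")

read_list = Path(config["ONT_READ_LIST"])
if read_list.is_file():
    print("Missing read files:", missing_read_files(read_list))
else:
    print(f"{read_list} not found in the current directory")
```

The implied base count can then be compared with the 33,136,270,559 bases actually available, and any path reported as missing would mean that some reads never reached the assembler.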

I would like to understand why the assembly is so poor and how I can improve it. Could the parameters I used be inadequate for this dataset?
