QUAST QC step fails when running de novo assembly without reference genome #178

@wbau

Description of the bug

Bug Summary

When running genomeassembler with --use_ref false for de novo assembly, the QUAST QC process is silently skipped due to a channel joining issue in the workflow logic. This also causes the final REPORT step to fail.

Current Behavior

The QUAST module itself (modules/local/quast/main.nf) correctly handles no-reference mode through conditional parameters:

def features = use_gff ? "--features ${gff}" : ''
def reference = use_fasta ? "-r ${fasta}" : ''
def reference_bam = params.use_ref ? "--ref-bam ${ref_bam}" : ''

However, the workflow in subworkflows/local/qc/quast/main.nf always attempts to join the inputs_references (containing ref_fasta and ref_gff) and aln_to_ref channels, even when they are empty/null:

inputs
    .map { row -> [row.meta, row.ref_fasta, row.ref_gff] }
    .set { inputs_references }

assembly
    .join(inputs_references)     // <- This fails when ref_fasta/ref_gff are null
    .join(aln_to_ref)            // <- This is also empty when use_ref=false
    .join(aln_to_assembly)
    .set { quast_in }

Because join only emits items whose key is present in every joined channel, an empty aln_to_ref leaves quast_in empty, and QUAST never executes.
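
The empty-join behavior can be reproduced outside the pipeline. The following is a minimal sketch, not the pipeline's code: the sample ID and file names are made up, and it assumes that aln_to_ref is an empty channel when --use_ref false is set, as described in this report.

```nextflow
// Standalone sketch: channel names mirror the subworkflow, values are hypothetical.
workflow {
    assembly        = Channel.of([[id: 'sample1'], 'assembly.fasta'])
    aln_to_ref      = Channel.empty()   // assumption: nothing is emitted without a reference
    aln_to_assembly = Channel.of([[id: 'sample1'], 'reads_to_assembly.bam'])

    // join is an inner join on the first tuple element (meta).
    // aln_to_ref has no matching key, so this chain emits nothing.
    assembly
        .join(aln_to_ref)
        .join(aln_to_assembly)
        .view { row -> "with aln_to_ref: ${row}" }

    // Without the empty channel, the sample passes through the join.
    assembly
        .join(aln_to_assembly)
        .view { row -> "without aln_to_ref: ${row}" }
}
```

Running this with nextflow run should print only a line from the second chain, which matches the symptom here: no error is raised and the downstream process receives no input.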

Expected Behavior

QUAST should generate assembly statistics even without a reference genome, as the module already supports this functionality.

Workaround / Example Solution

Add conditional logic to handle de novo mode in subworkflows/local/qc/quast/main.nf:

if (params.quast) {
    if (params.use_ref) {
        // Original behavior with reference
        inputs
            .map { row -> [row.meta, row.ref_fasta, row.ref_gff] }
            .set { inputs_references }

        assembly
            .join(inputs_references)
            .join(aln_to_ref)
            .join(aln_to_assembly)
            .set { quast_in }
    }
    else {
        // De novo mode: skip reference channels
        assembly
            .join(aln_to_assembly)
            .map { meta, consensus, bam ->
                [meta, consensus, [], [], [], bam]
            }
            .set { quast_in }
    }
    
    QUAST(quast_in, params.use_ref, false)
    QUAST.out.results.set { quast_results }
    QUAST.out.tsv.set { quast_tsv }
    QUAST.out.versions.set { versions }
}

Testing

I've tested this fix with an ONT + Illumina hybrid assembly, using --use_ref false and --lift_annotations false. QUAST now runs successfully and generates assembly statistics.

Command used and terminal output

nextflow run nf-core/genomeassembler \
  --input samplesheet.csv \
  --ont \
  --short_reads \
  --assembler flye \
  --use_ref false \
  --lift_annotations false \
  --genome_size 20000000 \
  --quast \
  ...

Relevant files

No response

System information

No response

Labels

bug
