Skip to content

Conversation

@ammaraziz
Copy link

@ammaraziz ammaraziz commented Dec 10, 2025

Implement nextclade on a per organism basis (not per sample). One issue with this PR is testing, should subworkflows require separate testing?

Please let me know what you think and if you suggest any changes.

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • Make sure your code lints (nf-core pipelines lint).
    WARNINGS are produced for many of the pipeline files - For the new subworkflow warning is related to outputting a version, which does happen.
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
    Haven't added anything to change log - it's not up to date?
  • README.md is updated (including new tool citations and authors/contributors).

@ammaraziz ammaraziz changed the title Dev nextclade Nextclade Subworkflow Dec 10, 2025
Copy link
Member

@BioWilko BioWilko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like a good start for nextclade support but I think it needs to be integrated into the pipeline a bit better before we can merge, specifically I think nextclade results need to be nicely included in the run / sample reports when run.

The docs also need to make it very clear that the nextclade functionality doesn't support multi virus runs / multi pathogen schemes

- Reports generated by the pipeline: `pipeline_report.html`, `pipeline_report.txt` and `software_versions.yml`. The `pipeline_report*` files will only be present if the `--email` / `--email_on_fail` parameter's are used when running the pipeline.
- Reformatted samplesheet files used as input to the pipeline: `samplesheet.valid.csv`.
- Parameters used by the pipeline run: `params.json`.
- Clade/Lineages are output as a tsv: `lineages.tsv`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should have its own section with more details

@ammaraziz
Copy link
Author

Thanks Sam for the feedback, apologies for some of those changes. It was a bad merge on my end.

I'll work on making the changes.

ammaraziz and others added 3 commits December 11, 2025 10:28
Revamp nextclade subworkflow, changes

    HTML output
    Better handling of optional inputs for GENERATE_RUN_REPORT
    Moved nextclade into amplicon-nf
    Capture nextclade
    Removed up unused params



* chore: format imports correctly

* chore: remove channel for nextclade tag

* chore: rename module

* chore: rename subworkflow

* add lineage tab into the sample run report

* chore: first attempt implement html output for nextclade

* chore: add nextclade table to template

* chore: add nextclade for file renaming

* chore: rename nextclade tags to be simple

* chore: simplify nextclade data parsing into payload

* chore: WORK IN PROGRESS - adding optional nextclade channel

* chore: move nextclade workflow into amplicon-nf.nf

* add nextclade tsv to inputs

* chore: simplify code

* chore: first working version of new nextclade workflow

* chore: added NO_FILE to handle optional input

* feat: nextclade html output

* chore: simplify run report code

* chore: remove unused param

* chore: remove nextclade settings; tsv output not produced

* chore: remove fasta_header_extra param

* chore: output software version

* chore: capture nextclade version

---------

Co-authored-by: Josh Zhang <zjs0317@gmail.com>
@ammaraziz
Copy link
Author

I've pushed a bunch of code and improvements. I haven't updated the documentation, something I'm going to ping you on slack to discuss where I should add more information.

from jinja2 import Environment, FileSystemLoader, select_autoescape
from primalbedtools.amplicons import create_amplicons
from primalbedtools.bedfiles import BedLine
from primalbedtools.scheme import Scheme
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately I had linter (ruff) on when working on this python script, so it moved a bunch of the imports around. It also changed some quotations from double to singular. Hope that's okay?

[meta, bed, depth_tsvs, amp_depth_tsvs, coverage_tsvs, [], samplesheet_csv]
}
}
// massage optional inputs
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I rewrote this code to to take in more inputs without an if/else statement to accommodate the optional nextclade input. However, I ran into a problem where I just couldn't get the channel to be optionally taken into the GENERATE_RUN_REPORT process. I solution I ended up with was the assets/NO_FILE trick documented here.

While the code doesn't handle nextclade input, it still better than the if else statement. I can revert if you prefer the other version.

}
// massage optional inputs
ch_msas_opt = params.primer_mismatch_plot ? ch_msas_by_scheme : channel.empty()
ch_nextclade_opt = params.nextclade ? ch_nextclade_tsv : channel.fromPath("$projectDir/assets/NO_FILE")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ch_nextclade_opt = params.nextclade ? ch_nextclade_tsv : channel.fromPath("$projectDir/assets/NO_FILE")
ch_nextclade_opt = params.nextclade ? ch_nextclade_tsv : channel.empty()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've seen the empty file trick before and I dislike it, seems hacky, there's ways to handle it inside nextflow.

depth,
amp,
cov,
msas ?: [],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pack the nextclade TSV in here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants