Conversation
```json
"errorMessage": "PacBio Index file for BAM subreads cannot contain spaces and must have extension '.bam.pbi' or be empty"
},
"reads": {
    "start_from": {
```
I think we can find a better name for this field
I have been scratching my head over naming this field.
I used 'entrypoint', but I didn't feel it was very clear for users.
'start_from' might not be the best name either, but at least its meaning is clear.
Do you have any suggestions?
I added a subworkflow to chunk the fasta files (from lima, isoseq refine, and the mapping start) before the mapping steps.
A new update of the CHUNKER applies it twice in the pipeline.
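A minimal sketch of what such a chunking subworkflow could look like, using Nextflow's built-in `splitFasta` operator. The subworkflow name, channel shape (`[meta, fasta]` tuples), and the `params.chunk_size` parameter are assumptions for illustration, not the PR's actual implementation:

```nextflow
// Hypothetical chunking subworkflow: split each fasta into fixed-size
// chunks so the downstream mapping step can run on them in parallel.
workflow CHUNK_FASTA {
    take:
    ch_fasta            // channel: [ meta, fasta ] (assumed shape)

    main:
    ch_chunks = ch_fasta
        // elem: 1 tells the splitter to operate on the fasta file,
        // the second element of the tuple; chunk size is a guess.
        .splitFasta(by: params.chunk_size ?: 1000, file: true, elem: 1)

    emit:
    chunks = ch_chunks  // channel: [ meta, fasta_chunk ]
}
```

Calling it twice in the pipeline, as the PR describes, would then just mean invoking `CHUNK_FASTA` on the channel feeding each of the two mapping steps.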
PR checklist
- [ ] Code lints (`nf-core lint`).
- [ ] Test suite passes (`nextflow run . -profile test,docker --outdir <OUTDIR>`).
- [ ] Debug profile runs (`nextflow run . -profile debug,test,docker --outdir <OUTDIR>`).
- [ ] `docs/usage.md` is updated.
- [ ] `docs/output.md` is updated.
- [ ] `CHANGELOG.md` is updated.
- [ ] `README.md` is updated (including new tool citations and authors/contributors).

IsoSeq providers deliver sequences in many different formats depending on the pre-processing they apply (Subreads, CCS, Full-Length isoseq). This is even more true with the new MAS-seq.
I had implemented the ability to deal with these formats through options. However, these options, combined with the ability to skip IsoSeq processing and alignment, made the samplesheet and the usage of the pipeline complex.
In this PR, I changed the way input sequences are injected into the pipeline. It is now possible to start the analysis from ccs, lima, isoseq refine, or the mapping step. The different types of inputs can even be mixed in the samplesheet.
This modification simplifies the usage as well as the code.
It is no longer necessary to deal with the different entrypoints separately: the input files are injected at the right moment into the main channel paths.
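One way to picture this injection mechanism is with Nextflow's `branch` operator. This is a hedged sketch, not the PR's code: the channel names, the `meta.start_from` field, and the stage labels (`ccs`, `lima`, `refine`, `map`) are assumptions based on the description above:

```nextflow
// Hypothetical routing of samplesheet rows by their starting stage.
// Each row is assumed to be a [meta, reads] tuple where meta.start_from
// names the stage the sample should enter the pipeline at.
ch_input
    .branch { meta, reads ->
        ccs:    meta.start_from == 'ccs'
        lima:   meta.start_from == 'lima'
        refine: meta.start_from == 'refine'
        map:    meta.start_from == 'map'
    }
    .set { ch_stages }

// Each branch is then mixed into the main channel just before the
// matching step, e.g. (hypothetically):
//   LIMA( CCS.out.bam.mix(ch_stages.lima) )
```

Because every sample enters the same main channel path at its own stage, the downstream processes never need to know which entrypoint a sample came from.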