Support PRIDE / SDRF input #445

Merged
jonasscheid merged 9 commits into nf-core:dev from jonasscheid:feature/sdrf-pride-input
Apr 22, 2026
@jonasscheid
Collaborator

@jonasscheid jonasscheid commented Apr 12, 2026

PR to fix #277

Key implementations:

  • Provide only a PRIDE ID (if an SDRF is present in the PRIDE archive), or provide an SDRF containing paths to the PRIDE archive.
  • Automatically fetch data via pridepy
  • Convert the SDRF to an mhcquant samplesheet plus a new search preset TSV via the sdrf-pipelines export function
  • Seamlessly plug data and metadata into the existing channels

Steps happen during the PIPELINE_INITIALISATION subworkflow.
Tested successfully on PXD009752 (2 raw files), and PXD004894 (140 raw files)
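
For readers unfamiliar with the two input modes, here is a minimal Python sketch of the idea. It is not the PR's implementation: the actual conversion is delegated to the sdrf-pipelines export function and the download to pridepy. The function names `classify_input` and `sdrf_to_rows`, the output column names, and the column mapping are all hypothetical and chosen for illustration only. The sketch assumes the SDRF carries the usual `source name`, `comment[data file]` and `comment[file uri]` columns.

```python
import csv
import io
import re

# ProteomeXchange/PRIDE dataset accessions look like "PXD" plus six digits.
PXD_PATTERN = re.compile(r"^PXD\d{6}$")


def classify_input(value: str) -> str:
    """Return 'pride_id' for a bare accession, 'sdrf' for a TSV file path.

    Hypothetical helper: the pipeline's own parameter handling may differ.
    """
    value = value.strip()
    if PXD_PATTERN.match(value):
        return "pride_id"
    if value.lower().endswith(".tsv"):
        return "sdrf"
    raise ValueError(f"Unrecognised input: {value!r}")


def sdrf_to_rows(sdrf_text: str) -> list[dict]:
    """Map SDRF rows to samplesheet-like rows (illustrative mapping only)."""
    reader = csv.DictReader(io.StringIO(sdrf_text), delimiter="\t")
    rows = []
    for i, rec in enumerate(reader, start=1):
        rows.append({
            "ID": i,
            "Sample": rec["source name"],
            # Prefer the remote URI; fall back to the plain file name.
            "ReplicateFileName": rec.get("comment[file uri]") or rec["comment[data file]"],
        })
    return rows
```

A bare accession such as `PXD009752` would be classified as `pride_id`, while a path ending in `.sdrf.tsv` would be classified as `sdrf` and could then be passed to `sdrf_to_rows`.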

Edit: WIP figuring out which Comet resource limits fit best for CI runners for the sdrf test
Edit edit: New test runs through, but no space on device. Will shrink sdrf sheet to only 1 file instead of 2 after review

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • If necessary, also make a PR on the nf-core/mhcquant branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core pipelines lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

@nf-core-bot
Member

Warning

Newer version of the nf-core template is available.

Your pipeline is using an old version of the nf-core template: 3.5.1.
Please update your pipeline to the latest version.

For more documentation on how to update your pipeline, please see the nf-core documentation and Synchronisation documentation.

@ypriverol
Member

Have you tested pridepy with Aspera in production? How does it work?

@jonasscheid
Collaborator Author

jonasscheid commented Apr 20, 2026

@ypriverol I did (on my VM (IPv6)) for 140 raw files:

| Protocol | Wall time (140 files, 5-way parallel) | Avg per file | Effective throughput |
| --- | --- | --- | --- |
| FTP | 58 min (3498 s) | 122 s | ~50 MB/s aggregate |
| Aspera | 146 min (8778 s) | 308 s | ~20 MB/s aggregate |
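
As a quick sanity check on the figures reported above, the short Python sketch below multiplies each wall time by its aggregate throughput. Both protocols should imply roughly the same transferred volume, since the same 140 files were fetched. The throughput values are the approximate ones quoted in the comment, and 1 GB is assumed to be 1000 MB.

```python
# Figures copied from the comment above: (wall time in seconds, aggregate MB/s).
runs = {"FTP": (3498, 50), "Aspera": (8778, 20)}


def implied_volume_gb(wall_s: float, mb_per_s: float) -> float:
    """Total volume implied by wall time and aggregate throughput (1 GB = 1000 MB)."""
    return wall_s * mb_per_s / 1000


volumes = {name: implied_volume_gb(*figures) for name, figures in runs.items()}
slowdown = runs["Aspera"][0] / runs["FTP"][0]

for name, gb in volumes.items():
    print(f"{name}: ~{gb:.0f} GB implied")
print(f"Aspera took {slowdown:.1f}x as long as FTP")
```

Both rows come out at about 175 GB, so the wall times and throughputs are mutually consistent, and Aspera took about 2.5 times as long as FTP in this particular setup.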

@jonasscheid jonasscheid requested a review from ypriverol April 20, 2026 12:36
Contributor

@d4straub d4straub left a comment

Looks good.
Nothing serious, but I dislike the resource allocations in conf/modules.config.
And usage.md might benefit from input documentation.
About "Edit edit: New test runs through, but no space on device. Will shrink sdrf sheet to only 1 file instead of 2 after review": AFAIK it's possible to increase the size of the runner disk so that this won't happen (I don't remember the details, somewhere in the nf-core Slack I think), but if you can adjust the test files, that's preferable of course.

I approve and think you will do what's necessary, if anything.

Comment thread conf/modules.config Outdated
Comment thread conf/modules.config Outdated
Comment thread subworkflows/local/sdrf_to_samplesheet/main.nf
Comment thread tests/sdrf.nf.test.snap Outdated
Comment thread nextflow_schema.json
…put modes

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@jonasscheid
Collaborator Author

Thanks @d4straub!

> AFAIK it's possible to increase the size of the runner disk so that this won't happen (I don't remember the details, somewhere in the nf-core Slack I think), but if you can adjust the test files, that's preferable of course.

I think that's when nf-core comes into p(l)ay :D. Not sure if that would be needed in this setting. I'll try to reduce the test input further. Could we try this? nf-core/test-datasets#2007

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@jonasscheid jonasscheid merged commit 1488e0f into nf-core:dev Apr 22, 2026
20 checks passed