OpenSWATH XIC parquet reader #188
Pull Request Overview
This PR adds a new reader for OpenSWATH XICs (Extracted Ion Chromatograms) stored in parquet format, specifically designed to work with parquet exports from pyprophet that include transition annotation columns.
Key changes:
- Implements `OpenSwathXICParquetLoader` and `OpenSwathXICParquetAccess` classes for reading parquet-formatted chromatograms
- Adds comprehensive test coverage for the new functionality
- Updates API documentation to include the new loader
Reviewed Changes
Copilot reviewed 8 out of 21 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| `massdash/loaders/access/OpenSwathXICParquetAccess.py` | Core access class for reading parquet chromatogram data with compression handling |
| `massdash/loaders/OpenSwathXICParquetLoader.py` | Loader class implementing the chromatogram loading interface |
| `test/loaders/test_OpenSwathXICParquetLoader.py` | Tests for the parquet loader functionality |
| `test/loaders/access/test_OpenSwathXICParquetAccess.py` | Tests for the parquet access class |
| `massdash/loaders/__init__.py` | Export registration for the new loader |
| `massdash/loaders/access/__init__.py` | Export registration for the new access class |
| `docs/API.rst` | Documentation updates for the new classes |
This gets rid of duplicate code
@singjc can I merge this?
singjc left a comment
Looks good. I made some minor suggestions which you can commit, and I had one question about returning None instead of raising an error telling the user there was no XIC data for the requested peptide charge.
You can go ahead and merge.
    if len(precursorChroms) == 0 and len(transitionChroms) == 0: # do not create a transition group if there are no chromatograms
        return None
Is there a case where we would want to return None instead of raising an error stating that no data was found for this peptide precursor?
This is the behaviour used in the SqMassLoader, so I did it this way for consistency. I believe I implemented it this way because the loadTransitionGroupDf() method returns an empty DataFrame, so this is somewhat consistent with that.
I could change the behaviour for both of them if you think that would be more appropriate.
massdash/massdash/loaders/SqMassLoader.py
Lines 82 to 83 in a0e48a9

    if transitionMetaInfo.empty:
        return None
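
For reference, the two behaviours under discussion could be sketched roughly as follows. This is only an illustration, not the MassDash implementation: the function names, the `NoChromatogramDataError` exception, and the dict used as a stand-in for a transition group are all hypothetical.

```python
from typing import Optional, Sequence


class NoChromatogramDataError(LookupError):
    """Hypothetical exception for a precursor with no XIC data."""


def build_group_or_none(precursor_chroms: Sequence, transition_chroms: Sequence) -> Optional[dict]:
    # Current behaviour: signal "no data" by returning None,
    # mirroring the check quoted above.
    if len(precursor_chroms) == 0 and len(transition_chroms) == 0:
        return None
    # A plain dict stands in for the real transition group object (assumption).
    return {"precursor": list(precursor_chroms), "transition": list(transition_chroms)}


def build_group_or_raise(peptide: str, charge: int,
                         precursor_chroms: Sequence, transition_chroms: Sequence) -> dict:
    # Alternative behaviour: raise an error that names the missing precursor.
    group = build_group_or_none(precursor_chroms, transition_chroms)
    if group is None:
        raise NoChromatogramDataError(
            f"No XIC data found for peptide {peptide} with charge {charge}"
        )
    return group
```

With the first variant every caller has to check for None before using the result; with the second, a missing precursor surfaces immediately with a message, but callers that expect gaps need a try/except.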
I think I will merge with the current functionality for now (pending tests passing), and I can create a separate PR with the new behaviour if you think we should change it.
Description
This adds a reader for OpenSWATH XICs in .parquet format, as produced by the new pyprophet. Specifically, it relies on the parquet export in PyProphet/pyprophet#158, since it requires transition annotation columns.
How Has This Been Tested?
Tests have been added.
API Documentation has been updated.
I do not believe any additional documentation is required because it is the same format as the other readers.