Skip to content

Conversation

@jcharkow
Copy link
Collaborator

@jcharkow jcharkow commented Aug 19, 2025

Description

This adds a reader for OpenSWATH XICs in .parquet format in the new pyprophet. Specifically this relies on the parquet export here PyProphet/pyprophet#158 since it requires transition annotation columns.

Fixes # (issue)

Type of change

Please delete options that are not relevant.

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

Tests have been added.
API Documentation has been updated.
I do not believe any additional documentation is required because it is the same format as the other readers.

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules

Contents (#188)

Other

  • add reader for OSW parquet chromatograms
  • update api docs
  • assemble transitionGroups rather than chromatogram list
  • fix: update tests and control for when no chromatograms found

Uncategorised!

  • add fragment ion annotation reading
  • remove unneeded imports
  • add tests for OSW parquet XICs
  • add readme on how chromatograms parquet were created
  • apply copilot suggestions
  • add util method for numpress decompression
  • Merge branch 'dev' into osw_parquet_chrom_reader
  • Merge branch 'dev' into osw_parquet_chrom_reader
  • apply singjc suggestions

@singjc singjc requested a review from Copilot August 19, 2025 17:14
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds a new reader for OpenSWATH XICs (Extracted Ion Chromatograms) stored in parquet format, specifically designed to work with parquet exports from pyprophet that include transition annotation columns.

Key changes:

  • Implements OpenSwathXICParquetLoader and OpenSwathXICParquetAccess classes for reading parquet-formatted chromatograms
  • Adds comprehensive test coverage for the new functionality
  • Updates API documentation to include the new loader

Reviewed Changes

Copilot reviewed 8 out of 21 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
massdash/loaders/access/OpenSwathXICParquetAccess.py Core access class for reading parquet chromatogram data with compression handling
massdash/loaders/OpenSwathXICParquetLoader.py Loader class implementing the chromatogram loading interface
test/loaders/test_OpenSwathXICParquetLoader.py Tests for the parquet loader functionality
test/loaders/access/test_OpenSwathXICParquetAccess.py Tests for the parquet access class
massdash/loaders/__init__.py Export registration for the new loader
massdash/loaders/access/__init__.py Export registration for the new access class
docs/API.rst Documentation updates for the new classes

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@jcharkow
Copy link
Collaborator Author

@singjc can I merge this?

Copy link
Collaborator

@singjc singjc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I made some minor suggestions which you can commit, and then had one question about returning None instead of throwing a message telling the user there was not XIC data for a requested peptide charge.

You can go ahead an merge

Comment on lines +68 to +69
if len(precursorChroms) == 0 and len(transitionChroms) == 0: # do not create a transition group if there are no chromatograms
return None
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an instance where we would want to return none, instead of throwing a message stating that no data was found for this peptide precursor?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the behaviour that is used in the SqMassLoader so I did it this way for consistency. I believe I implemented it this way because an empty Datatframe is returned for the loadTransitionGroupDf() method so this is somewhat consistent with that.

I could change the behaviour for both of them if you think that would be more appropriate.

if transitionMetaInfo.empty:
return None

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that I will merge with the currently functionality for now (pending tests passing) and we I can create a separate PR with the new behaviour if you think we should change it.

@jcharkow jcharkow merged commit c823287 into dev Nov 28, 2025
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants