
feat(speclib): Skyline transition list CSV support + speclib_build_cli bundle#55

Merged
jspaezp merged 24 commits into main from feat/skyline-speclib on Apr 16, 2026

@jspaezp (Collaborator) commented Apr 16, 2026

Summary

  • Adds Skyline Peptide Transition List CSV as a spectral library source. New skyline_io module mirrors the DIA-NN/Spectronaut pattern (sniff + group-by-precursor → TimsElutionGroup<IonAnnot> + SkylinePrecursorExtras), wired through timsquery dispatch, timsquery_viewer extras, and timsseek speclib converter.
  • Bundles 19 prior commits from local main: the speclib_build_cli Rust CLI (digest → dedup → expand → Koina predict → write msgpack.zst), Prosit_2023_intensity_timsTOF default, iRT-library RT tolerance default (Unrestricted), and related fixes.

Skyline specifics

  • Precursor-isotope rows (Fragment Ion Type == "precursor") skipped; envelope computed from sequence downstream.
  • #N/A Library Intensity → default 1.0.
  • RT / IM columns absent in Skyline exports → default 0.0, loud warning (use Unrestricted RT tolerance).
  • Modifications in [...] stripped for stripped_peptide; modified_peptide preserved verbatim.
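The row-level rules above can be illustrated with a minimal sketch. This is not the actual skyline_io code: `FragmentRow`, `strip_bracket_mods`, `parse_library_intensity`, and `parse_row` are hypothetical names, the row is reduced to three fields, and returning an error on an unparseable intensity is an assumption rather than something the PR states.

```rust
use std::num::ParseFloatError;

/// Hypothetical, simplified view of one fragment row (not the real skyline_io type).
#[derive(Debug, PartialEq)]
struct FragmentRow {
    modified_peptide: String, // kept verbatim
    stripped_peptide: String, // `[...]` blocks removed
    intensity: f32,
}

/// Remove every `[...]` block from a modified peptide sequence.
fn strip_bracket_mods(seq: &str) -> String {
    let mut out = String::with_capacity(seq.len());
    let mut depth = 0usize;
    for c in seq.chars() {
        match c {
            '[' => depth += 1,
            ']' => depth = depth.saturating_sub(1),
            _ if depth == 0 => out.push(c),
            _ => {} // inside brackets: drop
        }
    }
    out
}

/// `#N/A` maps to the default 1.0; anything else must parse as a float.
/// (Erroring on other unparseable values is an assumption of this sketch.)
fn parse_library_intensity(raw: &str) -> Result<f32, ParseFloatError> {
    match raw.trim() {
        "#N/A" => Ok(1.0),
        other => other.parse::<f32>(),
    }
}

/// Returns Ok(None) for precursor-isotope rows, which are skipped.
fn parse_row(
    modified_peptide: &str,
    fragment_ion_type: &str,
    library_intensity: &str,
) -> Result<Option<FragmentRow>, ParseFloatError> {
    if fragment_ion_type.trim() == "precursor" {
        return Ok(None);
    }
    Ok(Some(FragmentRow {
        modified_peptide: modified_peptide.to_string(),
        stripped_peptide: strip_bracket_mods(modified_peptide),
        intensity: parse_library_intensity(library_intensity)?,
    }))
}
```

Returning `Ok(None)` for skipped rows keeps "row intentionally ignored" distinct from "row malformed", which makes the skip easy to test.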

Test plan

  • cargo test -p timsquery --lib serde:: — 19 tests pass (6 new skyline_io tests)
  • cargo test -p timsseek --lib speclib — 12 tests pass (test_load_skyline_csv_library new)
  • cargo build -p timsquery -p timsseek -p timsquery_viewer clean
  • Manual smoke test against a real Skyline export on a DIA run

jspaezp added 24 commits April 15, 2026 21:20
…r msgpack.zst

Add pub constructors (new()) to PrecursorEntry, ReferenceEG, and SerSpeclibElement
so external crates can construct speclib entries. Make PrecursorEntry pub. Add
SpeclibWriter<W> wrapping zstd::Encoder for streaming msgpack.zst output. Re-export
all four types from data_sources. Two roundtrip tests cover both rmp_serde encode/decode
and the full writer→reader pipeline.
- clap CLI with all args + TOML config merge (CLI > TOML > defaults)
- Proforma mod parser: fixed + variable mod application
- Bloom + bucket HashMap peptide dedup
- REVERSE + EDGE_MUTATE decoy strategies
- Example config at repo root
- lib.rs re-exports for integration test access
- 3 integration tests: digestion+dedup, mod chain, decoy roundtrip
- Taskfile: speclib:build, speclib:local-koina, speclib:stop-koina
- Prosit expects bare AA sequences, strip bracket mods before sending
- Convert Koina annotation format (y1+1) to mzPAF format (y1^1)
- Make strip_mods pub for pipeline use
- Precursor m/z now computed from modified sequence (includes +57 for Cys carbam)
- Koina receives modified sequences with [UNIMOD:N] notation (Prosit handles them)
- to_proforma converts [U:N] → [UNIMOD:N] (rustyms parses [U:N] as element Uranium)
- Isotope distribution also uses modified formula
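The two string conversions mentioned in these commit notes (Koina `y1+1` to mzPAF `y1^1`, and `[U:N]` to `[UNIMOD:N]`) can be sketched as below. The function names `koina_to_mzpaf` and `expand_unimod_prefix` are hypothetical, and the sketch assumes a Koina annotation always has the shape `<ion>+<charge>`; the real code in speclib_build_cli may handle more cases.

```rust
/// Convert a Koina-style annotation such as "y1+1" into the mzPAF-style
/// "y1^1". Returns None when the input does not look like `<ion>+<charge>`.
/// (Hypothetical helper; the accepted input shape is an assumption.)
fn koina_to_mzpaf(annotation: &str) -> Option<String> {
    // Split at the last '+', which separates the ion label from the charge.
    let (ion, charge) = annotation.rsplit_once('+')?;
    let charge_is_numeric = !charge.is_empty() && charge.chars().all(|c| c.is_ascii_digit());
    if ion.is_empty() || !charge_is_numeric {
        return None;
    }
    Some(format!("{ion}^{charge}"))
}

/// Rewrite the short `[U:N]` modification prefix to the explicit
/// `[UNIMOD:N]` form, so that `U` cannot be read as the element uranium.
/// (Hypothetical helper; a plain textual replacement.)
fn expand_unimod_prefix(sequence: &str) -> String {
    sequence.replace("[U:", "[UNIMOD:")
}
```

Splitting on the last `+` rather than the first keeps the ion label intact even if it ever contained a `+` itself.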
- Add Prosit_2023_intensity_timsTOF and Prosit_2020_intensity_CID to model registry
- Default fragment model now timsTOF-specific (was HCD)
- Same Triton v2 schema, drop-in compatible
- Tolerance::default() now uses RtTolerance::Unrestricted (was Minutes(5,5))
- Prescore extracts full RT range, calibration maps library→observed RT
- Bench config drops explicit RT tolerance (defaults to unrestricted)
- Users with calibrated libraries can still set "rt": {"minutes": [N, N]} in config
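The default change described above can be pictured with a small stand-in type. This is a sketch, not the timsquery definition: the variant names `Unrestricted` and `Minutes` come from the commit notes, but the field types, the reading of `Minutes(a, b)` as "a minutes before, b minutes after", and the `window` helper are assumptions made for illustration.

```rust
/// Stand-in for the RT tolerance setting (hypothetical field types).
#[derive(Debug, Clone, Copy, PartialEq)]
enum RtTolerance {
    /// No retention-time restriction: the full RT range is extracted.
    Unrestricted,
    /// Assumed meaning: (minutes before, minutes after) the library RT.
    Minutes(f32, f32),
}

impl Default for RtTolerance {
    // Mirrors the new default described in the commit notes.
    fn default() -> Self {
        RtTolerance::Unrestricted
    }
}

impl RtTolerance {
    /// Extraction window around a library RT, in minutes.
    /// `None` means "do not restrict by RT". (Hypothetical helper.)
    fn window(&self, library_rt_minutes: f32) -> Option<(f32, f32)> {
        match *self {
            RtTolerance::Unrestricted => None,
            RtTolerance::Minutes(before, after) => {
                Some((library_rt_minutes - before, library_rt_minutes + after))
            }
        }
    }
}
```

With an unrestricted default, libraries whose RT values are placeholders (such as the 0.0 used for Skyline exports) are not filtered out by a window centered on a meaningless value.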
New skyline_io module parses Skyline Peptide Transition List exports
as a spectral library source, mirroring the DIA-NN/Spectronaut pattern
(sniff + group-by-precursor + emit TimsElutionGroup<IonAnnot> plus
SkylinePrecursorExtras). Wires through timsquery's dispatch,
timsquery_viewer's extras view, and timsseek's speclib converter
(reusing convert_diann_to_query_item via a small extras adapter).

Notable behaviors:
- Precursor-isotope rows (Fragment Ion Type == "precursor") are
  skipped; downstream computes the envelope from the sequence.
- `#N/A` Library Intensity values default to 1.0.
- RT/IM columns are absent in Skyline exports; defaults to 0.0
  with a loud warning (use Unrestricted RT tolerance).
- Modifications in `[...]` are stripped for the stripped_peptide
  field; modified_peptide is preserved verbatim.
…d_cli

Drops the Python speclib_builder package and all references (root
pyproject workspace/deps, hatch packages, uv sources, uv.lock).
The Rust speclib_build_cli now owns the digest → dedup → expand →
Koina predict → write pipeline end-to-end.
No notebooks in the repo and no other references to jupyter. Prunes
~100 transitive deps from uv.lock.
@jspaezp jspaezp merged commit 881b67c into main Apr 16, 2026
2 checks passed
@jspaezp jspaezp deleted the feat/skyline-speclib branch April 16, 2026 22:37