staar: strip FORMAT column in GenotypeWriter::push #135

Merged
vineetver merged 1 commit into master from fix-samples-format-prefix
Apr 22, 2026

@vineetver
Owner

noodles_vcf::Record::samples() returns everything after INFO, which
in VCF includes the FORMAT column ("GT" or "GT:AD:DP"). push
was running its memchr loop over the full string, so sample[0]'s slot
got the FORMAT bytes, every real sample shifted by one, and the last
sample was silently dropped. MAF and dosages were wrong on every real
variant.
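
To make the off-by-one concrete, here is a minimal sketch of the failure mode. It is not the code from this PR: `fill_slots_naive` is a hypothetical stand-in for the old loop, and it uses `str::split` from the standard library rather than memchr. It assumes the writer fills exactly `n_samples` slots from a tab-delimited string that still starts with the FORMAT column.

```rust
/// Hypothetical stand-in for the pre-fix behaviour (not the PR's code):
/// split the raw samples string on tabs and fill exactly `n_samples` slots.
/// The string still begins with the FORMAT column, so that column is
/// treated as if it were the first sample.
fn fill_slots_naive(samples_str: &str, n_samples: usize) -> Vec<&str> {
    samples_str.split('\t').take(n_samples).collect()
}
```

With the input `"GT\t0/0\t0/1\t1/1"` and `n_samples = 3`, the slots come back as `["GT", "0/0", "0/1"]`: the FORMAT bytes occupy slot 0, each genotype sits one slot too late, and the `1/1` sample never gets a slot.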

Fix: skip past the first tab in samples_str before the sample loop.
Three tests cover the bare-GT case, multi-field GT:AD:DP, and the
missing-samples case.
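
The fix can be sketched roughly as follows. This is an illustration under stated assumptions, not the merged implementation: `sample_fields` and `alt_dosage` are hypothetical names, the sketch uses `str::split_once` where the real code uses a memchr loop, it assumes biallelic genotypes with GT as the first subfield, and returning an empty list when there is no tab is only one plausible way to handle the missing-samples case.

```rust
/// Hypothetical helper: return the per-sample fields with the leading
/// FORMAT column removed. Assumes `samples_str` is the tab-delimited text
/// after INFO, i.e. "FORMAT\tsample1\tsample2...".
fn sample_fields(samples_str: &str) -> Vec<&str> {
    match samples_str.split_once('\t') {
        // Everything before the first tab is FORMAT; skip it.
        Some((_format, rest)) => rest.split('\t').collect(),
        // No tab: either an empty string or FORMAT with no samples.
        None => Vec::new(),
    }
}

/// Hypothetical helper: alternate-allele dosage from one sample field.
/// Assumes GT is the first ':'-separated subfield and the site is
/// biallelic; anything other than 0 or 1 (e.g. ".") yields None.
fn alt_dosage(field: &str) -> Option<u8> {
    let gt = field.split(':').next()?;
    let mut dosage = 0u8;
    for allele in gt.split(|c: char| c == '/' || c == '|') {
        match allele {
            "0" => {}
            "1" => dosage += 1,
            _ => return None,
        }
    }
    Some(dosage)
}
```

The three cases mirror the tests named above: `sample_fields("GT\t0/0\t0/1\t1/1")` keeps all three genotypes in order, a `GT:AD:DP` line yields fields such as `0/1:12,9:21` from which `alt_dosage` reads only the GT part, and an empty samples string produces no fields.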

The invariance golden and all ground-truth-vs-R tests are unaffected.
They feed U, K, mafs directly from the JSON fixture into
run_staar_from_sumstats; neither path touches GenotypeWriter::push.
Nothing to regenerate. cargo test passes 341/341, clippy clean.

Surfaced while writing the VariantReader unit test in #134.

…push

noodles exposes `Samples<'_>` as everything after the INFO field, which
for VCF includes the FORMAT column ("GT" or "GT:AD:DP"). push was
running its memchr loop over the full string, so the FORMAT bytes
landed in sample[0]'s slot, every real sample shifted by one, and the
last sample was dropped. MAF and dosages were wrong on every ingested
variant.

Fix: skip past the first tab before entering the sample-parsing loop.
Three tests added: correct 3-sample mapping with a bare "GT" format,
multi-field "GT:AD:DP" format, and the missing-samples-field case.

Invariance and ground-truth-vs-R tests unaffected (they feed U/K/MAF
directly from JSON fixtures; neither exercises the VCF→MAF path).
@vineetver vineetver merged commit 48bf955 into master Apr 22, 2026
3 checks passed
@vineetver vineetver deleted the fix-samples-format-prefix branch April 22, 2026 22:48