
staar: add metasvm_pred + genehancer columns to annotation parquet #126

Merged
vineetver merged 1 commit into master from staar/107-metasvm-genehancer-columns
Apr 16, 2026


@vineetver
Owner

Closes #107. STAARpipeline's disruptive_missense, plof_ds, and ptv_ds masks all key off MetaSVM_pred=="D". We were proxying with cadd_phred + revel, so our masks don't match the R implementation. GeneHancer is an opaque ID string the pipeline passes through for downstream tools.

Pulls both from FAVOR full-tier (a.dbnsfp.metasvm_pred, a.genehancer.id) and wires them through ingest -> cohort parquet -> VariantIndexEntry -> AnnotatedVariant -> MetaSTAAR sumstats. MetaSvmPred is a typed enum so A3's mask flip is a pattern match rather than a string compare.

Preflight adds require_structural_annotation_catalog next to the 11-weight lock; old cohorts get a clear rebuild message. No mask math changes yet — that's #77.

cargo test: 294/294, clippy clean.


STAARpipeline's coding masks (disruptive_missense, plof_ds, ptv_ds) key off
MetaSVM_pred=="D". We carried cadd_phred + revel as proxies, which means our
masks don't match R. GeneHancer is an opaque identifier string the pipeline
passes through for downstream tooling.

Pulls both from FAVOR full-tier (a.dbnsfp.metasvm_pred and a.genehancer.id),
flows them end-to-end: ingest -> cohort parquet -> VariantIndexEntry ->
AnnotatedVariant -> MetaSTAAR sumstats schema. MetaSVM parses into a typed
enum (Deleterious/Tolerated/Unknown) so A3's mask-predicate flip is a
pattern match, not a string compare. GeneHancer stays Box<str>; no predicate
reads it today.
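
For readers without the diff open, a minimal sketch of what the typed parse and a pattern-match predicate could look like. The variant names come from the description above; the `parse` signature, the "T" code for tolerated (the usual dbNSFP convention), the `VariantSketch` stand-in struct, and the `is_metasvm_deleterious` helper are assumptions for illustration, not the merged code.

```rust
/// Typed MetaSVM prediction. Variant names follow the commit message;
/// everything else in this block is a hypothetical sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MetaSvmPred {
    Deleterious,
    Tolerated,
    Unknown,
}

impl MetaSvmPred {
    /// Assumed parse rule: "D" is deleterious, "T" is tolerated,
    /// and anything else (missing, ".", empty) collapses to Unknown.
    pub fn parse(raw: Option<&str>) -> Self {
        match raw.map(str::trim) {
            Some("D") => Self::Deleterious,
            Some("T") => Self::Tolerated,
            _ => Self::Unknown,
        }
    }
}

/// Stand-in for the real AnnotatedVariant; field names are assumptions.
pub struct VariantSketch {
    pub metasvm_pred: MetaSvmPred,
    /// Opaque GeneHancer ID, carried through untouched.
    pub genehancer: Option<Box<str>>,
}

/// Hypothetical predicate: a pattern match instead of `pred == "D"`.
pub fn is_metasvm_deleterious(v: &VariantSketch) -> bool {
    matches!(v.metasvm_pred, MetaSvmPred::Deleterious)
}
```

Collapsing every unrecognized code into `Unknown` keeps the predicate total, so a mask never has to decide what a stray string means at evaluation time.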

Preflight gains require_structural_annotation_catalog alongside the 11-weight
catalog lock from #104. Fails loud at staar start if either column is missing
from the cohort, so users with an old cohort get a clear rebuild message.
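
A rough sketch of the fail-loud check, assuming preflight can see the cohort parquet's column names as a slice of strings. The function name is the one mentioned above, but its signature, the `MissingColumns` error type, the exact column names (taken from the PR title), and the message wording are all assumptions.

```rust
use std::fmt;

/// Columns named in the PR title; the exact parquet names are an assumption.
const REQUIRED_COLUMNS: [&str; 2] = ["metasvm_pred", "genehancer"];

/// Hypothetical error type listing every required column that is absent.
#[derive(Debug, PartialEq, Eq)]
pub struct MissingColumns(pub Vec<&'static str>);

impl fmt::Display for MissingColumns {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "cohort annotation parquet is missing column(s): {}; rebuild the cohort to add them",
            self.0.join(", ")
        )
    }
}

/// Hypothetical signature: returns Ok only if every required column exists.
pub fn require_structural_annotation_catalog(
    cohort_columns: &[&str],
) -> Result<(), MissingColumns> {
    // Collect all missing columns so the user sees the full list at once.
    let missing: Vec<&'static str> = REQUIRED_COLUMNS
        .iter()
        .copied()
        .filter(|required| !cohort_columns.iter().any(|c| *c == *required))
        .collect();

    if missing.is_empty() {
        Ok(())
    } else {
        Err(MissingColumns(missing))
    }
}
```

Reporting all missing columns in one error, rather than stopping at the first, means an old cohort needs only one rebuild message.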

No mask math changes; that's #77.
@vineetver vineetver merged commit ae33477 into master Apr 16, 2026
3 checks passed
@vineetver vineetver deleted the staar/107-metasvm-genehancer-columns branch April 16, 2026 20:47

Development

Successfully merging this pull request may close these issues.

Add MetaSVM and structural annotation channels to annotation contract
