
staar: heterogeneous MetaSTAAR, SCANG bridge + MC threshold #130

Merged
vineetver merged 1 commit into master from staar/tranche-d-e
Apr 16, 2026

Conversation

vineetver (Owner) commented Apr 16, 2026

Heterogeneous MetaSTAAR (#72) persists the per-study U via a new DataFusion study_us column in merge_chromosome and adds MetaVariant.u_study, so the conditional scoring path can Schur-condition each study independently before summing. This mirrors MetaSTAAR R/MetaSTAAR_merge_cond.R:391-404. The rejection of heterogeneous mode in commands/meta_staar.rs is deleted.
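For readers unfamiliar with the heterogeneous path, here is a minimal sketch of per-study Schur conditioning followed by summation, reduced to one tested variant and one conditioning variant so the Schur complement is scalar. The `StudyScore` struct, its fields, and both function names are hypothetical and are not the types in meta.rs. The pooled variant is included only for contrast and assumes that the homogeneous mode pools before conditioning.

```rust
/// Per-study score summary for one tested variant and one conditioning
/// variant. Hypothetical layout, not the repository's MetaVariant.
#[derive(Clone, Copy, Debug)]
struct StudyScore {
    u_target: f64, // score U of the tested variant
    u_cond: f64,   // score U of the conditioning variant
    k_tt: f64,     // Var(U_target)
    k_tc: f64,     // Cov(U_target, U_cond)
    k_cc: f64,     // Var(U_cond)
}

/// Schur-condition one study and return (U_cond_i, K_cond_i):
///   U_cond = U_t - K_tc K_cc^{-1} U_c
///   K_cond = K_tt - K_tc K_cc^{-1} K_ct
fn schur_condition(s: &StudyScore) -> (f64, f64) {
    if s.k_cc <= 0.0 {
        // No information on the conditioning variant: leave unconditioned.
        return (s.u_target, s.k_tt);
    }
    let beta = s.k_tc / s.k_cc;
    (s.u_target - beta * s.u_cond, s.k_tt - beta * s.k_tc)
}

/// Heterogeneous mode: condition each study on its own, then sum.
fn heterogeneous(studies: &[StudyScore]) -> (f64, f64) {
    studies
        .iter()
        .map(schur_condition)
        .fold((0.0, 0.0), |(u, k), (ui, ki)| (u + ui, k + ki))
}

/// Contrast case (assumed): pool all studies first, then condition once.
fn pooled(studies: &[StudyScore]) -> (f64, f64) {
    let zero = StudyScore { u_target: 0.0, u_cond: 0.0, k_tt: 0.0, k_tc: 0.0, k_cc: 0.0 };
    let sum = studies.iter().fold(zero, |a, s| StudyScore {
        u_target: a.u_target + s.u_target,
        u_cond: a.u_cond + s.u_cond,
        k_tt: a.k_tt + s.k_tt,
        k_tc: a.k_tc + s.k_tc,
        k_cc: a.k_cc + s.k_cc,
    });
    schur_condition(&sum)
}
```

The two functions differ whenever the per-study ratio K_tc / K_cc varies across studies, which is why the per-study U has to be persisted rather than only the pooled sum.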

SCANG bridge (#110) adds staar::scang::ScangExt on NullModel, which carries pseudo-residuals pre-sampled from N(0, P) via projection sampling. Only the unrelated Gaussian path is supported; kinship and binary models are rejected until the Cholesky factor bridge lands. Source: STAARpipeline R/staar2scang_nullmodel.R:23-34.
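The idea behind projection sampling can be sketched as follows, assuming P is the residual projection I - X (X'X)^{-1} X' of the covariate matrix X. Because P is symmetric and idempotent, projecting z ~ N(0, I) gives Pz ~ N(0, P) without an eigendecomposition. All names below are hypothetical, the tiny LCG with Box-Muller stands in for a real random number generator, and scaling by the residual variance is left out.

```rust
fn dot(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Orthonormalise the covariate columns (each of length n) with modified
/// Gram-Schmidt; numerically dependent columns are dropped.
fn orthonormal_basis(columns: &[Vec<f64>]) -> Vec<Vec<f64>> {
    let mut basis: Vec<Vec<f64>> = Vec::new();
    for col in columns {
        let mut v = col.clone();
        for q in &basis {
            let c = dot(q, &v);
            for (vi, qi) in v.iter_mut().zip(q) {
                *vi -= c * qi;
            }
        }
        let norm = dot(&v, &v).sqrt();
        if norm > 1e-12 {
            for vi in v.iter_mut() {
                *vi /= norm;
            }
            basis.push(v);
        }
    }
    basis
}

/// r = P z = z - Q Q' z, where the columns of Q are the orthonormal basis.
fn project_residual(basis: &[Vec<f64>], z: &[f64]) -> Vec<f64> {
    let mut r = z.to_vec();
    for q in basis {
        let c = dot(q, &r);
        for (ri, qi) in r.iter_mut().zip(q) {
            *ri -= c * qi;
        }
    }
    r
}

/// Stand-in generator for the sketch only (LCG + Box-Muller).
struct Lcg(u64);

impl Lcg {
    /// Uniform draw in (0, 1].
    fn next_uniform(&mut self) -> f64 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        ((self.0 >> 11) as f64 + 1.0) / (1u64 << 53) as f64
    }

    fn next_normal(&mut self) -> f64 {
        let (u1, u2) = (self.next_uniform(), self.next_uniform());
        (-2.0 * u1.ln()).sqrt() * (2.0 * std::f64::consts::PI * u2).cos()
    }
}

/// Draw `n_sims` pseudo-residual vectors, each distributed as N(0, P).
fn sample_pseudo_residuals(columns: &[Vec<f64>], n_sims: usize, seed: u64) -> Vec<Vec<f64>> {
    let n = columns.first().map_or(0, |c| c.len());
    let basis = orthonormal_basis(columns);
    let mut rng = Lcg(seed);
    (0..n_sims)
        .map(|_| {
            let z: Vec<f64> = (0..n).map(|_| rng.next_normal()).collect();
            project_residual(&basis, &z)
        })
        .collect()
}
```

Every sampled vector is orthogonal to each covariate column, which is the property the downstream score statistics rely on.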

SCANG MC threshold (#79) wires the pseudo-residuals into per-chromosome SCANG scoring. For each window, the per-simulation burden statistic is scored against the cached K. The (1 - alpha) quantile of the resulting max-statistic distribution, floored at -log10(1e-4), filters windows before emission. An emthr column is written alongside STAAR-O on every mask output parquet. Source: SCANG R/SCANG.r:181-205.
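The quantile-and-floor step can be illustrated with a short sketch. It takes one max statistic per simulation (already on the -log10(p) scale), computes the (1 - alpha) quantile, and applies the floor. The function names are hypothetical, and the linear-interpolation quantile (the default "type 7" rule in R) is an assumption about which quantile definition is intended.

```rust
/// Linear-interpolation quantile (R's default "type 7" rule, assumed here).
/// Returns None for empty input or a probability outside [0, 1].
fn quantile_type7(values: &[f64], prob: f64) -> Option<f64> {
    if values.is_empty() || !(0.0..=1.0).contains(&prob) {
        return None;
    }
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let h = (sorted.len() - 1) as f64 * prob;
    let lo = h.floor() as usize;
    let hi = h.ceil() as usize;
    Some(sorted[lo] + (h - lo as f64) * (sorted[hi] - sorted[lo]))
}

/// Empirical threshold: (1 - alpha) quantile of the per-simulation max
/// statistics, floored at -log10(filter_p).
fn empirical_threshold(max_stats: &[f64], alpha: f64, filter_p: f64) -> Option<f64> {
    let floor = -filter_p.log10();
    quantile_type7(max_stats, 1.0 - alpha).map(|q| q.max(floor))
}
```

With alpha = 0.05 and filter_p = 1e-4, a window is kept only if its observed statistic exceeds the larger of the simulated 95% quantile and 4.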

Closes #72
Closes #110
Closes #79

… threshold

Three additions in one commit:

Heterogeneous MetaSTAAR (#72) persists per-study U in the DuckDB merge
via a new study_us column and adds MetaVariant.u_study so the
conditional path at meta.rs:meta_score_gene_conditional can
Schur-condition each study independently before summing the per-study
(U_cond_i, K_cond_i). Mirrors MetaSTAAR
R/MetaSTAAR_merge_cond.R:391-404. The config rejection for
--conditional-model heterogeneous is deleted.

SCANG null-model bridge (#110) adds staar::scang::ScangExt on
NullModel carrying pre-sampled pseudo-residuals from N(0, P) via
projection sampling (no eigendecomp). Source: STAARpipeline
R/staar2scang_nullmodel.R:23-34 for the unrelated gaussian path.
Kinship and binary gates reject until the Cholesky factor bridge lands.

SCANG MC empirical threshold (#79) wires the pseudo-residuals into the
per-chromosome SCANG scoring at scoring.rs:compute_scang_threshold.
For each window, per-sim burden U = G' z_j is scored against the
cached K to produce a null distribution of max -log(p). The 1-alpha
quantile (default alpha=0.05) floored at -log(1e-4) filters windows
before emission. emthr column is written on every mask output parquet
alongside STAAR-O.
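As a rough illustration of the per-simulation scoring described above, the sketch below computes a weighted burden score U = w' G' z_j for each window, normalises it by w' K w, and keeps the maximum over windows for each simulated vector. The `Window` layout and function names are hypothetical, not those in scoring.rs. The sketch stays on the 1-df chi-square scale: the conversion to -log(p) is monotone in this statistic, so the ranking of windows is unchanged.

```rust
/// One candidate window. Hypothetical layout for illustration only.
struct Window {
    genotypes: Vec<Vec<f64>>, // one dosage vector of length n per variant
    weights: Vec<f64>,        // one burden weight per variant
    k: Vec<Vec<f64>>,         // cached variant-by-variant covariance of U
}

/// Burden statistic for one window and one pseudo-residual vector z:
///   (w' G' z)^2 / (w' K w)
fn burden_stat(w: &Window, z: &[f64]) -> f64 {
    let u: f64 = w
        .genotypes
        .iter()
        .zip(&w.weights)
        .map(|(g, wt)| wt * g.iter().zip(z).map(|(gi, zi)| gi * zi).sum::<f64>())
        .sum();
    let var: f64 = w
        .k
        .iter()
        .zip(&w.weights)
        .map(|(row, wi)| wi * row.iter().zip(&w.weights).map(|(kij, wj)| kij * wj).sum::<f64>())
        .sum();
    if var > 0.0 {
        u * u / var
    } else {
        0.0
    }
}

/// Null distribution of the max statistic: one value per simulation.
fn max_stat_per_sim(windows: &[Window], sims: &[Vec<f64>]) -> Vec<f64> {
    sims.iter()
        .map(|z| windows.iter().map(|w| burden_stat(w, z)).fold(0.0, f64::max))
        .collect()
}
```

Because K depends only on the genotypes and the null model, it can be cached once per window and reused across all simulated vectors.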
vineetver merged commit d155c6f into master on Apr 16, 2026
3 checks passed
vineetver deleted the staar/tranche-d-e branch on April 16, 2026, 22:21