diff --git a/CHANGELOG.md b/CHANGELOG.md
index b654c49..24d9439 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -2,6 +2,76 @@
All notable changes to this project will be documented in this file.
+## [0.6.0] - 2026-02-26
+
+### Breaking changes
+- **Removed `set_backend()`, `get_backend_info()`, `reset_backend()`** — only one backend (C++ native) exists since v0.5.0, so the multi-backend API was dead code. Use `from mcpower.backends import get_backend` if you need the backend instance directly
+- **Removed `set_heterogeneity()` and `set_heteroskedasticity()`** — heterogeneity and heteroskedasticity are now controlled exclusively through scenario configurations (`set_scenario_configs()`). The optimistic scenario uses zero perturbation; the realistic and doomer scenarios apply these automatically (see the migration sketch after this list)
+- **Removed dead scipy fallback code** from `distributions.py` — scipy was never a runtime dependency since v0.5.0, so the fallback paths were unreachable dead code. The module now cleanly fails with an `ImportError` if the C++ native backend is missing
+- **`_create_power_plot()` returns `fig`** — the function now always returns the matplotlib figure object and accepts a `show` parameter (default `True`). Set `show=False` to suppress `plt.show()` for programmatic use
+- **`apply()` made private (`_apply()`)** — the method is called automatically by `find_power()` / `find_sample_size()`; code that still needs a direct call should use `model._apply()` instead
+- **`[all]` extra no longer includes `statsmodels`** — use `pip install mcpower[lme]` to get statsmodels for mixed-effects models
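+
+A minimal migration sketch for the removed setters (the formula, effect sizes, and perturbation values below are illustrative):
+
+```python
+from mcpower import MCPower
+
+# Before (<= 0.5.x):
+#   model.set_heterogeneity(0.2)
+#   model.set_heteroskedasticity(0.15)
+# After (0.6.0): express the same violations as a custom scenario config
+model = MCPower("y = treatment + covariate")
+model.set_effects("treatment=0.5, covariate=0.3")
+model.set_scenario_configs({
+    "my_violations": {"heterogeneity": 0.2, "heteroskedasticity": 0.15}
+})
+model.find_sample_size(target_test="treatment", scenarios=True)
+
+# If you need the backend instance that set_backend()/get_backend_info() used to expose:
+from mcpower.backends import get_backend
+backend = get_backend()  # C++ native backend instance
+```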
+
+### Added
+- **`test_formula` parameter** on `find_power()` and `find_sample_size()` — test a reduced model against data generated from the full model to evaluate power under model misspecification. For example, generate data with `y = x1 + x2 + x3` but test with `test_formula="y ~ x1 + x2"` to see power when `x3` is omitted. Supports interactions, factors, and mixed models.
+- **C++ non-normal residual generation** — scenario perturbations now generate heavy-tailed (Student-t) and skewed (chi-squared) residuals directly in C++ via `residual_dist`/`residual_df` parameters in `generate_y()`, replacing the Python-side post-hoc perturbation approach. Applies to all model types (OLS and LME)
+- **`optimistic` scenario** is now a first-class entry in `DEFAULT_SCENARIO_CONFIG` with all-zero perturbation values, eliminating the special `scenario_config=None` code path. Custom scenarios inherit from the optimistic baseline, so all required keys exist even when only a few are overridden (see the sketch below)
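+
+A short sketch of the inheritance behaviour (the scenario name and values are illustrative):
+
+```python
+model.set_scenario_configs({
+    # Only override the keys you care about; the rest
+    # (correlation_noise_sd, distribution_change_prob, residual_dists,
+    # residual_df, ...) are inherited from the zero-perturbation
+    # "optimistic" baseline in DEFAULT_SCENARIO_CONFIG.
+    "mild_violations": {"heterogeneity": 0.1, "residual_change_prob": 0.3},
+})
+```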
+
+### Fixed
+- **`set_variable_type()` docstring listed wrong distribution types** — documented non-existent `"skewed"` type; now lists all supported types: `right_skewed`, `left_skewed`, `high_kurtosis`, `uniform`
+- **`set_scenario_configs()` docstring referenced non-existent keys** — `"effect_size_jitter"` and `"distribution_jitter"` replaced with actual keys (`correlation_noise_sd`, `distribution_change_prob`, etc.)
+- **String factor levels crash in LME variance computation** — `proportions[level - 1]` crashed when factor levels were strings (e.g. `"Japan"`). Now looks up level position in the label list
+- **Division by zero on constant-variance columns** — `upload_data()` normalization produced `inf`/`NaN` when a column had zero variance. Now raises `ValueError` with the column name
+- **Pending state not cleared after `_apply()`** — calling `_apply()` twice could re-apply the same effects. Pending fields are now reset after each `_apply()` call
+- **Parser crash on unbalanced parentheses** — unmatched `)` caused `paren_count` to go negative, producing silent misparses. Now raises `ValueError`
+- **Update checker wrote cache inside installed package** — moved cache file to `~/.cache/mcpower/update_cache.json`
+- **Update checker unbounded response read** — `response.read()` now limited to 1 MB
+- **`scenario_config` dict access on `None`** — added `None` guards for optional scenario configuration lookups
+- **NaN values in uploaded data** — `upload_data()` now rejects data containing NaN values with a clear error message listing affected columns
+- **Formula minus-sign silently dropped terms** — `y = x1 - x2` ignored `x2` without any warning. Now raises `ValueError` explaining that term removal with `-` is not supported
+- **`_create_table` crash on empty rows** — formatter now handles empty row lists by computing column widths from headers only
+- **`_create_power_plot` crash when `first_achieved` not in sample sizes** — added bounds check before `.index()` call
+- **Redundant `_validate_cluster_sample_size` call** — removed duplicate validation in `find_power()` (already called per-sample-size in `find_sample_size()`)
+
+### Changed
+- **`upload_data()` returns `self`** for method chaining consistency (see the example after this list)
+- **Assert statements replaced with `RuntimeError`** — internal assertions now raise proper exceptions instead of using `assert`
+- **Removed "(not yet implemented)" from mixed-model docstrings** — mixed model testing has been implemented since v0.4.2
+- **Thread-safe RNG in data generation** — replaced global `np.random.seed()` with local `np.random.RandomState()` for thread safety
+- **Update checker runs in a background thread** — no longer blocks `import mcpower` on slow networks
+- **Module-level deduplication for update checker** — prevents redundant version checks within the same Python session
+- **Removed unused `cluster_column_indices` parameter** from `_lme_analysis_wrapper()` and `_lme_analysis_statsmodels()` — was explicitly marked unused and kept only for API compatibility
+- **Scenario formatters iterate dynamically** — no longer hardcode scenario names, enabling custom scenario display
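+
+A small chaining sketch (`pilot_df`, the CSV path, and the effect values are placeholders):
+
+```python
+import pandas as pd
+from mcpower import MCPower
+
+pilot_df = pd.read_csv("pilot.csv")  # placeholder path
+model = MCPower("y = treatment + age").upload_data(pilot_df)  # upload_data() now returns self
+model.set_effects("treatment=0.5, age=0.2")
+model.find_power(sample_size=200, target_test="treatment")
+```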
+
+### Packaging
+- **`tqdm` added as core dependency** (`>=4.60.0`) — used for progress bars
+- **Removed stale pytest warning filter** for `"Mixed-effects models are experimental"` (warning was removed in v0.5.4)
+- **NumPy minimum version relaxed** to `>=1.26.0` (was `>=2.0.0`) in both build-requires and runtime dependencies
+- **`scikit-build-core` bumped** to `>=0.10` (was `>=0.5`)
+- **`statsmodels` added to `[dev]` extras** for test/development convenience
+- **Documentation URL** now points to the GitHub wiki
+- **Changelog URL** added to project URLs
+- **Removed unused pytest markers** (`unit`, `integration`) — only `lme` marker remains
+- **Per-module mypy overrides** replace blanket `ignore_missing_imports`
+
+### Documentation
+- Updated README requirements section: added `tqdm`, specified `NumPy (>=1.26.0)`
+- Changed `pip install mcpower[all]` → `pip install mcpower[lme]` for statsmodels installation
+- Wiki documentation review and cleanup: fixed broken links, corrected API signatures (`set_scenario_configs` parameter name), removed stale `apply()` and `set_heterogeneity()` wiki pages, fixed formula redundancy in Model Specification, corrected Tukey return value docs, added mixed-model caveats
+
+### Technical
+- Removed ~150 lines of dead scipy fallback shims from `distributions.py`
+- Removed `_BACKEND` sentinel variable (only one backend exists)
+- C++ `generate_y()` now accepts `residual_dist` and `residual_df` parameters for non-normal error generation
+- `suppress_output` test fixture now actually suppresses stdout (was a no-op)
+- Removed unused `correlation_matrix_3x3` test fixture
+- Removed empty `tests/mcpower/` artifact directory
+- Added unit tests for `ResultsProcessor` (`test_results.py`)
+- Added unit tests for `normalize_upload_input` (`test_upload_data_utils.py`)
+- Added integration tests for `test_formula` feature (`test_test_formula.py`)
+- Added unit tests for `test_formula_utils` (`test_test_formula_utils.py`)
+- Rewrote optimizer tests to test native backend directly (removed dead scipy fallback tests)
+
## [0.5.4] - 2026-02-22
### Changed
diff --git a/README.md b/README.md
index a230021..2bd003a 100644
--- a/README.md
+++ b/README.md
@@ -21,6 +21,10 @@
It's a Python package, but prefer a graphical interface? **[MCPower GUI](https://github.com/pawlenartowicz/mcpower-gui)** is a standalone desktop app — no Python installation required. Download ready-to-run executables for Windows, Linux, and macOS from the [releases page](https://github.com/pawlenartowicz/mcpower-gui/releases/latest).
+| Model setup | Results |
+|:---:|:---:|
+| ![Model setup](docs/screenshots/gui-model-setup.png) | ![Results](docs/screenshots/gui-results.png) |
+
## Why MCPower?
Traditional power formulas break down with interactions, correlated predictors, categorical variables, or non-normal data. MCPower simulates instead — generates thousands of datasets like yours, fits your model, and counts how often the effects are detected.
@@ -297,19 +301,20 @@ model.set_effects("group[2]=0.4, group[3]=0.6, covariate=0.3")
# Use "vs" syntax for pairwise comparisons + correction="tukey"
model.find_power(
sample_size=150,
- target_test="group[0] vs group[1], group[0] vs group[2]",
+ target_test="group[1] vs group[2], group[1] vs group[3]",
correction="tukey"
)
```
### Test Individual Assumption Violations
```python
-# Manually add specific violations (without full scenario analysis)
-model.set_heterogeneity(0.2) # Effect sizes vary between people
-model.set_heteroskedasticity(0.15) # Violation of equal variance assumption
+# Add specific violations via custom scenario configs
+model.set_scenario_configs({
+ "my_test": {"heterogeneity": 0.2, "heteroskedasticity": 0.15}
+})
-# Run with your manual settings (no automatic scenario variations)
-model.find_sample_size(target_test="treatment")
+# Run with scenario variations
+model.find_sample_size(target_test="treatment", scenarios=True)
```
### Mixed-Effects Models
@@ -392,7 +397,7 @@ model.find_power(sample_size=200, progress_callback=False)
| **Factor effects** | **`model.set_effects("var[2]=0.5, var[3]=0.7")`** |
| Correlated predictors | `model.set_correlations("corr(var1, var2)=0.4")` |
| Multiple testing correction | Add `correction="FDR"`, `"Holm"`, `"Bonferroni"`, or `"Tukey"`|
-| Post-hoc pairwise comparison | `target_test="group[0] vs group[1]"` with `correction="tukey"` |
+| Post-hoc pairwise comparison | `target_test="group[1] vs group[2]"` with `correction="tukey"` |
| Mixed model (random intercept) | `MCPower("y ~ x + (1\|group)")` + `model.set_cluster(...)` |
| Random slopes | `MCPower("y ~ x + (1+x\|group)")` + `set_cluster(..., random_slopes=["x"], slope_variance=0.1)` |
| Nested random effects | `MCPower("y ~ x + (1\|A/B)")` + two `set_cluster()` calls |
@@ -424,7 +429,7 @@ model.find_power(sample_size=200, progress_callback=False)
- For simple models where all assumptions are clearly met.
- For large analyses with tens of thousands of observations, tiny effects, or very low alpha levels.
-## What Makes Scenarios Different? (Be careful, unvalidated, preliminary scenarios)
+## What Makes Scenarios Different? (Rule-of-thumb scenarios)
**Traditional power analysis assumes perfect conditions.** MCPower's scenarios add realistic "messiness":
@@ -478,8 +483,8 @@ model.set_variable_type("treatment=(factor,3), education=(factor,4)")
# Set effects for specific levels
model.set_effects("treatment[2]=0.5, treatment[3]=0.7, education[2]=0.3")
-# Or set same effect for all levels of a factor
-model.set_effects("treatment=0.5") # Applies to treatment[2] and treatment[3]
+# Each non-reference level needs its own effect
+model.set_effects("treatment[2]=0.5, treatment[3]=0.7")
# Important: Factors cannot be used in correlations
# This will error: model.set_correlations("corr(treatment, education)=0.3")
@@ -508,12 +513,31 @@ model.set_alpha(0.01) # Stricter significance (p < 0.01)
model.set_simulations(10000) # High precision (slower)
```
+### Model Misspecification Testing
+
+Use `test_formula` to generate data with one model but test with a simpler one -- useful for evaluating the power impact of omitting variables:
+
+```python
+# Generate with 3 predictors, test with 2 (omitting x3)
+model = MCPower("y = x1 + x2 + x3")
+model.set_effects("x1=0.5, x2=0.3, x3=0.2")
+model.find_power(100, test_formula="y = x1 + x2")
+
+# Generate with clusters, test without (ignoring clustering)
+model = MCPower("y ~ treatment + (1|school)")
+model.set_cluster("school", ICC=0.2, n_clusters=20)
+model.set_effects("treatment=0.5")
+model.find_power(1000, test_formula="y ~ treatment")
+```
+
+See the [Test Formula Tutorial](https://github.com/pawlenartowicz/MCPower/wiki/Tutorial-Test-Formula) for details.
+
### Formula Syntax
```python
# These are equivalent:
-"y = x1 + x2 + x1*x2" # Assignment style
-"y ~ x1 + x2 + x1*x2" # R-style formula
-"x1 + x2 + x1*x2" # Predictors only
+"y = x1 + x2 + x1:x2" # Assignment style
+"y ~ x1 + x2 + x1:x2" # R-style formula
+"x1 + x2 + x1:x2" # Predictors only
# Interactions:
"x1*x2" # Main effects + interaction (x1 + x2 + x1:x2)
@@ -538,9 +562,8 @@ model.set_correlations("(x1, x2)=0.3, (x1, x3)=-0.2")
## Requirements
- Python ≥ 3.10
-- NumPy, matplotlib, joblib
+- NumPy (≥1.26.0), matplotlib, joblib, tqdm
- pandas (optional, for DataFrame input — install with `pip install mcpower[pandas]`)
-- statsmodels (optional, for mixed-effects models — install with `pip install mcpower[all]`)
## Documentation
@@ -549,11 +572,11 @@ Full documentation is available on the **[MCPower Wiki](https://github.com/pawle
- [Quick Start](https://github.com/pawlenartowicz/MCPower/wiki/Quick-Start)
- [Model Specification](https://github.com/pawlenartowicz/MCPower/wiki/Model-Specification)
-- [Variable Types](https://github.com/pawlenartowicz/MCPower/wiki/Variable-Types)
-- [Effect Sizes](https://github.com/pawlenartowicz/MCPower/wiki/Effect-Sizes)
-- [Mixed-Effects Models](https://github.com/pawlenartowicz/MCPower/wiki/Mixed-Effects-Models) (random intercepts, slopes, nested effects)
-- [ANOVA & Post-Hoc Tests](https://github.com/pawlenartowicz/MCPower/wiki/ANOVA-and-Post-Hoc-Tests)
-- [Scenario Analysis](https://github.com/pawlenartowicz/MCPower/wiki/Scenario-Analysis)
+- [Variable Types](https://github.com/pawlenartowicz/MCPower/wiki/Concept-Variable-Types)
+- [Effect Sizes](https://github.com/pawlenartowicz/MCPower/wiki/Concept-Effect-Sizes)
+- [Mixed-Effects Models](https://github.com/pawlenartowicz/MCPower/wiki/Concept-Mixed-Effects) (random intercepts, slopes, nested effects)
+- [ANOVA & Post-Hoc Tests](https://github.com/pawlenartowicz/MCPower/wiki/Tutorial-ANOVA-PostHoc)
+- [Scenario Analysis](https://github.com/pawlenartowicz/MCPower/wiki/Concept-Scenario-Analysis)
- [API Reference](https://github.com/pawlenartowicz/MCPower/wiki/API-Reference)
## Need Help?
@@ -568,8 +591,8 @@ Full documentation is available on the **[MCPower Wiki](https://github.com/pawle
- ✅ C++ native backend (pybind11 + Eigen, 3x speedup)
- ✅ Mixed Effects Models (random intercepts, random slopes, nested effects) — [validated against lme4](https://github.com/pawlenartowicz/MCPower/wiki/Concept-LME-Validation)
- 🚧 Logistic Regression (coming soon)
-- 🚧 ANOVA (coming soon)
-- 🚧 Guide about methods, corrections (coming soon)
+- ✅ ANOVA (factor variables as ANOVA, post-hoc pairwise comparisons)
+- ✅ Guide about methods, corrections
- 📋 2 groups comparison with alternative tests
- 📋 Robust regression methods
@@ -578,16 +601,18 @@ Full documentation is available on the **[MCPower Wiki](https://github.com/pawle
GPL v3. If you use MCPower in research, please cite:
-Lenartowicz, P. (2025). MCPower: Monte Carlo Power Analysis for Statistical Models. Zenodo. DOI: 10.5281/zenodo.16502734
+Lenartowicz, P. (2025). MCPower: Monte Carlo Power Analysis for Complex Statistical Models (Version <version>) [Computer software]. Zenodo. https://doi.org/10.5281/zenodo.16502734
+
+*Replace `<version>` with the version you used — check with `import mcpower; print(mcpower.__version__)`.*
```bibtex
@software{mcpower2025,
- author = {Pawel Lenartowicz},
- title = {MCPower: Monte Carlo Power Analysis for Statistical Models},
- year = {2025},
+ author = {Lenartowicz, Pawe{\l}},
+ title = {{MCPower}: Monte Carlo Power Analysis for Complex Statistical Models},
+ year = {2025},
publisher = {Zenodo},
- doi = {10.5281/zenodo.16502734},
- url = {https://doi.org/10.5281/zenodo.16502734}
+ doi = {10.5281/zenodo.16502734},
+ url = {https://doi.org/10.5281/zenodo.16502734}
}
```
diff --git a/cpp/src/bindings.cpp b/cpp/src/bindings.cpp
index 26fee22..8c02998 100644
--- a/cpp/src/bindings.cpp
+++ b/cpp/src/bindings.cpp
@@ -110,7 +110,9 @@ py::array_t<double> generate_y_wrapper(
py::array_t<double> effects,
double heterogeneity,
double heteroskedasticity,
- int seed
+ int seed,
+ int residual_dist,
+ double residual_df
) {
auto X_buf = X.request();
auto effects_buf = effects.request();
@@ -129,7 +131,8 @@ py::array_t generate_y_wrapper(
);
Eigen::VectorXd y = generate_y(
- X_map, effects_map, heterogeneity, heteroskedasticity, seed
+ X_map, effects_map, heterogeneity, heteroskedasticity, seed,
+ residual_dist, residual_df
);
py::array_t<double> result(n);
@@ -447,7 +450,9 @@ PYBIND11_MODULE(mcpower_native, m) {
py::arg("heterogeneity") = 0.0,
py::arg("heteroskedasticity") = 0.0,
py::arg("seed") = -1,
- "Generate dependent variable with heterogeneity and heteroskedasticity"
+ py::arg("residual_dist") = 0,
+ py::arg("residual_df") = 10.0,
+ "Generate dependent variable with heterogeneity, heteroskedasticity, and non-normal residuals"
);
// LME analysis (q=1 random intercept)
diff --git a/cpp/src/ols.cpp b/cpp/src/ols.cpp
index 7d04ec2..11bd62f 100644
--- a/cpp/src/ols.cpp
+++ b/cpp/src/ols.cpp
@@ -151,19 +151,14 @@ Eigen::VectorXd generate_y(
const Eigen::Ref<const Eigen::VectorXd>& effects,
double heterogeneity,
double heteroskedasticity,
- int seed
+ int seed,
+ int residual_dist,
+ double residual_df
) {
const int n = static_cast<int>(X.rows());
const int p = static_cast<int>(X.cols());
- // Set up random generator
std::mt19937 gen;
- if (seed >= 0) {
- gen.seed(static_cast<unsigned int>(seed));
- } else {
- std::random_device rd;
- gen.seed(rd());
- }
std::normal_distribution<double> normal(0.0, 1.0);
// Linear predictor with heterogeneity
@@ -176,9 +171,12 @@ Eigen::VectorXd generate_y(
// Heterogeneity: vary effect sizes per observation
linear_pred.setZero();
- // Change seed for heterogeneity noise
+ // Seed at offset +1 for heterogeneity noise
if (seed >= 0) {
gen.seed(static_cast<unsigned int>(seed + 1));
+ } else {
+ std::random_device rd;
+ gen.seed(rd());
}
for (int j = 0; j < p; ++j) {
@@ -192,14 +190,43 @@ Eigen::VectorXd generate_y(
}
}
- // Generate errors
+ // Generate errors — seed at offset +2
if (seed >= 0) {
gen.seed(static_cast<unsigned int>(seed + 2));
+ } else {
+ std::random_device rd;
+ gen.seed(rd());
}
Eigen::VectorXd error(n);
- for (int i = 0; i < n; ++i) {
- error(i) = normal(gen);
+
+ if (residual_dist == 1) {
+ // Heavy-tailed: Student's t distribution
+ double df = std::max(residual_df, 3.0);
+ std::student_t_distribution<double> t_dist(df);
+ double theoretical_scale = 1.0 / std::sqrt(df / (df - 2.0));
+ for (int i = 0; i < n; ++i) {
+ error(i) = t_dist(gen) * theoretical_scale;
+ }
+ } else if (residual_dist == 2) {
+ // Skewed: chi-squared, centered and scaled
+ double df = std::max(residual_df, 3.0);
+ std::chi_squared_distribution<double> chi2_dist(df);
+ double scale = 1.0 / std::sqrt(2.0 * df);
+ for (int i = 0; i < n; ++i) {
+ error(i) = (chi2_dist(gen) - df) * scale;
+ }
+ } else {
+ // Normal (default)
+ for (int i = 0; i < n; ++i) {
+ error(i) = normal(gen);
+ }
+ }
+
+ // Empirical re-standardization to SD = 1
+ double empirical_sd = std::sqrt(error.array().square().mean());
+ if (empirical_sd > FLOAT_NEAR_ZERO) {
+ error /= empirical_sd;
}
// Apply heteroskedasticity
diff --git a/cpp/src/ols.hpp b/cpp/src/ols.hpp
index ad1f9b6..9e046eb 100644
--- a/cpp/src/ols.hpp
+++ b/cpp/src/ols.hpp
@@ -65,6 +65,8 @@ class OLSAnalyzer {
* @param heterogeneity SD of effect size variation
* @param heteroskedasticity Correlation between predictor and error variance
* @param seed Random seed (-1 for random)
+ * @param residual_dist Error distribution: 0=normal, 1=heavy_tailed (t), 2=skewed (chi2)
+ * @param residual_df Degrees of freedom for non-normal residuals (min clamped to 3)
* @return Response vector (n_samples,)
*/
Eigen::VectorXd generate_y(
@@ -72,7 +74,9 @@ Eigen::VectorXd generate_y(
const Eigen::Ref<const Eigen::VectorXd>& effects,
double heterogeneity,
double heteroskedasticity,
- int seed
+ int seed,
+ int residual_dist = 0,
+ double residual_df = 10.0
);
} // namespace mcpower
diff --git a/docs/screenshots/gui-model-setup.png b/docs/screenshots/gui-model-setup.png
new file mode 100644
index 0000000..7f87a53
Binary files /dev/null and b/docs/screenshots/gui-model-setup.png differ
diff --git a/docs/screenshots/gui-results.png b/docs/screenshots/gui-results.png
new file mode 100644
index 0000000..f84152d
Binary files /dev/null and b/docs/screenshots/gui-results.png differ
diff --git a/mcpower/__init__.py b/mcpower/__init__.py
index a52560c..675a4f3 100644
--- a/mcpower/__init__.py
+++ b/mcpower/__init__.py
@@ -16,7 +16,6 @@
from importlib.metadata import version as _get_version
-from .backends import get_backend_info, set_backend
from .model import MCPower
from .progress import PrintReporter, ProgressReporter, SimulationCancelled, TqdmReporter
@@ -27,14 +26,14 @@
__all__ = [
"MCPower",
"SimulationCancelled",
- "set_backend",
- "get_backend_info",
"ProgressReporter",
"PrintReporter",
"TqdmReporter",
]
+import threading as _threading
+
from .utils.updates import _check_for_updates
-_check_for_updates(__version__)
+_threading.Thread(target=_check_for_updates, args=(__version__,), daemon=True).start()
diff --git a/mcpower/backends/__init__.py b/mcpower/backends/__init__.py
index 7bb03f8..8b24a73 100644
--- a/mcpower/backends/__init__.py
+++ b/mcpower/backends/__init__.py
@@ -3,11 +3,9 @@
This module provides a unified interface for compute backends.
The only supported backend is native C++ (compiled via pybind11).
-
-Users can override via set_backend('c++' | 'default') or pass a ComputeBackend instance.
"""
-from typing import Optional, Protocol, Union, runtime_checkable
+from typing import Optional, Protocol, runtime_checkable
import numpy as np
@@ -24,6 +22,7 @@ def ols_analysis(
f_crit: float,
t_crit: float,
correction_t_crits: np.ndarray,
+ # correction_method encoding: 0=none, 1=Bonferroni, 2=FDR (BH), 3=Holm
correction_method: int,
) -> np.ndarray:
"""Run OLS regression and return significance flags.
@@ -40,9 +39,15 @@ def generate_y(
heterogeneity: float,
heteroskedasticity: float,
seed: int,
+ residual_dist: int = 0,
+ residual_df: float = 10.0,
) -> np.ndarray:
"""Generate the dependent variable ``y = X @ effects + error``.
+ Args:
+ residual_dist: Error distribution (0=normal, 1=heavy_tailed, 2=skewed).
+ residual_df: Degrees of freedom for non-normal residuals.
+
Returns:
1-D array of length ``n_samples``.
"""
@@ -88,12 +93,8 @@ def lme_analysis(
...
-# Valid backend names for set_backend()
-_BACKEND_NAMES = {"default", "c++"}
-
# Global backend instance
_backend_instance: Optional[ComputeBackend] = None
-_backend_forced = False
def get_backend() -> ComputeBackend:
@@ -101,7 +102,7 @@ def get_backend() -> ComputeBackend:
Get the active compute backend.
On first call, instantiates the C++ native backend.
- Subsequent calls return the cached instance unless reset_backend() is called.
+ Subsequent calls return the cached instance.
Raises:
ImportError: If the C++ extension is not compiled/installed.
@@ -117,64 +118,7 @@ def get_backend() -> ComputeBackend:
return _backend_instance
-def set_backend(backend: Union[str, ComputeBackend]) -> None:
- """
- Set the compute backend.
-
- Args:
- backend: One of:
- - 'default' -- use native C++ backend
- - 'c++' -- force native C++ backend
- - A ComputeBackend instance
-
- Raises:
- ImportError: If the C++ backend is not available.
- ValueError: If the string is not recognized.
- """
- global _backend_instance, _backend_forced
-
- if isinstance(backend, str):
- name = backend.lower().strip()
- if name not in _BACKEND_NAMES:
- raise ValueError(f"Unknown backend {backend!r}. Choose from: {', '.join(sorted(_BACKEND_NAMES))}")
-
- from .native import NativeBackend
-
- _backend_instance = NativeBackend()
- _backend_forced = name != "default"
- else:
- _backend_instance = backend
- _backend_forced = True
-
-
-def reset_backend() -> None:
- """Reset backend to automatic selection."""
- global _backend_instance, _backend_forced
- _backend_instance = None
- _backend_forced = False
-
-
-def get_backend_info() -> dict:
- """
- Get information about the current backend.
-
- Returns:
- Dictionary with backend name, type, and whether it was forced.
- """
- backend = get_backend()
- name = type(backend).__name__
- return {
- "name": name,
- "is_native": name == "NativeBackend",
- "module": type(backend).__module__,
- "forced": _backend_forced,
- }
-
-
__all__ = [
"ComputeBackend",
"get_backend",
- "set_backend",
- "reset_backend",
- "get_backend_info",
]
diff --git a/mcpower/backends/native.py b/mcpower/backends/native.py
index acd7633..2338cfe 100644
--- a/mcpower/backends/native.py
+++ b/mcpower/backends/native.py
@@ -17,6 +17,11 @@
mcpower_native = None
+def _prep(arr: np.ndarray, dtype=np.float64) -> np.ndarray:
+ """Ensure array is contiguous with the expected dtype for C++ interop."""
+ return np.ascontiguousarray(arr, dtype=dtype)
+
+
class NativeBackend:
"""
C++ compute backend using pybind11 bindings.
@@ -46,8 +51,8 @@ def _initialize_tables(self) -> None:
t3_ppf = manager.load_t3_ppf_table()
# Ensure correct dtypes
- norm_cdf = np.ascontiguousarray(norm_cdf.astype(np.float64))
- t3_ppf = np.ascontiguousarray(t3_ppf.astype(np.float64))
+ norm_cdf = _prep(norm_cdf)
+ t3_ppf = _prep(t3_ppf)
# Initialize C++ tables (generation tables only)
mcpower_native.init_tables(norm_cdf, t3_ppf)
@@ -77,10 +82,10 @@ def ols_analysis(
Returns:
Array: [f_sig, uncorrected..., corrected...]
"""
- X = np.ascontiguousarray(X, dtype=np.float64)
- y = np.ascontiguousarray(y, dtype=np.float64)
- target_indices = np.ascontiguousarray(target_indices, dtype=np.int32)
- correction_t_crits = np.ascontiguousarray(correction_t_crits, dtype=np.float64)
+ X = _prep(X)
+ y = _prep(y)
+ target_indices = _prep(target_indices, np.int32)
+ correction_t_crits = _prep(correction_t_crits)
return mcpower_native.ols_analysis(X, y, target_indices, f_crit, t_crit, correction_t_crits, correction_method) # type: ignore[no-any-return]
@@ -91,6 +96,8 @@ def generate_y(
heterogeneity: float,
heteroskedasticity: float,
seed: int,
+ residual_dist: int = 0,
+ residual_df: float = 10.0,
) -> np.ndarray:
"""
Generate dependent variable.
@@ -101,14 +108,16 @@ def generate_y(
heterogeneity: Effect size variation SD
heteroskedasticity: Error-predictor correlation
seed: Random seed (-1 for random)
+ residual_dist: Error distribution (0=normal, 1=heavy_tailed, 2=skewed)
+ residual_df: Degrees of freedom for non-normal residuals
Returns:
Response vector (n_samples,)
"""
- X = np.ascontiguousarray(X, dtype=np.float64)
- effects = np.ascontiguousarray(effects, dtype=np.float64)
+ X = _prep(X)
+ effects = _prep(effects)
- return mcpower_native.generate_y(X, effects, heterogeneity, heteroskedasticity, seed) # type: ignore[no-any-return]
+ return mcpower_native.generate_y(X, effects, heterogeneity, heteroskedasticity, seed, residual_dist, residual_df) # type: ignore[no-any-return]
def generate_X(
self,
@@ -137,11 +146,11 @@ def generate_X(
Returns:
Design matrix (n_samples, n_vars)
"""
- correlation_matrix = np.ascontiguousarray(correlation_matrix, dtype=np.float64)
- var_types = np.ascontiguousarray(var_types, dtype=np.int32)
- var_params = np.ascontiguousarray(var_params, dtype=np.float64)
- upload_normal = np.ascontiguousarray(upload_normal, dtype=np.float64)
- upload_data = np.ascontiguousarray(upload_data, dtype=np.float64)
+ correlation_matrix = _prep(correlation_matrix)
+ var_types = _prep(var_types, np.int32)
+ var_params = _prep(var_params)
+ upload_normal = _prep(upload_normal)
+ upload_data = _prep(upload_data)
return mcpower_native.generate_X( # type: ignore[no-any-return]
n_samples,
@@ -185,12 +194,15 @@ def lme_analysis(
Returns:
Array: [f_sig, uncorrected..., corrected..., wald_flag]
or empty array on failure
+
+ wald_flag: 1.0 if the Wald test was used as fallback for the overall
+ significance test (instead of the likelihood ratio test), 0.0 otherwise.
"""
- X = np.ascontiguousarray(X, dtype=np.float64)
- y = np.ascontiguousarray(y, dtype=np.float64)
- cluster_ids = np.ascontiguousarray(cluster_ids, dtype=np.int32)
- target_indices = np.ascontiguousarray(target_indices, dtype=np.int32)
- correction_z_crits = np.ascontiguousarray(correction_z_crits, dtype=np.float64)
+ X = _prep(X)
+ y = _prep(y)
+ cluster_ids = _prep(cluster_ids, np.int32)
+ target_indices = _prep(target_indices, np.int32)
+ correction_z_crits = _prep(correction_z_crits)
return mcpower_native.lme_analysis( # type: ignore[no-any-return]
X,
@@ -240,14 +252,17 @@ def lme_analysis_general(
Returns:
Array: [f_sig, uncorrected..., corrected..., wald_flag]
or empty array on failure
+
+ wald_flag: 1.0 if the Wald test was used as fallback for the overall
+ significance test (instead of the likelihood ratio test), 0.0 otherwise.
"""
- X = np.ascontiguousarray(X, dtype=np.float64)
- y = np.ascontiguousarray(y, dtype=np.float64)
- Z = np.ascontiguousarray(Z, dtype=np.float64)
- cluster_ids = np.ascontiguousarray(cluster_ids, dtype=np.int32)
- target_indices = np.ascontiguousarray(target_indices, dtype=np.int32)
- correction_z_crits = np.ascontiguousarray(correction_z_crits, dtype=np.float64)
- warm_theta = np.ascontiguousarray(warm_theta, dtype=np.float64)
+ X = _prep(X)
+ y = _prep(y)
+ Z = _prep(Z)
+ cluster_ids = _prep(cluster_ids, np.int32)
+ target_indices = _prep(target_indices, np.int32)
+ correction_z_crits = _prep(correction_z_crits)
+ warm_theta = _prep(warm_theta)
return mcpower_native.lme_analysis_general( # type: ignore[no-any-return]
X,
@@ -301,15 +316,18 @@ def lme_analysis_nested(
Returns:
Array: [f_sig, uncorrected..., corrected..., wald_flag]
or empty array on failure
+
+ wald_flag: 1.0 if the Wald test was used as fallback for the overall
+ significance test (instead of the likelihood ratio test), 0.0 otherwise.
"""
- X = np.ascontiguousarray(X, dtype=np.float64)
- y = np.ascontiguousarray(y, dtype=np.float64)
- parent_ids = np.ascontiguousarray(parent_ids, dtype=np.int32)
- child_ids = np.ascontiguousarray(child_ids, dtype=np.int32)
- child_to_parent = np.ascontiguousarray(child_to_parent, dtype=np.int32)
- target_indices = np.ascontiguousarray(target_indices, dtype=np.int32)
- correction_z_crits = np.ascontiguousarray(correction_z_crits, dtype=np.float64)
- warm_theta = np.ascontiguousarray(warm_theta, dtype=np.float64)
+ X = _prep(X)
+ y = _prep(y)
+ parent_ids = _prep(parent_ids, np.int32)
+ child_ids = _prep(child_ids, np.int32)
+ child_to_parent = _prep(child_to_parent, np.int32)
+ target_indices = _prep(target_indices, np.int32)
+ correction_z_crits = _prep(correction_z_crits)
+ warm_theta = _prep(warm_theta)
return mcpower_native.lme_analysis_nested( # type: ignore[no-any-return]
X,
diff --git a/mcpower/core/results.py b/mcpower/core/results.py
index 45b178a..dfe6b77 100644
--- a/mcpower/core/results.py
+++ b/mcpower/core/results.py
@@ -54,6 +54,7 @@ def calculate_powers(
# Individual powers
individual_powers = {}
individual_powers_corrected = {}
+ non_overall_tests = [t for t in target_tests if t != "overall"]
for test in target_tests:
if test == "overall":
@@ -62,7 +63,6 @@ def calculate_powers(
individual_powers_corrected[test] = np.mean(results_corrected_array[:, 0]) * 100
else:
# Find position among non-'overall' tests and add 1 for F-test offset
- non_overall_tests = [t for t in target_tests if t != "overall"]
pos = non_overall_tests.index(test)
col_idx = pos + 1 # +1 because column 0 is F-test
individual_powers[test] = np.mean(results_array[:, col_idx]) * 100
diff --git a/mcpower/core/scenarios.py b/mcpower/core/scenarios.py
index 454f8e3..2d2dd01 100644
--- a/mcpower/core/scenarios.py
+++ b/mcpower/core/scenarios.py
@@ -13,35 +13,51 @@
from ..utils.visualization import _create_power_plot
# Default scenario configurations.
+# "optimistic" is the zero-perturbation baseline — also used as the default
+# scenario_config when scenarios=False and as a template for custom scenarios
+# (ensures all required keys exist).
# "realistic" introduces moderate assumption violations; "doomer" introduces
# severe violations. Each simulation iteration draws random perturbations
# from these parameters (correlation noise, distribution swaps, etc.).
DEFAULT_SCENARIO_CONFIG = {
+ "optimistic": {
+ "heterogeneity": 0.0,
+ "heteroskedasticity": 0.0,
+ "correlation_noise_sd": 0.0,
+ "distribution_change_prob": 0.0,
+ "new_distributions": ["right_skewed", "left_skewed", "uniform"],
+ # Mixed model perturbations (only consumed when cluster_specs present)
+ "random_effect_dist": "normal",
+ "random_effect_df": 5,
+ "icc_noise_sd": 0.0,
+ # Residual distribution perturbations (all model types)
+ "residual_dists": ["heavy_tailed", "skewed"],
+ "residual_change_prob": 0.0,
+ "residual_df": 10,
+ },
"realistic": {
"heterogeneity": 0.2,
- "heteroskedasticity": 0.1,
- "correlation_noise_sd": 0.2,
- "distribution_change_prob": 0.3,
+ "heteroskedasticity": 0.15,
+ "correlation_noise_sd": 0.15,
+ "distribution_change_prob": 0.5,
"new_distributions": ["right_skewed", "left_skewed", "uniform"],
- # LME-specific keys (only consumed when cluster_specs present)
"random_effect_dist": "heavy_tailed",
- "random_effect_df": 5,
+ "random_effect_df": 10,
"icc_noise_sd": 0.15,
- "residual_dist": "heavy_tailed",
- "residual_change_prob": 0.3,
- "residual_df": 10,
+ "residual_dists": ["heavy_tailed", "skewed"],
+ "residual_change_prob": 0.5,
+ "residual_df": 8,
},
"doomer": {
"heterogeneity": 0.4,
- "heteroskedasticity": 0.2,
- "correlation_noise_sd": 0.4,
- "distribution_change_prob": 0.6,
+ "heteroskedasticity": 0.35,
+ "correlation_noise_sd": 0.30,
+ "distribution_change_prob": 0.8,
"new_distributions": ["right_skewed", "left_skewed", "uniform"],
- # LME-specific keys (only consumed when cluster_specs present)
"random_effect_dist": "heavy_tailed",
- "random_effect_df": 3,
+ "random_effect_df": 5,
"icc_noise_sd": 0.30,
- "residual_dist": "heavy_tailed",
+ "residual_dists": ["heavy_tailed", "skewed"],
"residual_change_prob": 0.8,
"residual_df": 5,
},
@@ -111,15 +127,7 @@ def run_power_analysis(
if progress is not None:
progress.start()
- # Optimistic (user's original settings)
- results["optimistic"] = run_find_power_func(
- sample_size=sample_size,
- target_tests=target_tests,
- correction=correction,
- scenario_config=None,
- )
-
- # Realistic & Doomer scenarios
+ # Run all scenarios (optimistic is always present as zero-perturbation baseline)
for scenario_name, config in self.configs.items():
results[scenario_name] = run_find_power_func(
sample_size=sample_size,
@@ -175,15 +183,7 @@ def run_sample_size_analysis(
if progress is not None:
progress.start()
- # Optimistic
- results["optimistic"] = run_sample_size_func(
- sample_sizes=sample_sizes,
- target_tests=target_tests,
- correction=correction,
- scenario_config=None,
- )
-
- # Other scenarios
+ # Run all scenarios (optimistic is always present as zero-perturbation baseline)
for scenario_name, config in self.configs.items():
results[scenario_name] = run_sample_size_func(
sample_sizes=sample_sizes,
@@ -209,8 +209,9 @@ def run_sample_size_analysis(
def _create_scenario_plots(self, results: Dict) -> None:
"""Create visualizations for scenario analysis."""
scenarios = results["scenarios"]
- scenario_names = ["optimistic", "realistic", "doomer"]
- scenario_labels = ["Optimistic", "Realistic", "Doomer"]
+ # Derive scenario order from results: optimistic first, then config keys
+ scenario_names = ["optimistic"] + [k for k in scenarios if k != "optimistic"]
+ scenario_labels = [name.title() for name in scenario_names]
first_scenario = scenarios.get("optimistic", {})
if "results" not in first_scenario or "sample_sizes_tested" not in first_scenario["results"]:
@@ -286,7 +287,7 @@ def apply_lme_perturbations(
if icc_noise_sd == 0.0 and re_dist == "normal":
return None
- rng = np.random.RandomState(sim_seed + 5000 if sim_seed is not None else None)
+ rng = np.random.RandomState(sim_seed + 6 if sim_seed is not None else None)
# ICC jitter: multiplicative noise on tau_squared per grouping variable
tau_squared_multipliers: Dict[str, float] = {}
@@ -304,70 +305,6 @@ def apply_lme_perturbations(
}
-def apply_lme_residual_perturbations(
- y: np.ndarray,
- scenario_config: Dict,
- sim_seed: Optional[int],
-) -> np.ndarray:
- """Replace normal residuals with non-normal if coin flip succeeds.
-
- For each simulation, independently flips a coin (probability
- ``residual_change_prob``) to decide whether residuals are replaced.
- If activated, reproduces the original N(0,1) errors via the known
- seed, generates replacements from t(df) or shifted χ², and applies
- the correction ``y += (new_error - original_error)``.
-
- Args:
- y: Dependent variable array (modified in-place).
- scenario_config: Scenario parameters with residual keys.
- sim_seed: Random seed for reproducibility.
-
- Returns:
- The (possibly modified) dependent variable array.
- """
- residual_dist = scenario_config.get("residual_dist", "normal")
- residual_change_prob = scenario_config.get("residual_change_prob", 0.0)
- residual_df = scenario_config.get("residual_df", 10)
-
- if residual_dist == "normal" or residual_change_prob <= 0.0:
- return y
-
- rng = np.random.RandomState(sim_seed + 6000 if sim_seed is not None else None)
-
- # Coin flip: should this simulation have non-normal residuals?
- if rng.random() > residual_change_prob:
- return y
-
- n = len(y)
-
- # Reproduce the original N(0,1) errors using the same seed as generate_y
- # generate_y uses sim_seed + 2 for error generation
- original_rng = np.random.RandomState(sim_seed + 2 if sim_seed is not None else None)
- original_errors = original_rng.standard_normal(n)
-
- # Generate replacement errors
- replacement_rng = np.random.RandomState(sim_seed + 6001 if sim_seed is not None else None)
-
- if residual_dist == "heavy_tailed":
- # t(df) scaled to have variance 1
- df = max(residual_df, 3)
- raw = replacement_rng.standard_t(df, size=n)
- # t(df) has variance df/(df-2), scale to unit variance
- scale = 1.0 / np.sqrt(df / (df - 2))
- new_errors = raw * scale
- elif residual_dist == "skewed":
- # Shifted chi-squared: mean=0, variance=1
- df = max(residual_df, 3)
- raw = replacement_rng.chisquare(df, size=n)
- new_errors = (raw - df) / np.sqrt(2 * df)
- else:
- return y
-
- # Apply correction: swap out original errors for new ones
- y = y + (new_errors - original_errors)
- return y
-
-
def apply_per_simulation_perturbations(
correlation_matrix: np.ndarray,
var_types: np.ndarray,
@@ -393,19 +330,22 @@ def apply_per_simulation_perturbations(
if scenario_config is None:
return correlation_matrix, var_types
- rng = np.random.RandomState(sim_seed)
+ rng = np.random.RandomState(sim_seed + 5 if sim_seed is not None else None)
# Perturb correlation matrix
perturbed_corr = correlation_matrix
- if correlation_matrix is not None and scenario_config["correlation_noise_sd"] > 0:
+ if correlation_matrix is not None and scenario_config.get("correlation_noise_sd", 0) > 0:
perturbed_corr = correlation_matrix.copy()
noise = rng.normal(0, scenario_config["correlation_noise_sd"], correlation_matrix.shape)
noise = (noise + noise.T) / 2 # Keep symmetric
perturbed_corr += noise
+ # Clip off-diagonal correlations to [-0.8, 0.8] to prevent near-singular
+ # matrices that cause Cholesky decomposition failures in data generation.
perturbed_corr = np.clip(perturbed_corr, -0.8, 0.8)
np.fill_diagonal(perturbed_corr, 1.0)
- # Ensure positive semi-definiteness via eigenvalue clipping
+ # Nearest correlation matrix repair via spectral clipping: set negative
+ # eigenvalues to zero and reconstruct, then re-normalize to unit diagonal.
eigvals, eigvecs = np.linalg.eigh(perturbed_corr)
if np.any(eigvals < 0):
eigvals = np.maximum(eigvals, 0.0)
@@ -417,7 +357,7 @@ def apply_per_simulation_perturbations(
# Perturb variable types
perturbed_var_types = var_types.copy()
- if scenario_config["distribution_change_prob"] > 0:
+ if scenario_config.get("distribution_change_prob", 0) > 0:
type_mapping = {"right_skewed": 2, "left_skewed": 3, "uniform": 5}
new_type_codes = [type_mapping[distribution] for distribution in scenario_config["new_distributions"]]
diff --git a/mcpower/core/simulation.py b/mcpower/core/simulation.py
index 266a39e..2223324 100644
--- a/mcpower/core/simulation.py
+++ b/mcpower/core/simulation.py
@@ -61,7 +61,7 @@ def __init__(
Args:
n_simulations: Number of Monte Carlo iterations.
seed: Base random seed. Each iteration uses
- ``seed + 4 * sim_id``.
+ ``seed + 12 * sim_id``.
alpha: Significance level for hypothesis tests.
parallel: Parallel processing mode (unused inside the
runner itself; parallelism is handled at the
@@ -143,12 +143,19 @@ def run_power_simulations(
if metadata.cluster_specs:
from ..stats.lme_solver import compute_lme_critical_values
- n_fixed = len(metadata.target_indices)
- # n_fixed_effects = number of columns in X_expanded (excluding intercept)
- # This equals the total effect count minus cluster effects
- n_fixed_total = len(metadata.effect_sizes)
- if metadata.cluster_effect_indices:
- n_fixed_total -= len(metadata.cluster_effect_indices)
+ # Use test formula dimensions when subsetting with random effects
+ if metadata.test_column_indices is not None and metadata.test_has_random_effects:
+ if metadata.test_target_indices is None:
+ raise RuntimeError("test_target_indices must be set when test_column_indices is present")
+ n_fixed = len(metadata.test_target_indices)
+ n_fixed_total = metadata.test_effect_count
+ else:
+ n_fixed = len(metadata.target_indices)
+ # n_fixed_effects = number of columns in X_expanded (excluding intercept)
+ # This equals the total effect count minus cluster effects
+ n_fixed_total = len(metadata.effect_sizes)
+ if metadata.cluster_effect_indices:
+ n_fixed_total -= len(metadata.cluster_effect_indices)
chi2_crit, z_crit, correction_z_crits = compute_lme_critical_values(
self.alpha, n_fixed_total, n_fixed, metadata.correction_method
)
@@ -162,19 +169,18 @@ def run_power_simulations(
raise SimulationCancelled("Simulation cancelled by user")
- sim_seed = self.seed + 4 * sim_id if self.seed is not None else None
-
- # Apply perturbations if in scenario mode
- if scenario_config is not None and apply_perturbations_func is not None:
- perturbed_corr, perturbed_types = apply_perturbations_func(
- metadata.correlation_matrix,
- metadata.var_types,
- scenario_config,
- sim_seed,
- )
- else:
- perturbed_corr = metadata.correlation_matrix
- perturbed_types = metadata.var_types
+ sim_seed = self.seed + 12 * sim_id if self.seed is not None else None
+
+ # Apply per-simulation perturbations (correlation noise, distribution swaps)
+ # Zero-valued params in optimistic scenario are no-ops
+ if apply_perturbations_func is None:
+ raise RuntimeError("apply_perturbations_func must be provided")
+ perturbed_corr, perturbed_types = apply_perturbations_func(
+ metadata.correlation_matrix,
+ metadata.var_types,
+ scenario_config,
+ sim_seed,
+ )
result = self._single_simulation(
sim_id=sim_id,
@@ -326,7 +332,9 @@ def _single_simulation(
first_spec = next(iter(metadata.cluster_specs.values()))
sample_size = first_spec["n_clusters"] * first_spec["cluster_size"]
- # Check if strict mode with uploaded data
+ # Strict-mode bootstrap: resample whole rows from uploaded data to
+ # preserve exact inter-variable relationships, then generate y from
+ # the bootstrapped X. This bypasses the normal X-generation pipeline.
if metadata.preserve_correlation == "strict" and metadata.uploaded_raw_data is not None:
# Strict mode: bootstrap uploaded data + generate created variables separately
from ..stats.data_generation import bootstrap_uploaded_data
@@ -336,7 +344,7 @@ def _single_simulation(
sample_size,
metadata.uploaded_raw_data,
metadata.uploaded_var_metadata,
- sim_seed,
+ sim_seed + 3 if sim_seed is not None else None,
)
# Merge uploaded and created non-factor variables
@@ -367,7 +375,7 @@ def _single_simulation(
X_factors = X_uploaded_factors
else:
# Mixed: generate all factors, replace uploaded factor columns
- X_factors = _generate_factors(sample_size, metadata.factor_specs, sim_seed)
+ X_factors = _generate_factors(sample_size, metadata.factor_specs, sim_seed + 3 if sim_seed is not None else None)
# Overwrite uploaded factor dummy columns with bootstrapped data
if X_uploaded_factors.shape[1] > 0:
col_offset = 0
@@ -400,14 +408,14 @@ def _single_simulation(
X_non_factors = np.empty((sample_size, 0), dtype=float)
# Generate factor variables (as dummy variables)
- X_factors = _generate_factors(sample_size, metadata.factor_specs, sim_seed)
+ X_factors = _generate_factors(sample_size, metadata.factor_specs, sim_seed + 3 if sim_seed is not None else None)
# Compute LME perturbations (ICC jitter, non-normal RE dist)
lme_perturbations = None
- if metadata.cluster_specs and scenario_config is not None:
+ if metadata.cluster_specs:
from ..core.scenarios import apply_lme_perturbations
- lme_perturbations = apply_lme_perturbations(metadata.cluster_specs, scenario_config, sim_seed)
+ lme_perturbations = apply_lme_perturbations(metadata.cluster_specs, scenario_config or {}, sim_seed)
# Generate cluster random effects (independent of upload mode)
re_result = None # Phase 2: random effects result for slopes/nesting
@@ -448,6 +456,16 @@ def _single_simulation(
# Create extended design matrix with interactions (excludes cluster effects)
X_expanded = create_X_extended_func(X)
+ # Test formula column subsetting: use reduced design matrix for analysis
+ if metadata.test_column_indices is not None:
+ X_test = X_expanded[:, metadata.test_column_indices]
+ if metadata.test_target_indices is None:
+ raise RuntimeError("test_target_indices must be set when test_column_indices is present")
+ test_target_indices = metadata.test_target_indices
+ else:
+ X_test = X_expanded
+ test_target_indices = metadata.target_indices
+
# Split effect sizes: fixed effects vs cluster effects
# Use precomputed values (Phase 2 optimization)
if metadata.cluster_effect_indices:
@@ -457,6 +475,21 @@ def _single_simulation(
fixed_effect_sizes = metadata.fixed_effect_sizes_cached
cluster_effect_sizes = None
+ # Residual coin flip: decide whether this simulation uses non-normal errors
+ residual_dist = 0 # normal
+ residual_df = 10.0
+ residual_change_prob = scenario_config.get("residual_change_prob", 0.0) if scenario_config else 0.0
+ if residual_change_prob > 0:
+ if scenario_config is None:
+ raise RuntimeError("scenario_config must be provided when residual_change_prob > 0")
+ coin_rng = np.random.RandomState(sim_seed + 7 if sim_seed is not None else None)
+ if coin_rng.random() < residual_change_prob:
+ residual_dists = scenario_config.get("residual_dists", ["heavy_tailed", "skewed"])
+ picked = coin_rng.choice(residual_dists)
+ dist_map = {"heavy_tailed": 1, "skewed": 2}
+ residual_dist = dist_map.get(picked, 0)
+ residual_df = float(scenario_config.get("residual_df", 10))
+
# Generate dependent variable with fixed effects only
y = generate_y_func(
X_expanded=X_expanded,
@@ -464,6 +497,8 @@ def _single_simulation(
heterogeneity=metadata.heterogeneity,
heteroskedasticity=metadata.heteroskedasticity,
sim_seed=sim_seed,
+ residual_dist=residual_dist,
+ residual_df=residual_df,
)
# Add cluster random effects contribution
@@ -478,12 +513,6 @@ def _single_simulation(
if re_result is not None and not np.allclose(re_result.slope_contribution, 0):
y = y + re_result.slope_contribution
- # Apply LME residual perturbations (non-normal residuals)
- if metadata.cluster_specs and scenario_config is not None:
- from ..core.scenarios import apply_lme_residual_perturbations
-
- y = apply_lme_residual_perturbations(y, scenario_config, sim_seed)
-
# Determine cluster IDs for the solver
cluster_ids: Optional[np.ndarray]
if re_result is not None:
@@ -496,23 +525,25 @@ def _single_simulation(
cluster_ids = metadata.cluster_ids_template
# Route to correct analysis method
- if cluster_ids is not None:
+ # When test_formula specifies no random effects, use OLS even if generation has clusters
+ use_lme = cluster_ids is not None and not (metadata.test_column_indices is not None and not metadata.test_has_random_effects)
+ if use_lme:
# Mixed model path (LME)
from ..stats.mixed_models import _lme_analysis_wrapper
+ assert cluster_ids is not None # narrowed by use_lme guard above
lme_result = _lme_analysis_wrapper(
- X_expanded,
+ X_test,
y,
- metadata.target_indices,
+ test_target_indices,
cluster_ids,
- metadata.cluster_column_indices,
metadata.correction_method,
self.alpha,
backend="custom",
verbose=metadata.verbose,
- chi2_crit=getattr(metadata, "lme_chi2_crit", None),
- z_crit=getattr(metadata, "lme_z_crit", None),
- correction_z_crits=getattr(metadata, "lme_correction_z_crits", None),
+ chi2_crit=metadata.lme_chi2_crit,
+ z_crit=metadata.lme_z_crit,
+ correction_z_crits=metadata.lme_correction_z_crits,
re_result=re_result,
)
@@ -539,16 +570,20 @@ def _single_simulation(
else:
# Standard OLS path
results = analyze_func(
- X_expanded,
+ X_test,
y,
- metadata.target_indices,
+ test_target_indices,
self.alpha,
metadata.correction_method,
)
diagnostics = None
- # Extract results: [f_sig, uncorr..., corr..., (wald_flag)]
- n_targets = len(metadata.target_indices)
+ # Result array layout: [F_sig, uncorrected[n_targets], corrected[n_targets], wald_flag?]
+ # - F_sig (index 0): overall model F-test significance (1.0 or 0.0)
+ # - uncorrected[1..n]: per-target t-test significance without correction
+ # - corrected[n+1..2n]: per-target significance with multiple-comparison correction
+ # - wald_flag (optional, LME only): 1.0 if Wald test was used instead of LRT
+ n_targets = len(test_target_indices)
f_significant = bool(results[0])
uncorrected = results[1 : 1 + n_targets].astype(bool)
corrected = results[1 + n_targets : 1 + 2 * n_targets].astype(bool)
@@ -560,19 +595,19 @@ def _single_simulation(
wald_flag = bool(results[expected_len])
# Post-hoc pairwise contrasts (OLS path only)
- if metadata.posthoc_specs and cluster_ids is None:
+ if metadata.posthoc_specs and not use_lme:
from ..stats.ols import compute_posthoc_contrasts
ph_uncorr, ph_corr, regular_override = compute_posthoc_contrasts(
- X_expanded,
+ X_test,
y,
metadata.posthoc_specs,
metadata.posthoc_method,
metadata.posthoc_t_crit,
metadata.posthoc_tukey_crits,
- target_indices=metadata.target_indices,
+ target_indices=test_target_indices,
correction_method=metadata.correction_method,
- correction_t_crits_combined=getattr(metadata, "posthoc_correction_t_crits_combined", None),
+ correction_t_crits_combined=metadata.posthoc_correction_t_crits_combined,
)
# If FDR/Holm combined correction was applied, override regular corrected
@@ -645,7 +680,7 @@ class SimulationMetadata:
correction_method: Encoded multiple-comparison correction
(0=none, 1=Bonferroni, 2=BH, 3=Holm).
heterogeneity: SD of random effect-size multiplier.
- heteroskedasticity: Correlation between first predictor and error SD.
+ heteroskedasticity: Correlation between predicted values and error SD.
preserve_correlation: Upload correlation mode
(``"no"``/``"partial"``/``"strict"``).
uploaded_raw_data: Normalised raw data for strict-mode bootstrap.
@@ -728,6 +763,13 @@ def __init__(
self.posthoc_method: str = "t-test"
self.posthoc_tukey_crits: Dict[str, float] = {}
self.posthoc_t_crit: float = 0.0
+ self.posthoc_correction_t_crits_combined: Optional[np.ndarray] = None
+
+ # Test formula fields (for model misspecification testing)
+ self.test_column_indices: Optional[np.ndarray] = None
+ self.test_target_indices: Optional[np.ndarray] = None
+ self.test_effect_count: Optional[int] = None # p for critical value computation
+ self.test_has_random_effects: bool = False # Whether test formula has (1|group) etc.
def _compute_fixed_effect_variance(registry) -> float:
@@ -779,8 +821,13 @@ def _compute_fixed_effect_variance(registry) -> float:
factor_info = registry._factors[factor_name]
proportions = factor_info.get("proportions")
if proportions is not None:
- # level is 1-indexed; proportions list is 0-indexed
- p_k = proportions[level - 1]
+ level_labels = factor_info.get("level_labels")
+ if level_labels is not None:
+ # String level labels — look up position by label
+ p_k = proportions[level_labels.index(str(level))]
+ else:
+ # Integer levels are 1-indexed; proportions list is 0-indexed
+ p_k = proportions[level - 1]
else:
# Equal proportions (default)
n_levels = factor_info["n_levels"]
@@ -814,6 +861,7 @@ def prepare_metadata(
model,
target_tests: List[str],
correction: Optional[str] = None,
+ test_formula_effects: Optional[List[str]] = None,
) -> SimulationMetadata:
"""
Prepare simulation metadata from model state.
@@ -825,6 +873,9 @@ def prepare_metadata(
model: MCPowerModel instance
target_tests: List of effects to test
correction: Multiple comparison correction method
+ test_formula_effects: Optional list of effect names from a test
+ formula. When provided, the metadata will include column
+ indices for subsetting X_expanded to the test model.
Returns:
SimulationMetadata instance
@@ -960,8 +1011,6 @@ def prepare_metadata(
upload_data_values=model.upload_data_values if model.upload_data_values is not None else np.zeros((2, 2), dtype=np.float64),
effect_sizes=effect_sizes,
correction_method=correction_method,
- heterogeneity=model.heterogeneity,
- heteroskedasticity=model.heteroskedasticity,
preserve_correlation=model._preserve_correlation,
uploaded_raw_data=model._uploaded_raw_data,
uploaded_var_metadata=model._uploaded_var_metadata,
@@ -982,4 +1031,20 @@ def prepare_metadata(
metadata.posthoc_specs = model._posthoc_specs
metadata.posthoc_method = "tukey" if is_tukey_correction else "t-test"
+ # Test formula column subsetting
+ if test_formula_effects is not None:
+ from ..utils.test_formula_utils import _compute_test_column_indices, _remap_target_indices
+
+ # Get all non-cluster effect names in registry order
+ all_effect_names = [name for name in registry._effects if name not in registry.cluster_effect_names]
+
+ test_col_indices = _compute_test_column_indices(all_effect_names, test_formula_effects)
+ metadata.test_column_indices = test_col_indices
+ metadata.test_effect_count = len(test_col_indices)
+
+ # Remap target indices to X_test space
+ # Only remap targets that exist in the test formula
+ valid_targets = np.array([idx for idx in target_indices if idx in test_col_indices], dtype=np.int64)
+ metadata.test_target_indices = _remap_target_indices(valid_targets, test_col_indices)
+
return metadata
diff --git a/mcpower/core/variables.py b/mcpower/core/variables.py
index e311606..338d40b 100644
--- a/mcpower/core/variables.py
+++ b/mcpower/core/variables.py
@@ -367,76 +367,42 @@ def expand_factors(self) -> None:
level_labels = factor_info.get("level_labels")
reference_level = factor_info.get("reference_level", 1)
+ # Compute non-reference levels once
if level_labels is not None:
- # Named levels: skip the reference, create dummies for the rest
- non_ref_labels = [lb for lb in level_labels if lb != str(reference_level)]
- for label in non_ref_labels:
- dummy_name = f"{factor_name}[{label}]"
-
- # Create dummy predictor
- dummy_pred = PredictorVar(
- name=dummy_name,
- var_type="factor_dummy",
- is_dummy=True,
- factor_source=factor_name,
- factor_level=label,
- column_index=col_idx,
- level_labels=level_labels,
- )
- new_predictors[dummy_name] = dummy_pred
-
- # Create main effect for dummy
- dummy_eff = Effect(
- name=dummy_name,
- effect_type="main",
- var_names=[dummy_name],
- column_index=col_idx,
- factor_source=factor_name,
- factor_level=label,
- )
- new_effects[dummy_name] = dummy_eff
-
- # Store dummy mapping
- self._factor_dummies[dummy_name] = {
- "factor_name": factor_name,
- "level": label,
- }
-
- col_idx += 1
+ non_ref = [lb for lb in level_labels if lb != str(reference_level)]
else:
- # Original integer-indexed behavior
- for level in range(2, n_levels + 1):
- dummy_name = f"{factor_name}[{level}]"
-
- # Create dummy predictor
- dummy_pred = PredictorVar(
- name=dummy_name,
- var_type="factor_dummy",
- is_dummy=True,
- factor_source=factor_name,
- factor_level=level,
- column_index=col_idx,
- )
- new_predictors[dummy_name] = dummy_pred
-
- # Create main effect for dummy
- dummy_eff = Effect(
- name=dummy_name,
- effect_type="main",
- var_names=[dummy_name],
- column_index=col_idx,
- factor_source=factor_name,
- factor_level=level,
- )
- new_effects[dummy_name] = dummy_eff
+ non_ref = list(range(2, n_levels + 1))
+
+ for level in non_ref:
+ dummy_name = f"{factor_name}[{level}]"
+
+ dummy_pred = PredictorVar(
+ name=dummy_name,
+ var_type="factor_dummy",
+ is_dummy=True,
+ factor_source=factor_name,
+ factor_level=level,
+ column_index=col_idx,
+ level_labels=level_labels if level_labels is not None else None,
+ )
+ new_predictors[dummy_name] = dummy_pred
+
+ dummy_eff = Effect(
+ name=dummy_name,
+ effect_type="main",
+ var_names=[dummy_name],
+ column_index=col_idx,
+ factor_source=factor_name,
+ factor_level=level,
+ )
+ new_effects[dummy_name] = dummy_eff
- # Store dummy mapping
- self._factor_dummies[dummy_name] = {
- "factor_name": factor_name,
- "level": level,
- }
+ self._factor_dummies[dummy_name] = {
+ "factor_name": factor_name,
+ "level": level,
+ }
- col_idx += 1
+ col_idx += 1
# Handle interactions involving factors — Cartesian product of
# non-reference dummy levels across all factor components.
@@ -503,6 +469,13 @@ def get_effect_sizes(self) -> np.ndarray:
def get_var_types(self) -> np.ndarray:
"""Get variable types as numpy array (for data generation)."""
+ # Type codes: 0-5 are parametric distributions generated from scratch.
+ # 97/98/99 are sentinel codes for uploaded-data variables whose values
+ # come from bootstrapped/quantile-matched empirical data rather than
+ # parametric generation:
+ # 97 = uploaded_factor (factor from uploaded data)
+ # 98 = uploaded_binary (binary from uploaded data)
+ # 99 = uploaded_data (continuous from uploaded data)
type_mapping = {
"normal": 0,
"binary": 1,
@@ -717,28 +690,20 @@ def register_cluster(
def _reindex_predictors(self) -> None:
"""Reindex all predictors to maintain order: non_factor | cluster_effect | dummies."""
- col_idx = 0
+ non_factor = []
+ cluster = []
+ dummies = []
- # Non-factor predictors first
- for name in sorted(self._predictors.keys(), key=lambda x: self._predictors[x].column_index or 0):
- pred = self._predictors[name]
- if not pred.is_factor and not pred.is_dummy and pred.var_type != "cluster_effect":
- pred.column_index = col_idx
- col_idx += 1
-
- # Cluster effect predictors second
- for name in sorted(self._predictors.keys(), key=lambda x: self._predictors[x].column_index or 0):
- pred = self._predictors[name]
- if pred.var_type == "cluster_effect":
- pred.column_index = col_idx
- col_idx += 1
-
- # Factor dummies last
for name in sorted(self._predictors.keys(), key=lambda x: self._predictors[x].column_index or 0):
pred = self._predictors[name]
if pred.is_dummy:
- pred.column_index = col_idx
- col_idx += 1
+ dummies.append(pred)
+ elif pred.var_type == "cluster_effect":
+ cluster.append(pred)
+ elif not pred.is_factor:
+ non_factor.append(pred)
+
+ for col_idx, pred in enumerate(non_factor + cluster + dummies):
+ pred.column_index = col_idx
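+        # e.g. with predictors x1 (continuous), school_effect (cluster effect) and
+        # group[2] (factor dummy), the resulting column order is
+        # x1 -> 0, school_effect -> 1, group[2] -> 2 (illustrative names).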
- # Update effect indices
self._update_effect_indices()
diff --git a/mcpower/model.py b/mcpower/model.py
index 4fa2c4b..9a5f813 100644
--- a/mcpower/model.py
+++ b/mcpower/model.py
@@ -123,10 +123,9 @@ def __init__(self, data_generation_formula: str):
self._pending_factor_levels: Optional[str] = None
self._pending_effects: Optional[str] = None
self._pending_correlations: Optional[Union[str, np.ndarray]] = None
- self._pending_heterogeneity: Optional[float] = None
- self._pending_heteroskedasticity: Optional[float] = None
self._pending_data: Optional[Dict[str, Any]] = None
self._pending_clusters: Dict[str, Dict] = {} # {grouping_var: {n_clusters, cluster_size, icc}}
+ self._effects_set: bool = False # True after set_effects() has been called
# Detect mixed model formula
if self._registry._random_effects_parsed:
@@ -134,8 +133,6 @@ def __init__(self, data_generation_formula: str):
# Applied state
self._applied = False
- self.heterogeneity = 0.0
- self.heteroskedasticity = 0.0
# Data storage
self.upload_normal_values: Optional[np.ndarray] = None
@@ -385,6 +382,7 @@ def set_effects(self, effects_string: str):
raise ValueError("effects_string cannot be empty")
self._pending_effects = effects_string
+ self._effects_set = True
self._applied = False
return self
@@ -432,13 +430,16 @@ def set_variable_type(self, variable_types_string: str):
- ``"normal"`` — standard normal (default).
- ``"binary"`` or ``"binary(p)"`` — Bernoulli with proportion *p*
(default 0.5).
- - ``"skewed"`` — heavy-tailed (t-distribution, df=3).
+ - ``"right_skewed"`` — positively skewed distribution.
+ - ``"left_skewed"`` — negatively skewed distribution.
+ - ``"high_kurtosis"`` — heavy-tailed (t-distribution, df=3).
+ - ``"uniform"`` — uniform distribution.
- ``"factor(k)"`` — categorical with *k* levels (creates *k-1*
dummy variables).
- ``"factor(k, p1, p2, ...)"`` — factor with custom level
proportions.
- Example: ``"x1=binary, x2=skewed, x3=factor(3)"``.
+ Example: ``"x1=binary, x2=right_skewed, x3=factor(3)"``.
Returns:
self: For method chaining.
@@ -479,62 +480,6 @@ def set_factor_levels(self, spec: str):
self._applied = False
return self
- def set_heterogeneity(self, heterogeneity: float):
- """Set heterogeneity (random variation) in effect sizes.
-
- When non-zero, each simulation draws a per-simulation effect-size
- multiplier from a normal distribution with mean 1 and the given
- standard deviation. This models uncertainty about the true effect
- size — for example, ``heterogeneity=0.1`` means effect sizes vary
- by roughly +/- 10% across simulations.
-
- This setting is deferred until ``apply()`` is called.
-
- Args:
- heterogeneity: Standard deviation of the random effect-size
- multiplier. Must be non-negative. Default is 0 (no variation).
-
- Returns:
- self: For method chaining.
-
- Raises:
- TypeError: If *heterogeneity* is not numeric.
- """
- if not isinstance(heterogeneity, (int, float)):
- raise TypeError("heterogeneity must be a number")
-
- self._pending_heterogeneity = float(heterogeneity)
- self._applied = False
- return self
-
- def set_heteroskedasticity(self, heteroskedasticity_correlation: float):
- """Set heteroskedasticity (non-constant error variance).
-
- Introduces a correlation between the first predictor's values and
- the error standard deviation, producing variance that increases (or
- decreases) with the predictor. This violates the homoskedasticity
- assumption and typically reduces power.
-
- This setting is deferred until ``apply()`` is called.
-
- Args:
- heteroskedasticity_correlation: Correlation between the first
- predictor and the error standard deviation, in the range
- [-1, 1]. Default is 0 (homoskedastic errors).
-
- Returns:
- self: For method chaining.
-
- Raises:
- TypeError: If the value is not numeric.
- """
- if not isinstance(heteroskedasticity_correlation, (int, float)):
- raise TypeError("heteroskedasticity_correlation must be a number")
-
- self._pending_heteroskedasticity = float(heteroskedasticity_correlation)
- self._applied = False
- return self
-
def set_cluster(
self,
grouping_var: str,
@@ -769,6 +714,7 @@ def upload_data(
"preserve_factor_level_names": preserve_factor_level_names,
}
self._applied = False
+ return self
def set_scenario_configs(self, configs_dict: Dict):
"""Set custom scenario configurations for robustness analysis.
@@ -786,7 +732,9 @@ def set_scenario_configs(self, configs_dict: Dict):
configs_dict: Mapping of scenario names to configuration dicts.
Each configuration may include keys such as
``"heterogeneity"``, ``"heteroskedasticity"``,
- ``"effect_size_jitter"``, and ``"distribution_jitter"``.
+ ``"correlation_noise_sd"``, and ``"distribution_change_prob"``.
+ See ``DEFAULT_SCENARIO_CONFIG`` in ``mcpower.core.scenarios``
+ for the full list of keys.
Returns:
self: For method chaining.
@@ -802,7 +750,8 @@ def set_scenario_configs(self, configs_dict: Dict):
if scenario in merged:
merged[scenario].update(config)
else:
- merged[scenario] = config
+ # New custom scenarios inherit all keys from optimistic baseline
+ merged[scenario] = {**DEFAULT_SCENARIO_CONFIG["optimistic"], **config}
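+                # e.g. (illustrative) a custom {"pessimistic": {"heterogeneity": 0.3}}
+                # entry gains every other key (heteroskedasticity,
+                # correlation_noise_sd, distribution_change_prob, ...) from the
+                # optimistic baseline, so downstream code can rely on all keys existing.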
self._scenario_configs = merged
print(f"Custom scenario configs set: {', '.join(configs_dict.keys())}")
@@ -812,7 +761,7 @@ def set_scenario_configs(self, configs_dict: Dict):
# Apply method (processes all pending settings)
# =========================================================================
- def apply(self):
+ def _apply(self):
"""
Apply all pending settings to the model.
@@ -857,16 +806,22 @@ def apply(self):
# 7. Apply correlations
self._apply_correlations(_parser)
- # 8. Apply heterogeneity/heteroskedasticity
- self._apply_heterogeneity()
-
- # 9. Validate model is ready
+ # 8. Validate model is ready
model_result = _validate_model_ready(self)
model_result.raise_if_invalid()
- # Invalidate effect plan cache when settings change (Phase 2 optimization)
+        # Invalidate the effect plan cache: _apply() rebuilds the variable
+        # registry state, so any cached column mappings are now stale.
self._effect_plan_cache = None
+ # Clear pending state to prevent double-application
+ self._pending_variable_types = None
+ self._pending_factor_levels = None
+ self._pending_effects = None
+ self._pending_correlations = None
+ self._pending_data = None
+ self._pending_clusters = {}
+
self._applied = True
print("Model settings applied successfully")
return self
@@ -1024,6 +979,21 @@ def _apply_data(self):
# Extract matched data
matched_data = data[:, matched_indices]
+ # Reject NaN values early
+        try:
+            numeric_view = matched_data.astype(np.float64)
+        except (ValueError, TypeError):
+            # Object dtype columns (strings) can't be converted to float here;
+            # the NaN check for numeric columns happens after string encoding below.
+            numeric_view = None
+        if numeric_view is not None and np.isnan(numeric_view).any():
+            nan_cols = [matched_columns[i] for i in range(numeric_view.shape[1]) if np.isnan(numeric_view[:, i]).any()]
+            raise ValueError(
+                f"Uploaded data contains NaN values in columns: {', '.join(nan_cols)}. "
+                f"Remove or impute missing values before uploading."
+            )
+
# Convert to float64 if object dtype (common with mixed-type DataFrames)
# String columns are encoded to integer indices; mapping is stored in string_col_indices
string_col_indices = {}
@@ -1178,11 +1148,7 @@ def _apply_data_normal_mode(self, data, columns, type_info, mode, data_types_ove
level_labels = info.get("level_labels")
# Determine reference from data_types tuple override
- reference_level = None
- if col in data_types_override:
- dt = data_types_override[col]
- if isinstance(dt, tuple) and len(dt) == 2:
- reference_level = str(dt[1])
+ reference_level = self._extract_reference_level(data_types_override, col)
# Calculate proportions for each level
proportions = []
@@ -1200,7 +1166,10 @@ def _apply_data_normal_mode(self, data, columns, type_info, mode, data_types_ove
else: # continuous
# Normalize: mean=0, sd=1
- normalized = (col_data - np.mean(col_data)) / np.std(col_data, ddof=1)
+ std = np.std(col_data, ddof=1)
+ if std < 1e-15:
+ raise ValueError(f"Column '{col}' has zero variance (constant value). Remove it from the model or check your data.")
+ normalized = (col_data - np.mean(col_data)) / std
# Create lookup tables (type 99)
normal_vals, uploaded_vals = create_uploaded_lookup_tables(normalized.reshape(-1, 1))
@@ -1324,11 +1293,7 @@ def _apply_data_strict_mode(self, data, columns, type_info, data_types_override=
level_labels = info.get("level_labels")
# Determine reference from data_types tuple override
- reference_level = None
- if col in data_types_override:
- dt = data_types_override[col]
- if isinstance(dt, tuple) and len(dt) == 2:
- reference_level = str(dt[1])
+ reference_level = self._extract_reference_level(data_types_override, col)
self._uploaded_var_metadata[col] = {
"type": "factor",
@@ -1355,7 +1320,10 @@ def _apply_data_strict_mode(self, data, columns, type_info, data_types_override=
continuous_cols.append(idx)
# Normalize
col_data = data[:, idx]
- normalized_data[:, idx] = (col_data - np.mean(col_data)) / np.std(col_data, ddof=1)
+ std = np.std(col_data, ddof=1)
+ if std < 1e-15:
+ raise ValueError(f"Column '{col}' has zero variance (constant value). Remove it from the model or check your data.")
+ normalized_data[:, idx] = (col_data - np.mean(col_data)) / std
self._uploaded_var_metadata[col] = {
"type": "continuous",
@@ -1481,22 +1449,6 @@ def _apply_correlations(self, _parser):
self._registry.set_correlation_matrix(correlations_input)
print("Correlation matrix set")
- def _apply_heterogeneity(self):
- """Validate and apply pending heterogeneity and heteroskedasticity settings."""
- if self._pending_heterogeneity is not None:
- if self._pending_heterogeneity < 0:
- raise ValueError("heterogeneity must be non-negative")
- self.heterogeneity = self._pending_heterogeneity
- if self.heterogeneity > 0:
- print(f"Heterogeneity: SD = {self.heterogeneity}")
-
- if self._pending_heteroskedasticity is not None:
- if not -1 <= self._pending_heteroskedasticity <= 1:
- raise ValueError("heteroskedasticity_correlation must be between -1 and 1")
- self.heteroskedasticity = self._pending_heteroskedasticity
- if abs(self.heteroskedasticity) > 1e-8:
- print(f"Heteroskedasticity: correlation = {self.heteroskedasticity}")
-
# =========================================================================
# Analysis methods
# =========================================================================
@@ -1507,7 +1459,7 @@ def find_power(
target_test: str = "all",
correction: Optional[str] = None,
print_results: bool = True,
- scenarios: bool = False,
+ scenarios: Union[bool, List[str]] = False,
summary: str = "short",
return_results: bool = False,
test_formula: str = "",
@@ -1529,12 +1481,16 @@ def find_power(
Duplicate tests raise ``ValueError``.
correction: Multiple comparison correction (None, "bonferroni", "benjamini-hochberg", "holm")
print_results: Whether to print results
- scenarios: Run scenario analysis
+ scenarios: Scenario analysis control:
+ - ``False`` (default): no scenario analysis.
+ - ``True``: run all configured scenarios.
+ - List of scenario names: run only the specified scenarios
+ (e.g. ``["optimistic", "extreme"]``). Case-insensitive.
summary: Output detail level ("short" or "long")
return_results: Return results dict
test_formula: Formula for statistical testing (default: use data generation formula).
If the formula contains random effects like (1|school), analysis switches to
- mixed model testing (not yet implemented).
+ mixed model testing.
progress_callback: Progress reporting control:
- ``None`` (default): auto-use ``PrintReporter`` when
*print_results* is ``True``.
@@ -1549,7 +1505,10 @@ def find_power(
"""
# Auto-apply if settings have changed
if not self._applied:
- self.apply()
+ self._apply()
+
+ # Resolve scenarios parameter
+ scenario_filter = self._resolve_scenarios(scenarios)
# Validate sample size (basic: >= 20, type check)
_validate_sample_size(sample_size).raise_if_invalid()
@@ -1558,9 +1517,6 @@ def find_power(
n_variables = len(self._registry.effect_names)
_validate_sample_size_for_model(sample_size, n_variables).raise_if_invalid()
- # Validate and adjust cluster sample sizes
- self._validate_cluster_sample_size(sample_size)
-
# Warn if sample size is much larger than uploaded data
if self._uploaded_data_n > 0 and sample_size > 3 * self._uploaded_data_n:
print(
@@ -1570,33 +1526,13 @@ def find_power(
)
self._validate_analysis_inputs(correction)
- resolved_test_formula = self._resolve_test_formula(test_formula)
- target_tests = self._parse_target_tests(target_test)
-
- if correction and correction.lower() == "tukey" and not self._posthoc_specs:
- raise ValueError(
- "Tukey correction requires at least one post-hoc comparison "
- "(e.g., target_test='group[0] vs group[1]'). "
- "Tukey HSD only applies to pairwise contrasts between factor levels."
- )
-
- # Resolve progress callback
- from .progress import PrintReporter, ProgressReporter, compute_total_simulations
-
- if progress_callback is None:
- effective_cb = PrintReporter() if print_results else None
- elif progress_callback is False:
- effective_cb = None
- else:
- effective_cb = progress_callback
+ resolved_test_formula, test_formula_effects, test_random_effects = self._resolve_test_formula(test_formula)
+ target_tests = self._parse_target_tests(target_test, test_formula_effects=test_formula_effects)
+ self._validate_tukey_posthoc(correction)
- reporter = None
- if effective_cb is not None:
- n_scenarios = (len(self._scenario_configs or DEFAULT_SCENARIO_CONFIG) + 1) if scenarios else 1
- total = compute_total_simulations(self._effective_n_simulations, 1, n_scenarios)
- reporter = ProgressReporter(total, effective_cb)
+ reporter = self._resolve_progress(progress_callback, print_results, scenario_filter)
- if scenarios:
+ if scenario_filter is not None:
result = self._run_scenario_analysis(
"power",
sample_size=sample_size,
@@ -1605,8 +1541,11 @@ def find_power(
summary=summary,
print_results=print_results,
test_formula=resolved_test_formula,
+ test_formula_effects=test_formula_effects,
+ test_random_effects=test_random_effects,
progress=reporter,
cancel_check=cancel_check,
+ scenario_filter=scenario_filter,
)
else:
if reporter is not None:
@@ -1615,7 +1554,10 @@ def find_power(
sample_size,
target_tests,
correction,
+ scenario_config=DEFAULT_SCENARIO_CONFIG["optimistic"],
test_formula=resolved_test_formula,
+ test_formula_effects=test_formula_effects,
+ test_random_effects=test_random_effects,
progress=reporter,
cancel_check=cancel_check,
)
@@ -1623,7 +1565,7 @@ def find_power(
if reporter is not None:
reporter.finish()
- if not scenarios and print_results:
+ if scenario_filter is None and print_results:
print(f"\n{'=' * 80}")
print("MONTE CARLO POWER ANALYSIS RESULTS")
print(f"{'=' * 80}")
@@ -1641,7 +1583,7 @@ def find_sample_size(
by: int = 5,
correction: Optional[str] = None,
print_results: bool = True,
- scenarios: bool = False,
+ scenarios: Union[bool, List[str]] = False,
summary: str = "short",
return_results: bool = False,
test_formula: str = "",
@@ -1659,12 +1601,16 @@ def find_sample_size(
by: Step size between sample sizes
correction: Multiple comparison correction
print_results: Whether to print results
- scenarios: Run scenario analysis
+ scenarios: Scenario analysis control:
+ - ``False`` (default): no scenario analysis.
+ - ``True``: run all configured scenarios.
+ - List of scenario names: run only the specified scenarios
+ (e.g. ``["optimistic", "extreme"]``). Case-insensitive.
summary: Output detail level
return_results: Return results dict
test_formula: Formula for statistical testing (default: use data generation formula).
If the formula contains random effects like (1|school), analysis switches to
- mixed model testing (not yet implemented).
+ mixed model testing.
progress_callback: Progress reporting control:
- ``None`` (default): auto-use ``PrintReporter`` when
*print_results* is ``True``.
@@ -1680,7 +1626,10 @@ def find_sample_size(
"""
# Auto-apply if settings have changed
if not self._applied:
- self.apply()
+ self._apply()
+
+ # Resolve scenarios parameter
+ scenario_filter = self._resolve_scenarios(scenarios)
# Validate from_size meets minimum requirements
_validate_sample_size(from_size).raise_if_invalid()
@@ -1696,40 +1645,20 @@ def find_sample_size(
)
self._validate_analysis_inputs(correction)
- resolved_test_formula = self._resolve_test_formula(test_formula)
+ resolved_test_formula, test_formula_effects, test_random_effects = self._resolve_test_formula(test_formula)
validation_result = _validate_sample_size_range(from_size, to_size, by)
for warning in validation_result.warnings:
print(f"Warning: {warning}")
validation_result.raise_if_invalid()
- target_tests = self._parse_target_tests(target_test)
-
- if correction and correction.lower() == "tukey" and not self._posthoc_specs:
- raise ValueError(
- "Tukey correction requires at least one post-hoc comparison "
- "(e.g., target_test='group[0] vs group[1]'). "
- "Tukey HSD only applies to pairwise contrasts between factor levels."
- )
+ target_tests = self._parse_target_tests(target_test, test_formula_effects=test_formula_effects)
+ self._validate_tukey_posthoc(correction)
sample_sizes = list(range(from_size, to_size + 1, by))
- # Resolve progress callback
- from .progress import PrintReporter, ProgressReporter, compute_total_simulations
-
- if progress_callback is None:
- effective_cb = PrintReporter() if print_results else None
- elif progress_callback is False:
- effective_cb = None
- else:
- effective_cb = progress_callback
-
- reporter = None
- if effective_cb is not None:
- n_scenarios = (len(self._scenario_configs or DEFAULT_SCENARIO_CONFIG) + 1) if scenarios else 1
- total = compute_total_simulations(self._effective_n_simulations, len(sample_sizes), n_scenarios)
- reporter = ProgressReporter(total, effective_cb)
+ reporter = self._resolve_progress(progress_callback, print_results, scenario_filter, n_sample_sizes=len(sample_sizes))
- if scenarios:
+ if scenario_filter is not None:
result = self._run_scenario_analysis(
"sample_size",
target_tests=target_tests,
@@ -1738,8 +1667,11 @@ def find_sample_size(
summary=summary,
print_results=print_results,
test_formula=resolved_test_formula,
+ test_formula_effects=test_formula_effects,
+ test_random_effects=test_random_effects,
progress=reporter,
cancel_check=cancel_check,
+ scenario_filter=scenario_filter,
)
else:
if reporter is not None:
@@ -1748,7 +1680,10 @@ def find_sample_size(
sample_sizes,
target_tests,
correction,
+ scenario_config=DEFAULT_SCENARIO_CONFIG["optimistic"],
test_formula=resolved_test_formula,
+ test_formula_effects=test_formula_effects,
+ test_random_effects=test_random_effects,
progress=reporter,
cancel_check=cancel_check,
)
@@ -1756,7 +1691,7 @@ def find_sample_size(
if reporter is not None:
reporter.finish()
- if not scenarios and print_results:
+ if scenario_filter is None and print_results:
print(f"\n{'=' * 80}")
print("SAMPLE SIZE ANALYSIS RESULTS")
print(f"{'=' * 80}")
@@ -1780,6 +1715,8 @@ def _generate_dependent_variable(
heterogeneity: float = 0.0,
heteroskedasticity: float = 0.0,
sim_seed: Optional[int] = None,
+ residual_dist: int = 0,
+ residual_df: float = 10.0,
) -> np.ndarray:
"""Generate the dependent variable as y = X @ beta + error via the active backend."""
return get_backend().generate_y(
@@ -1788,19 +1725,103 @@ def _generate_dependent_variable(
heterogeneity,
heteroskedasticity,
sim_seed if sim_seed is not None else -1,
+ residual_dist,
+ residual_df,
)
# =========================================================================
# Internal methods
# =========================================================================
+ @staticmethod
+ def _extract_reference_level(data_types_override, col):
+ """Extract reference level from data_types_override tuple for a column."""
+ dt = data_types_override.get(col)
+ if isinstance(dt, tuple) and len(dt) == 2:
+ return str(dt[1])
+ return None
+
+ def _resolve_scenarios(self, scenarios: Union[bool, List[str]]) -> Optional[List[str]]:
+ """Resolve the scenarios parameter into a list of scenario names or None.
+
+ Args:
+ scenarios: ``False`` for no scenarios, ``True`` for all configured
+ scenarios, or a list of scenario names (case-insensitive).
+
+ Returns:
+ List of validated, lowercase scenario names, or ``None`` if
+ scenarios are disabled.
+
+ Raises:
+ ValueError: If any requested scenario name is not configured.
+ TypeError: If *scenarios* is not ``bool`` or a list of strings.
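+
+        Example (illustrative, assuming "optimistic" and "doomer" are configured)::
+
+            self._resolve_scenarios(["Optimistic", "DOOMER"])  # -> ["optimistic", "doomer"]
+            self._resolve_scenarios(False)                     # -> None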
+ """
+ if scenarios is False:
+ return None
+
+ all_configs = self._scenario_configs or DEFAULT_SCENARIO_CONFIG
+ available = set(all_configs.keys())
+
+ if scenarios is True:
+ return list(all_configs.keys())
+
+ if not isinstance(scenarios, list):
+ raise TypeError(f"scenarios must be True, False, or a list of scenario names, got {type(scenarios).__name__}")
+
+ # Case-insensitive matching
+ available_lower = {k.lower(): k for k in available}
+ resolved = []
+ invalid = []
+ for name in scenarios:
+ if not isinstance(name, str):
+ raise TypeError(f"Scenario names must be strings, got {type(name).__name__}")
+ key = available_lower.get(name.lower())
+ if key is None:
+ invalid.append(name)
+ else:
+ resolved.append(key)
+
+ if invalid:
+ raise ValueError(f"Unknown scenario(s): {', '.join(repr(n) for n in invalid)}. Available: {', '.join(sorted(available))}")
+
+ return resolved
+
+ def _resolve_progress(self, progress_callback, print_results, scenario_filter, n_sample_sizes=1):
+ """Resolve progress_callback into a ProgressReporter or None."""
+ from .progress import PrintReporter, ProgressReporter, compute_total_simulations
+
+ if progress_callback is None:
+ effective_cb = PrintReporter() if print_results else None
+ elif progress_callback is False:
+ effective_cb = None
+ else:
+ effective_cb = progress_callback
+
+ if effective_cb is None:
+ return None
+
+ n_scenarios = len(scenario_filter) if scenario_filter is not None else 1
+ total = compute_total_simulations(self._effective_n_simulations, n_sample_sizes, n_scenarios)
+ return ProgressReporter(total, effective_cb)
+
def _validate_analysis_inputs(self, correction):
"""Validate the multiple-comparison correction method before analysis."""
result = _validate_correction_method(correction)
result.raise_if_invalid()
+ def _validate_tukey_posthoc(self, correction):
+ """Raise if Tukey correction is requested without posthoc specs."""
+ if correction and correction.lower() == "tukey" and not self._posthoc_specs:
+ raise ValueError(
+ "Tukey correction requires at least one post-hoc comparison "
+ "(e.g., target_test='group[0] vs group[1]'). "
+ "Tukey HSD only applies to pairwise contrasts between factor levels."
+ )
+
def _validate_cluster_sample_size(self, sample_size: int):
"""Derive missing cluster dimensions from sample_size and validate minimums."""
+ # NOTE: This method both validates AND mutates — it derives missing
+ # cluster_size/n_clusters from sample_size before checking minimums.
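+        # e.g. sample_size=100 with n_clusters=12 derives cluster_size=8, so the
+        # actual simulated N becomes 96 and the mismatch is reported below.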
if not self._registry.cluster_names:
return # No clusters, nothing to do
@@ -1811,10 +1832,12 @@ def _validate_cluster_sample_size(self, sample_size: int):
if spec.n_clusters is not None:
spec.cluster_size = sample_size // spec.n_clusters
else:
- assert spec.cluster_size is not None
+ if spec.cluster_size is None:
+ raise RuntimeError(f"Cluster '{gv}': either n_clusters or cluster_size must be set")
spec.n_clusters = sample_size // spec.cluster_size
- assert spec.n_clusters is not None and spec.cluster_size is not None
+ if spec.n_clusters is None or spec.cluster_size is None:
+ raise RuntimeError(f"Cluster '{gv}': failed to derive n_clusters and cluster_size from sample_size={sample_size}")
actual_n = spec.n_clusters * spec.cluster_size
if actual_n != sample_size:
print(
@@ -1825,7 +1848,7 @@ def _validate_cluster_sample_size(self, sample_size: int):
_validate_cluster_sample_size(sample_size, spec.n_clusters, spec.cluster_size).raise_if_invalid()
- def _parse_target_tests(self, target_test: Union[str, List[str]]) -> List[str]:
+ def _parse_target_tests(self, target_test: Union[str, List[str]], test_formula_effects: Optional[List[str]] = None) -> List[str]:
"""Parse a target_test argument into a list of effect names to test.
Supports regular effect names (e.g. ``"x1"``, ``"overall"``),
@@ -1875,7 +1898,10 @@ def _parse_target_tests(self, target_test: Union[str, List[str]]) -> List[str]:
cluster_effects = self._registry.cluster_effect_names
if "all" in keywords:
- fixed_effects = [e for e in self._registry.effect_names if e not in cluster_effects]
+ if test_formula_effects is not None:
+ fixed_effects = [e for e in test_formula_effects if e not in cluster_effects]
+ else:
+ fixed_effects = [e for e in self._registry.effect_names if e not in cluster_effects]
keyword_expansion += ["overall"] + fixed_effects
if "all-posthoc" in keywords:
@@ -1929,6 +1955,17 @@ def _parse_target_tests(self, target_test: Union[str, List[str]]) -> List[str]:
"(e.g. 'all'), do not also list tests that are already included."
)
+ # -- Phase 7b: Validate explicit tests against test formula ----------------
+ if test_formula_effects is not None:
+ test_formula_set = set(test_formula_effects)
+ for test in expanded:
+ if " vs " in test or test == "overall":
+ continue
+ if test not in test_formula_set:
+ raise ValueError(
+ f"Target test '{test}' is not in the test formula. Available effects: {', '.join(test_formula_effects)}"
+ )
+
# -- Phase 8: Parse posthoc specs + validate ------------------------------
regular_tests: list[str] = []
posthoc_specs: list[PostHocSpec] = []
@@ -1982,6 +2019,8 @@ def _parse_target_tests(self, target_test: Union[str, List[str]]) -> List[str]:
# User level k (k≥2) = dummy factor[k]
effect_order = list(self._registry._effects.keys())
+ # Returns None for the reference level, which is absorbed into the
+ # intercept in dummy coding and has no dedicated design matrix column.
def _level_to_col(factor_name, user_level, _effect_order=effect_order):
factor_info = self._registry._factors[factor_name]
reference = factor_info.get("reference_level", 1)
@@ -2069,30 +2108,60 @@ def _create_X_extended(self, X):
return np.column_stack(columns) if columns else np.empty((X.shape[0], 0))
- def _prepare_metadata(self, target_tests, correction=None):
+ def _prepare_metadata(self, target_tests, correction=None, test_formula_effects=None):
"""Pre-compute all static simulation metadata from the current model state."""
- return prepare_metadata(self, target_tests, correction)
+ return prepare_metadata(self, target_tests, correction, test_formula_effects=test_formula_effects)
- def _resolve_test_formula(self, test_formula: str) -> str:
- """Resolve test formula and update _test_method accordingly.
+ def _resolve_test_formula(self, test_formula: str):
+ """Resolve test formula, validate, parse, and update _test_method.
- Returns the resolved formula string.
+ Returns:
+ Tuple of (formula_string, test_effect_names, random_effects).
+ test_effect_names is None when test_formula is empty (use generation formula).
"""
+ from .utils.parsers import _parse_equation
+
if not test_formula:
resolved = self._registry.equation
- else:
- resolved = test_formula
+ _, _, random_effects = _parse_equation(resolved)
+ if random_effects:
+ self._test_method = "mixed_model"
+ else:
+ self._test_method = "linear_regression"
+ return resolved, None, []
- from .utils.parsers import _parse_equation
+ # Validate test formula variables exist in the model
+ from .utils.validators import _validate_test_formula
- _, _, random_effects = _parse_equation(resolved)
+ available_vars = (
+ [self._registry.dependent] + self._registry.non_factor_names + self._registry.factor_names + self._registry.cluster_names
+ )
+ validation = _validate_test_formula(test_formula, available_vars)
+ validation.raise_if_invalid()
+
+ # Parse test formula to get effects and random effects
+ from .utils.test_formula_utils import _extract_test_formula_effects
+
+ test_effects, random_effects = _extract_test_formula_effects(test_formula, self._registry)
+
+ if not test_effects:
+ raise ValueError(f"test_formula '{test_formula}' contains no testable effects from the data generation model.")
+
+        # Reject random effects in the test formula when the generation model has
+        # no cluster structure (there is no cluster data to fit a mixed model to)
+ if random_effects and not self._registry._cluster_specs:
+ grouping_vars = [re["grouping_var"] for re in random_effects]
+ raise ValueError(
+ f"test_formula contains random effects ({grouping_vars}) but the "
+ f"data generation model has no cluster structure. Cannot fit a "
+ f"mixed model to data without clusters."
+ )
if random_effects:
self._test_method = "mixed_model"
else:
self._test_method = "linear_regression"
- return resolved
+ return test_formula, test_effects, random_effects
def _run_find_power(
self,
@@ -2101,6 +2170,8 @@ def _run_find_power(
correction,
scenario_config=None,
test_formula=None,
+ test_formula_effects=None,
+ test_random_effects=None,
progress=None,
cancel_check=None,
):
@@ -2109,13 +2180,15 @@ def _run_find_power(
self._validate_cluster_sample_size(sample_size)
# Route based on test method (routing logic handled in simulation.py)
- metadata = self._prepare_metadata(target_tests, correction)
+ metadata = self._prepare_metadata(target_tests, correction, test_formula_effects)
- if scenario_config:
- metadata.heterogeneity = scenario_config["heterogeneity"]
- metadata.heteroskedasticity = scenario_config["heteroskedasticity"]
- if metadata.cluster_specs:
- metadata.lme_scenario_config = scenario_config
+ # Set the random effects flag for test formula
+ if test_random_effects:
+ metadata.test_has_random_effects = True
+
+        # scenario_config is always a dict (DEFAULT_SCENARIO_CONFIG["optimistic"]
+        # baseline or a scenario-specific config)
+ metadata.heterogeneity = scenario_config["heterogeneity"]
+ metadata.heteroskedasticity = scenario_config["heteroskedasticity"]
runner = SimulationRunner(
n_simulations=self._effective_n_simulations,
@@ -2127,9 +2200,15 @@ def _run_find_power(
)
# Compute critical values once before the simulation loop
- p = len(metadata.effect_sizes)
+ # Use test formula's effect count for critical values when subsetting
+ if metadata.test_column_indices is not None:
+ p = metadata.test_effect_count
+ n_targets = len(metadata.test_target_indices)
+ else:
+ p = len(metadata.effect_sizes)
+ n_targets = len(metadata.target_indices)
+
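+        # e.g. a generation model with 5 effects tested against a 3-effect
+        # test_formula uses p = 3 here, so dof = sample_size - 3 - 1 (illustrative).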
dof = sample_size - p - 1
- n_targets = len(metadata.target_indices)
n_posthoc = len(metadata.posthoc_specs)
if n_posthoc > 0 and metadata.posthoc_method == "t-test":
@@ -2185,7 +2264,7 @@ def analyze_func(X, y, indices, alpha, correction):
analyze_func=analyze_func,
create_X_extended_func=self._create_X_extended,
scenario_config=scenario_config,
- apply_perturbations_func=(apply_per_simulation_perturbations if scenario_config else None),
+ apply_perturbations_func=apply_per_simulation_perturbations,
progress=progress,
cancel_check=cancel_check,
)
@@ -2193,11 +2272,17 @@ def analyze_func(X, y, indices, alpha, correction):
if not sim_results:
return {}
+ # When test formula is active, filter target_tests to only effects in the test model
+ effective_target_tests = target_tests
+ if test_formula_effects is not None:
+ test_effect_set = set(test_formula_effects)
+ effective_target_tests = [t for t in target_tests if t == "overall" or t in test_effect_set]
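+            # e.g. target_tests ["overall", "x1", "x3"] with test_formula_effects
+            # ["x1", "x2"] filters down to ["overall", "x1"] (illustrative).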
+
processor = ResultsProcessor(target_power=self.power)
power_results = processor.calculate_powers(
sim_results["all_results"],
sim_results["all_results_corrected"],
- target_tests,
+ effective_target_tests,
)
# Add n_simulations_failed to power_results
@@ -2207,13 +2292,13 @@ def analyze_func(X, y, indices, alpha, correction):
# Tukey correction only applies to pairwise contrasts; NaN-ify others
if correction and correction.lower() == "tukey" and power_results.get("individual_powers_corrected"):
posthoc_labels = {s.label for s in self._posthoc_specs}
- for test in target_tests:
+ for test in effective_target_tests:
if test not in posthoc_labels:
power_results["individual_powers_corrected"][test] = float("nan")
return build_power_result(
model_type=self.model_type,
- target_tests=target_tests,
+ target_tests=effective_target_tests,
formula_to_test=test_formula,
equation=self.equation,
sample_size=sample_size,
@@ -2243,12 +2328,15 @@ def _run_sample_size_analysis(
correction,
scenario_config=None,
test_formula=None,
+ test_formula_effects=None,
+ test_random_effects=None,
progress=None,
cancel_check=None,
):
"""Iterate over sample sizes, running power analysis for each."""
from .progress import SimulationCancelled
+ use_sequential = True
if self._is_parallel_effective():
from joblib import Parallel, delayed
@@ -2258,7 +2346,18 @@ def _run_sample_size_analysis(
backend="loky",
verbose=0,
return_as="generator",
- )(delayed(self._run_find_power)(ss, target_tests, correction, scenario_config, test_formula) for ss in sample_sizes)
+ )(
+ delayed(self._run_find_power)(
+ ss,
+ target_tests,
+ correction,
+ scenario_config,
+ test_formula,
+ test_formula_effects,
+ test_random_effects,
+ )
+ for ss in sample_sizes
+ )
results = []
for ss, result in zip(sample_sizes, power_results, strict=False):
if cancel_check is not None and cancel_check():
@@ -2266,25 +2365,13 @@ def _run_sample_size_analysis(
results.append((ss, result))
if progress is not None:
progress.advance(self._effective_n_simulations)
+ use_sequential = False
except Exception as e:
if isinstance(e, SimulationCancelled):
raise
print(f"Warning: Parallel execution failed ({e}). Falling back to sequential.")
- results = []
- for ss in sample_sizes:
- if cancel_check is not None and cancel_check():
- raise SimulationCancelled("Simulation cancelled by user") from None
- result = self._run_find_power(
- ss,
- target_tests,
- correction,
- scenario_config,
- test_formula,
- progress=progress,
- cancel_check=cancel_check,
- )
- results.append((ss, result))
- else:
+
+ if use_sequential:
results = []
for sample_size in sample_sizes:
if cancel_check is not None and cancel_check():
@@ -2295,19 +2382,28 @@ def _run_sample_size_analysis(
correction,
scenario_config,
test_formula,
+ test_formula_effects=test_formula_effects,
+ test_random_effects=test_random_effects,
progress=progress,
cancel_check=cancel_check,
)
results.append((sample_size, power_result))
processor = ResultsProcessor(target_power=self.power)
- analysis_results = processor.process_sample_size_results(results, target_tests, correction)
+ # Filter target_tests to match test formula effects
+ if test_formula_effects is not None:
+ test_set = set(test_formula_effects)
+ effective_target_tests = [t for t in target_tests if t in test_set or t == "overall"]
+ else:
+ effective_target_tests = target_tests
+
+ analysis_results = processor.process_sample_size_results(results, effective_target_tests, correction)
# Tukey correction only applies to pairwise contrasts; NaN-ify others
if correction and correction.lower() == "tukey":
posthoc_labels = {s.label for s in self._posthoc_specs}
if analysis_results.get("powers_by_test_corrected"):
- for test in target_tests:
+ for test in effective_target_tests:
if test not in posthoc_labels:
n_points = len(analysis_results["powers_by_test_corrected"][test])
analysis_results["powers_by_test_corrected"][test] = [float("nan")] * n_points
@@ -2315,7 +2411,7 @@ def _run_sample_size_analysis(
return build_sample_size_result(
model_type=self.model_type,
- target_tests=target_tests,
+ target_tests=effective_target_tests,
formula_to_test=test_formula,
equation=self.equation,
sample_sizes=sample_sizes,
@@ -2331,9 +2427,16 @@ def _run_scenario_analysis(self, analysis_type, **kwargs):
"""Delegate to ScenarioRunner for multi-scenario power or sample-size analysis."""
from functools import partial
- configs = self._scenario_configs or DEFAULT_SCENARIO_CONFIG
+ all_configs = self._scenario_configs or DEFAULT_SCENARIO_CONFIG
+ scenario_filter = kwargs.pop("scenario_filter", None)
+ if scenario_filter is not None:
+ configs = {k: all_configs[k] for k in scenario_filter}
+ else:
+ configs = all_configs
scenario_runner = ScenarioRunner(self, configs)
test_formula = kwargs.get("test_formula")
+ test_formula_effects = kwargs.get("test_formula_effects")
+ test_random_effects = kwargs.get("test_random_effects")
progress = kwargs.get("progress")
cancel_check = kwargs.get("cancel_check")
@@ -2341,6 +2444,8 @@ def _run_scenario_analysis(self, analysis_type, **kwargs):
run_power_func = partial(
self._run_find_power,
test_formula=test_formula,
+ test_formula_effects=test_formula_effects,
+ test_random_effects=test_random_effects,
progress=progress,
cancel_check=cancel_check,
)
@@ -2357,6 +2462,8 @@ def _run_scenario_analysis(self, analysis_type, **kwargs):
run_ss_func = partial(
self._run_sample_size_analysis,
test_formula=test_formula,
+ test_formula_effects=test_formula_effects,
+ test_random_effects=test_random_effects,
progress=progress,
cancel_check=cancel_check,
)
diff --git a/mcpower/progress.py b/mcpower/progress.py
index e733148..dca3c25 100644
--- a/mcpower/progress.py
+++ b/mcpower/progress.py
@@ -87,10 +87,7 @@ def __init__(self, **tqdm_kwargs):
self._bar = None
def __call__(self, current: int, total: int):
- try:
- from tqdm import tqdm
- except ImportError:
- raise ImportError("tqdm is required for TqdmReporter. Install with: pip install tqdm") from None
+ from tqdm import tqdm
if self._bar is None:
self._bar = tqdm(total=total, unit="sim", **self._tqdm_kwargs)
diff --git a/mcpower/stats/data_generation.py b/mcpower/stats/data_generation.py
index 0d8800c..3c46d89 100644
--- a/mcpower/stats/data_generation.py
+++ b/mcpower/stats/data_generation.py
@@ -23,7 +23,6 @@
SKEW_STD = np.sqrt(np.exp(2) - np.exp(1))
NORM_SCALE = (DIST_RESOLUTION - 1) / (NORM_RANGE[1] - NORM_RANGE[0])
PERC_SCALE = (DIST_RESOLUTION - 1) / (PERCENTILE_RANGE[1] - PERCENTILE_RANGE[0])
-FLOAT_NEAR_ZERO = 1e-15
# Global lookup tables
NORM_CDF_TABLE = None
@@ -58,13 +57,12 @@ def _compute_t3_sd():
Replicates the vectorised norm-CDF -> t(3)-PPF lookup chain on a large
fixed-seed sample to get a stable SD estimate.
"""
- assert NORM_CDF_TABLE is not None
- assert T3_PPF_TABLE is not None
+ if NORM_CDF_TABLE is None or T3_PPF_TABLE is None:
+ raise RuntimeError("Distribution tables not initialized — _init_tables() must be called first")
- rng_state = np.random.get_state()
- np.random.seed(999999)
- z = np.random.standard_normal(200000)
- np.random.set_state(rng_state)
+ # Use a local RNG to avoid affecting the global state and to be thread-safe.
+ rng = np.random.RandomState(999999)
+ z = rng.standard_normal(200000)
# Step 1: Normal CDF lookup (z -> percentile)
z_clipped = np.clip(z, NORM_RANGE[0], NORM_RANGE[1])
@@ -107,9 +105,16 @@ def create_uploaded_lookup_tables(
for var_idx in range(n_vars):
data = data_matrix[:, var_idx]
- normalized = (data - np.mean(data)) / np.std(data)
+ std = np.std(data)
+ if std < 1e-15:
+ raise ValueError(
+ f"Variable at index {var_idx} has zero variance (constant value). Remove it from the model or check your data."
+ )
+ normalized = (data - np.mean(data)) / std
sorted_uploaded = np.sort(normalized)
+ # Weibull plotting positions: i/(n+1) avoids 0 and 1, which would map
+ # to -inf/+inf under the normal PPF, giving well-behaved quantiles.
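+    # For example, with n_samples = 4 these positions are [0.2, 0.4, 0.6, 0.8],
+    # so norm_ppf_array stays finite at both ends.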
percentiles = np.linspace(1 / (n_samples + 1), n_samples / (n_samples + 1), n_samples)
normal_quantiles = norm_ppf_array(percentiles)
@@ -126,13 +131,12 @@ def _generate_factors(sample_size, factor_specs, seed):
Args:
sample_size: Number of observations
factor_specs: List of {'n_levels': int, 'proportions': [float, ...]}
- seed: Random seed
+ seed: Random seed (callers pass sim_seed + 3)
Returns:
X_factors: (sample_size, total_dummies) array
"""
- if seed is not None:
- np.random.seed(seed)
+ rng = np.random.RandomState(seed)
if not factor_specs:
return np.empty((sample_size, 0), dtype=float)
@@ -141,7 +145,7 @@ def _generate_factors(sample_size, factor_specs, seed):
for spec in factor_specs:
n_levels = spec["n_levels"]
proportions = spec["proportions"]
- factor_data = np.random.choice(n_levels, size=sample_size, p=proportions)
+ factor_data = rng.choice(n_levels, size=sample_size, p=proportions)
dummies = np.eye(n_levels, dtype=float)[factor_data]
factor_columns.append(dummies[:, 1:])
@@ -170,12 +174,11 @@ def bootstrap_uploaded_data(
X_non_factors: Non-factor variables (continuous + binary mapped to 0-1)
X_factors: Factor dummy variables
"""
- if seed is not None:
- np.random.seed(seed)
+ rng = np.random.RandomState(seed)
# Bootstrap whole rows
n_samples = raw_data.shape[0]
- row_indices = np.random.choice(n_samples, size=sample_size, replace=True)
+ row_indices = rng.choice(n_samples, size=sample_size, replace=True)
bootstrapped_data = raw_data[row_indices, :]
# Separate by type
@@ -286,12 +289,17 @@ def _generate_cluster_effects(
Returns:
X_cluster: (sample_size, n_cluster_vars) array of random effect columns
"""
- if sim_seed is not None:
- # Use a derived seed to avoid collision with X generation seed
- np.random.seed(sim_seed + 3)
+ rng = np.random.RandomState(sim_seed + 4 if sim_seed is not None else None)
columns = []
+ # Extract perturbation defaults once
+ perturb = lme_perturbations or {}
+ tau_mults = perturb.get("tau_squared_multipliers", {})
+ re_dist_val = perturb.get("random_effect_dist", "normal")
+ re_df_val = perturb.get("random_effect_df", 5)
+ has_perturb = lme_perturbations is not None
+
for gv, spec in cluster_specs.items():
n_clusters = spec["n_clusters"]
cluster_size = spec["cluster_size"]
@@ -302,19 +310,16 @@ def _generate_cluster_effects(
cluster_size = sample_size // n_clusters
# Apply LME perturbations if present
- if lme_perturbations is not None:
- multiplier = lme_perturbations["tau_squared_multipliers"].get(gv, 1.0)
- tau_sq = tau_sq * multiplier
+ if has_perturb:
+ tau_sq = tau_sq * tau_mults.get(gv, 1.0)
tau = np.sqrt(tau_sq)
# Generate random intercepts (possibly non-normal)
- if lme_perturbations is not None:
- re_dist = lme_perturbations.get("random_effect_dist", "normal")
- re_df = lme_perturbations.get("random_effect_df", 5)
- random_intercepts = _generate_non_normal_intercepts(n_clusters, tau, re_dist, re_df)
+ if has_perturb:
+ random_intercepts = _generate_non_normal_intercepts(n_clusters, tau, re_dist_val, re_df_val, rng_state=rng)
else:
- random_intercepts = np.random.normal(0, tau, size=n_clusters)
+ random_intercepts = rng.normal(0, tau, size=n_clusters)
# Create id_effect column: repeat each cluster's intercept
# cluster_id assignment: [0,0,...,0, 1,1,...,1, ..., K-1,K-1,...,K-1]
@@ -413,14 +418,20 @@ def _generate_random_effects(
A :class:`RandomEffectsResult` with intercept columns, slope
contributions, cluster IDs, Z matrices, and nesting metadata.
"""
- if sim_seed is not None:
- np.random.seed(sim_seed + 3)
+ rng = np.random.RandomState(sim_seed + 4 if sim_seed is not None else None)
intercept_cols: List[np.ndarray] = []
slope_contribution = np.zeros(sample_size)
cluster_ids_dict: Dict[str, np.ndarray] = {}
Z_matrices: Dict[str, np.ndarray] = {}
+ # Extract perturbation defaults once (avoids repeated dict lookups)
+ perturb = lme_perturbations or {}
+ tau_multipliers = perturb.get("tau_squared_multipliers", {})
+ re_dist = perturb.get("random_effect_dist", "normal")
+ re_df = perturb.get("random_effect_df", 5)
+ has_perturbations = lme_perturbations is not None
+
# Nested model bookkeeping
child_to_parent: Optional[np.ndarray] = None
K_parent = 0
@@ -460,19 +471,16 @@ def _generate_random_effects(
# Apply LME perturbations: ICC jitter on tau_squared
tau_sq = spec["tau_squared"]
- if lme_perturbations is not None:
- multiplier = lme_perturbations["tau_squared_multipliers"].get(gv, 1.0)
- tau_sq = tau_sq * multiplier
+ if has_perturbations:
+ tau_sq = tau_sq * tau_multipliers.get(gv, 1.0)
if q == 1:
# --- Random intercept only ---
tau = np.sqrt(tau_sq)
- if lme_perturbations is not None:
- re_dist = lme_perturbations.get("random_effect_dist", "normal")
- re_df = lme_perturbations.get("random_effect_df", 5)
- random_intercepts = _generate_non_normal_intercepts(n_clusters, tau, re_dist, re_df)
+ if has_perturbations:
+ random_intercepts = _generate_non_normal_intercepts(n_clusters, tau, re_dist, re_df, rng_state=rng)
else:
- random_intercepts = np.random.normal(0, tau, size=n_clusters)
+ random_intercepts = rng.normal(0, tau, size=n_clusters)
id_effect = _trim_or_pad(np.repeat(random_intercepts, cluster_size), sample_size)
intercept_cols.append(id_effect)
@@ -482,33 +490,29 @@ def _generate_random_effects(
slope_vars = spec.get("random_slope_vars", [])
# Apply ICC jitter to G_matrix intercept variance
- if lme_perturbations is not None:
+ if has_perturbations:
ratio = tau_sq / spec["tau_squared"] if spec["tau_squared"] > 0 else 1.0
# Scale intercept row/column of G by sqrt(ratio)
sqrt_ratio = np.sqrt(ratio)
G_matrix[0, :] *= sqrt_ratio
G_matrix[:, 0] *= sqrt_ratio
- # Draw correlated [b_int, b_slope1, ...] per cluster
- re_dist = lme_perturbations.get("random_effect_dist", "normal") if lme_perturbations else "normal"
- re_df = lme_perturbations.get("random_effect_df", 5) if lme_perturbations else 5
-
- if re_dist == "heavy_tailed" and lme_perturbations is not None:
+ if re_dist == "heavy_tailed" and has_perturbations:
# Multivariate t: MVN(0, G * (df-2)/df) × sqrt(df / chi2(df))
df = max(re_df, 3)
G_scaled = G_matrix * ((df - 2.0) / df)
- b_normal = np.random.multivariate_normal(np.zeros(q), G_scaled, size=n_clusters)
- chi2_samples = np.random.chisquare(df, size=n_clusters)
+ b_normal = rng.multivariate_normal(np.zeros(q), G_scaled, size=n_clusters)
+ chi2_samples = rng.chisquare(df, size=n_clusters)
mixing = np.sqrt(df / chi2_samples)
b = b_normal * mixing[:, np.newaxis]
- elif re_dist == "skewed" and lme_perturbations is not None:
+ elif re_dist == "skewed" and has_perturbations:
# Independent skewed marginals via shifted chi-squared, scaled by Cholesky
df = max(re_df, 3)
L = np.linalg.cholesky(G_matrix)
- raw = (np.random.chisquare(df, size=(n_clusters, q)) - df) / np.sqrt(2 * df)
+ raw = (rng.chisquare(df, size=(n_clusters, q)) - df) / np.sqrt(2 * df)
b = raw @ L.T
else:
- b = np.random.multivariate_normal(np.zeros(q), G_matrix, size=n_clusters)
+ b = rng.multivariate_normal(np.zeros(q), G_matrix, size=n_clusters)
# Intercept component
intercept_effect = _trim_or_pad(np.repeat(b[:, 0], cluster_size), sample_size)
@@ -549,21 +553,19 @@ def _generate_random_effects(
tau_sq_parent = p_spec["tau_squared"]
tau_sq_child = c_spec["tau_squared"]
- if lme_perturbations is not None:
- tau_sq_parent *= lme_perturbations["tau_squared_multipliers"].get(p_gv, 1.0)
- tau_sq_child *= lme_perturbations["tau_squared_multipliers"].get(c_gv, 1.0)
+ if has_perturbations:
+ tau_sq_parent *= tau_multipliers.get(p_gv, 1.0)
+ tau_sq_child *= tau_multipliers.get(c_gv, 1.0)
tau_parent = np.sqrt(tau_sq_parent)
tau_child = np.sqrt(tau_sq_child)
- if lme_perturbations is not None:
- re_dist = lme_perturbations.get("random_effect_dist", "normal")
- re_df = lme_perturbations.get("random_effect_df", 5)
- b_parent = _generate_non_normal_intercepts(K_parent, tau_parent, re_dist, re_df)
- b_child = _generate_non_normal_intercepts(K_child, tau_child, re_dist, re_df)
+ if has_perturbations:
+ b_parent = _generate_non_normal_intercepts(K_parent, tau_parent, re_dist, re_df, rng_state=rng)
+ b_child = _generate_non_normal_intercepts(K_child, tau_child, re_dist, re_df, rng_state=rng)
else:
- b_parent = np.random.normal(0, tau_parent, size=K_parent)
- b_child = np.random.normal(0, tau_child, size=K_child)
+ b_parent = rng.normal(0, tau_parent, size=K_parent)
+ b_child = rng.normal(0, tau_child, size=K_child)
# IDs: parent_ids assigns each observation to a parent cluster,
# child_ids assigns each observation to a child cluster.
diff --git a/mcpower/stats/distributions.py b/mcpower/stats/distributions.py
index cf13bca..6ee5265 100644
--- a/mcpower/stats/distributions.py
+++ b/mcpower/stats/distributions.py
@@ -3,9 +3,7 @@
Provides F, t, chi2, normal, and studentized range distribution
functions plus batch critical-value computation and table generation.
-Backend priority:
- 1. C++ native (Boost.Math + R Tukey port) via mcpower_native
- 2. scipy (optional shim, for when C++ is not compiled)
+All functions are provided by the C++ native backend (Boost.Math + R Tukey port).
Usage:
from mcpower.stats.distributions import norm_ppf, compute_critical_values_ols
@@ -14,13 +12,11 @@
import numpy as np
# ============================================================================
-# Backend selection
+# Backend — native C++ only
# ============================================================================
-_BACKEND = None
-
try:
- from mcpower.backends.mcpower_native import ( # type: ignore[import]
+ from mcpower.backends.mcpower_native import ( # type: ignore[import] # noqa: F401
chi2_cdf,
chi2_ppf,
compute_critical_values_lme,
@@ -36,171 +32,17 @@
t_ppf,
)
- _BACKEND = "native"
-
-except ImportError:
- # -------------------------------------------------------------------
- # scipy shim -- temporary fallback for when C++ is not compiled.
- # Will be removed when Python fallback backends are fully dropped.
- # -------------------------------------------------------------------
- try:
- from scipy.stats import ( # isort: skip
- chi2 as _chi2_dist,
- f as _f_dist,
- norm as _norm_dist,
- studentized_range as _sr_dist,
- t as _t_dist,
- )
-
- def norm_ppf(p): # noqa: F811
- """Standard normal quantile function (inverse CDF)."""
- return float(_norm_dist.ppf(p))
-
- def norm_cdf(x): # noqa: F811
- """Standard normal CDF."""
- return float(_norm_dist.cdf(x))
-
- def t_ppf(p, df): # noqa: F811
- """Student's t quantile function."""
- return float(_t_dist.ppf(p, df))
-
- def f_ppf(p, dfn, dfd): # noqa: F811
- """Fisher F quantile function."""
- return float(_f_dist.ppf(p, dfn, dfd))
-
- def chi2_ppf(p, df): # noqa: F811
- """Chi-squared quantile function."""
- return float(_chi2_dist.ppf(p, df))
-
- def chi2_cdf(x, df): # noqa: F811
- """Chi-squared CDF."""
- return float(_chi2_dist.cdf(x, df))
-
- def studentized_range_ppf(p, k, df): # noqa: F811
- """Studentized range quantile (Tukey). k=groups, df=denom df."""
- if df < 2 or k < 2 or k > 200 or p <= 0.0 or p >= 1.0:
- return float("inf")
- return float(_sr_dist.ppf(p, k, df))
-
- def compute_critical_values_ols(alpha, dfn, dfd, n_targets, correction_method): # noqa: F811
- """Compute OLS critical values using scipy (fallback).
-
- Args:
- alpha: Significance level.
- dfn: Numerator degrees of freedom (number of predictors).
- dfd: Denominator degrees of freedom (n - p - 1).
- n_targets: Number of individual effects being tested.
- correction_method: 0=none, 1=Bonferroni, 2=FDR (BH), 3=Holm.
-
- Returns:
- Tuple of (f_crit, t_crit, correction_t_crits) where
- correction_t_crits is an ndarray of length n_targets.
- """
- if dfd <= 0:
- return np.inf, np.inf, np.full(max(n_targets, 1), np.inf)
-
- f_crit = _f_dist.ppf(1 - alpha, dfn, dfd) if dfn > 0 else np.inf
- t_crit = _t_dist.ppf(1 - alpha / 2, dfd)
-
- m = n_targets
- if m == 0:
- return f_crit, t_crit, np.empty(0)
-
- if correction_method == 0: # None
- correction_t_crits = np.full(m, t_crit)
- elif correction_method == 1: # Bonferroni
- bonf_crit = _t_dist.ppf(1 - alpha / (2 * m), dfd)
- correction_t_crits = np.full(m, bonf_crit)
- elif correction_method == 2: # FDR (Benjamini-Hochberg)
- correction_t_crits = np.array(
- [_t_dist.ppf(1 - (k + 1) / m * alpha / 2, dfd) if (k + 1) / m * alpha / 2 >= 1e-12 else np.inf for k in range(m)]
- )
- elif correction_method == 3: # Holm
- correction_t_crits = np.array(
- [_t_dist.ppf(1 - alpha / (2 * (m - k)), dfd) if alpha / (2 * (m - k)) >= 1e-12 else np.inf for k in range(m)]
- )
- else:
- correction_t_crits = np.full(m, t_crit)
-
- return f_crit, t_crit, correction_t_crits
-
- def compute_tukey_critical_value(alpha, n_levels, dfd): # noqa: F811
- """Compute Tukey HSD critical value (q / sqrt(2))."""
- if dfd <= 0:
- return np.inf
- q_crit = _sr_dist.ppf(1 - alpha, n_levels, dfd)
- return q_crit / np.sqrt(2)
-
- def compute_critical_values_lme(alpha, n_fixed, n_targets, correction_method): # noqa: F811
- """Compute LME critical values using scipy (fallback).
-
- Args:
- alpha: Significance level.
- n_fixed: Number of fixed effects (excluding intercept).
- n_targets: Number of individual effects being tested.
- correction_method: 0=none, 1=Bonferroni, 2=FDR (BH), 3=Holm.
-
- Returns:
- Tuple of (chi2_crit, z_crit, correction_z_crits) where
- correction_z_crits is an ndarray of length n_targets.
- """
- chi2_crit = _chi2_dist.ppf(1 - alpha, n_fixed) if n_fixed > 0 else np.inf
- z_crit = _norm_dist.ppf(1 - alpha / 2)
-
- m = n_targets
- if m == 0:
- return chi2_crit, z_crit, np.empty(0)
-
- if correction_method == 0: # None
- correction_z_crits = np.full(m, z_crit)
- elif correction_method == 1: # Bonferroni
- bonf = _norm_dist.ppf(1 - alpha / (2 * m))
- correction_z_crits = np.full(m, bonf)
- elif correction_method == 2: # FDR (Benjamini-Hochberg)
- correction_z_crits = np.array(
- [_norm_dist.ppf(1 - (k + 1) / m * alpha / 2) if (k + 1) / m * alpha / 2 >= 1e-12 else np.inf for k in range(m)]
- )
- elif correction_method == 3: # Holm
- correction_z_crits = np.array(
- [_norm_dist.ppf(1 - alpha / (2 * (m - k))) if alpha / (2 * (m - k)) >= 1e-12 else np.inf for k in range(m)]
- )
- else:
- correction_z_crits = np.full(m, z_crit)
-
- return chi2_crit, z_crit, correction_z_crits
-
- def generate_norm_cdf_table(x_min, x_max, resolution): # noqa: F811
- """Generate normal CDF lookup table."""
- x = np.linspace(x_min, x_max, resolution)
- return _norm_dist.cdf(x).astype(np.float64)
-
- def generate_t3_ppf_table(perc_min, perc_max, resolution): # noqa: F811
- """Generate t(3) PPF lookup table (divided by sqrt(3))."""
- p = np.linspace(perc_min, perc_max, resolution)
- return (_t_dist.ppf(p, 3) / np.sqrt(3)).astype(np.float64)
-
- def norm_ppf_array(percentiles): # noqa: F811
- """Vectorized normal PPF for percentile array."""
- return _norm_dist.ppf(np.asarray(percentiles)).astype(np.float64)
-
- _BACKEND = "scipy"
-
- except ImportError as exc:
- raise ImportError(
- "No distribution backend available. "
- "Install from PyPI for prebuilt C++ wheels: pip install MCPower\n"
- "Or install scipy as fallback: pip install scipy"
- ) from exc
+except ImportError as exc:
+ raise ImportError("Native C++ backend not available. Install from PyPI for prebuilt wheels: pip install MCPower") from exc
# ============================================================================
-# Also re-export scipy optimizer shims for lme_solver.py
-# These replace scipy.optimize.minimize and minimize_scalar
+# Optimizer wrappers for lme_solver.py
# ============================================================================
def minimize_lbfgsb(objective, x0, bounds, maxiter=200, ftol=1e-10, gtol=1e-6):
- """L-BFGS-B minimization -- C++ native or scipy fallback.
+ """L-BFGS-B minimization via native C++ backend.
Args:
objective: Callable f(x) -> float
@@ -213,53 +55,15 @@ def minimize_lbfgsb(objective, x0, bounds, maxiter=200, ftol=1e-10, gtol=1e-6):
Returns:
Object with .x (optimal point), .fun (optimal value), .converged (bool)
"""
- if _BACKEND == "native":
- try:
- from mcpower.backends.mcpower_native import lbfgsb_minimize_fd # type: ignore[import]
-
- lb = np.array([b[0] for b in bounds])
- ub = np.array([b[1] for b in bounds])
- return lbfgsb_minimize_fd(objective, np.asarray(x0, dtype=np.float64), lb, ub, maxiter, ftol, gtol)
- except ImportError:
- import warnings
-
- warnings.warn(
- "Native L-BFGS-B optimizer not available despite native backend being loaded. Falling back to scipy.",
- RuntimeWarning,
- stacklevel=2,
- )
- except Exception as e:
- import warnings
- warnings.warn(
- f"Native L-BFGS-B optimizer failed ({type(e).__name__}: {e}), falling back to scipy.",
- RuntimeWarning,
- stacklevel=2,
- )
-
- # scipy fallback
- from scipy.optimize import minimize
-
- result = minimize(
- objective,
- x0,
- method="L-BFGS-B",
- bounds=bounds,
- options={"maxiter": maxiter, "ftol": ftol, "gtol": gtol},
- )
-
- class _Result:
- __slots__ = ("x", "fun", "converged")
-
- r = _Result()
- r.x = result.x
- r.fun = result.fun
- r.converged = result.success
- return r
+    from mcpower.backends.mcpower_native import lbfgsb_minimize_fd  # type: ignore[import]
+
+    lb = np.array([b[0] for b in bounds])
+    ub = np.array([b[1] for b in bounds])
+    return lbfgsb_minimize_fd(objective, np.asarray(x0, dtype=np.float64), lb, ub, maxiter, ftol, gtol)
def minimize_scalar_brent(objective, bounds, tol=1e-8, maxiter=150):
- """Brent 1D minimization -- C++ native or scipy fallback.
+ """Brent 1D minimization via native C++ backend.
Args:
objective: Callable f(x) -> float
@@ -270,43 +74,6 @@ def minimize_scalar_brent(objective, bounds, tol=1e-8, maxiter=150):
Returns:
Object with .x (optimal point), .fun (optimal value), .converged (bool)
"""
- if _BACKEND == "native":
- try:
- from mcpower.backends.mcpower_native import brent_minimize_scalar # type: ignore[import]
-
- return brent_minimize_scalar(objective, bounds[0], bounds[1], tol, maxiter)
- except ImportError:
- import warnings
-
- warnings.warn(
- "Native Brent optimizer not available despite native backend being loaded. Falling back to scipy.",
- RuntimeWarning,
- stacklevel=2,
- )
- except Exception as e:
- import warnings
-
- warnings.warn(
- f"Native Brent optimizer failed ({type(e).__name__}: {e}), falling back to scipy.",
- RuntimeWarning,
- stacklevel=2,
- )
-
- # scipy fallback
- from scipy.optimize import minimize_scalar
-
- result = minimize_scalar(
- objective,
- bounds=bounds,
- method="bounded",
- options={"xatol": tol, "maxiter": maxiter},
- )
-
- class _Result:
- __slots__ = ("x", "fun", "converged")
- r = _Result()
- r.x = result.x
- r.fun = result.fun
- r.converged = bool(getattr(result, "success", True))
- return r
+    from mcpower.backends.mcpower_native import brent_minimize_scalar  # type: ignore[import]
+
+    return brent_minimize_scalar(objective, bounds[0], bounds[1], tol, maxiter)
diff --git a/mcpower/stats/mixed_models.py b/mcpower/stats/mixed_models.py
index 414e911..07de5e8 100644
--- a/mcpower/stats/mixed_models.py
+++ b/mcpower/stats/mixed_models.py
@@ -11,10 +11,12 @@
import threading
import warnings
-from typing import Any, Dict, List, Optional, Union
+from typing import Any, Dict, Optional, Union
import numpy as np
+from ..backends.native import _prep
+
# Suppress statsmodels convergence warnings (expected with small samples/low ICC).
# Module-level filterwarnings with module= is unreliable for statsmodels internals,
# so we also use catch_warnings() context managers around .fit() calls below.
@@ -30,7 +32,6 @@ def _lme_analysis_wrapper(
y: np.ndarray,
target_indices: np.ndarray,
cluster_ids: np.ndarray,
- cluster_column_indices: List[int],
correction_method: int,
alpha: float,
backend: str = "custom",
@@ -51,7 +52,6 @@ def _lme_analysis_wrapper(
y: (n,) response vector
target_indices: Coefficient indices to test (fixed effects only)
cluster_ids: (n,) cluster membership array [0,0,0, 1,1,1, ...]
- cluster_column_indices: Indices of cluster effect columns (unused)
correction_method: 0=none, 1=Bonferroni, 2=FDR, 3=Holm
alpha: Significance level
backend: "custom" (default) or "statsmodels" (fallback)
@@ -118,9 +118,7 @@ def _lme_analysis_wrapper(
verbose=verbose,
)
elif backend == "statsmodels":
- return _lme_analysis_statsmodels(
- X_expanded, y, target_indices, cluster_ids, cluster_column_indices, correction_method, alpha, verbose
- )
+ return _lme_analysis_statsmodels(X_expanded, y, target_indices, cluster_ids, correction_method, alpha, verbose)
else:
raise ValueError(f"Unknown backend: {backend}")
@@ -130,7 +128,6 @@ def _lme_analysis_statsmodels(
y: np.ndarray,
target_indices: np.ndarray,
cluster_ids: np.ndarray,
- cluster_column_indices: List[int],
correction_method: int,
alpha: float,
verbose: bool = False,
@@ -145,11 +142,10 @@ def _lme_analysis_statsmodels(
- Convergence retry strategy (allows ≤3% failures)
Args:
- X_expanded: (n, p) design matrix (includes cluster effect columns)
+ X_expanded: (n, p) design matrix (excludes cluster effect columns)
y: (n,) response vector
target_indices: Coefficient indices to test (fixed effects only)
cluster_ids: (n,) cluster membership array
- cluster_column_indices: Indices of cluster effect columns to remove
correction_method: 0=none, 1=Bonferroni, 2=FDR, 3=Holm
alpha: Significance level
verbose: Return detailed diagnostics
@@ -172,8 +168,6 @@ def _lme_analysis_statsmodels(
n, p = X_expanded.shape
n_targets = len(target_indices)
- # Note: X_expanded already excludes cluster effects (they're not in the design matrix)
- # cluster_column_indices is now unused in this function but kept for API compatibility
X_fixed = X_expanded
# Step 1: Add intercept to fixed effects
@@ -530,6 +524,29 @@ def _compute_wald_test(result, alpha):
return results_array
+def _ensure_lme_crits(alpha, p, n_targets, correction_method, chi2_crit, z_crit, correction_z_crits):
+ """Compute LME critical values on-the-fly if not precomputed."""
+ if z_crit is None or chi2_crit is None or correction_z_crits is None:
+ from .lme_solver import compute_lme_critical_values
+
+ return compute_lme_critical_values(alpha, p, n_targets, correction_method)
+ return chi2_crit, z_crit, correction_z_crits
+
+
+def _wrap_native_result(result, verbose, solver_name, extra_diag=None) -> Optional[Union[np.ndarray, Dict]]:
+ """Wrap C++ solver result with optional verbose diagnostics."""
+ if len(result) > 0:
+ if verbose:
+ diag = {"solver": solver_name}
+ if extra_diag:
+ diag.update(extra_diag)
+ return {"results": result, "diagnostics": diag}
+ return np.asarray(result)
+ if verbose:
+ return {"results": None, "failure_reason": f"C++ {solver_name} returned empty result"}
+ return None
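+# Return-shape sketch (illustrative): a successful solve with verbose=True yields
+# {"results": <ndarray>, "diagnostics": {"solver": "native_q1"}} (plus extra keys such as "q"
+# for the general solver), verbose=False yields just the results array, and an empty C++
+# result yields None, or a failure dict when verbose.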
+
+
def _lme_analysis_custom(
X_expanded: np.ndarray,
y: np.ndarray,
@@ -545,39 +562,29 @@ def _lme_analysis_custom(
"""LME analysis for random-intercept models via C++ backend.
Uses precomputed critical values (chi2_crit, z_crit) to avoid
- per-simulation scipy calls. Falls back to computing them if not provided.
+ per-simulation distribution calls. Falls back to computing them if not provided.
"""
n, p = X_expanded.shape
n_targets = len(target_indices)
K = int(cluster_ids.max()) + 1
- if z_crit is None or chi2_crit is None or correction_z_crits is None:
- from .lme_solver import compute_lme_critical_values
-
- chi2_crit, z_crit, correction_z_crits = compute_lme_critical_values(alpha, p, n_targets, correction_method)
+ chi2_crit, z_crit, correction_z_crits = _ensure_lme_crits(alpha, p, n_targets, correction_method, chi2_crit, z_crit, correction_z_crits)
from mcpower.backends import mcpower_native as _native # type: ignore[attr-defined]
result = _native.lme_analysis(
- np.ascontiguousarray(X_expanded, dtype=np.float64),
- np.ascontiguousarray(y, dtype=np.float64),
- np.ascontiguousarray(cluster_ids, dtype=np.int32),
+ _prep(X_expanded),
+ _prep(y),
+ _prep(cluster_ids, np.int32),
K,
- np.ascontiguousarray(target_indices, dtype=np.int32),
+ _prep(target_indices, np.int32),
float(chi2_crit),
float(z_crit),
- np.ascontiguousarray(correction_z_crits, dtype=np.float64),
+ _prep(correction_z_crits),
int(correction_method),
float(-1.0),
)
- if len(result) > 0:
- if verbose:
- return {"results": result, "diagnostics": {"solver": "native_q1"}}
- return result # type: ignore[no-any-return]
-
- if verbose:
- return {"results": None, "failure_reason": "C++ solver returned empty result"}
- return None
+ return _wrap_native_result(result, verbose, "native_q1")
def _lme_analysis_custom_general(
@@ -594,8 +601,6 @@ def _lme_analysis_custom_general(
verbose: bool = False,
) -> Optional[Union[np.ndarray, Dict]]:
"""LME analysis for random slopes (q > 1) via C++ backend."""
- from .lme_solver import compute_lme_critical_values
-
n, p = X_expanded.shape
n_targets = len(target_indices)
@@ -604,34 +609,26 @@ def _lme_analysis_custom_general(
q = Z.shape[1]
K = int(cluster_ids.max()) + 1
- if z_crit is None or chi2_crit is None or correction_z_crits is None:
- chi2_crit, z_crit, correction_z_crits = compute_lme_critical_values(alpha, p, n_targets, correction_method)
+ chi2_crit, z_crit, correction_z_crits = _ensure_lme_crits(alpha, p, n_targets, correction_method, chi2_crit, z_crit, correction_z_crits)
from mcpower.backends import mcpower_native as _native # type: ignore[attr-defined]
warm_theta_arr = np.empty(0, dtype=np.float64)
result = _native.lme_analysis_general(
- np.ascontiguousarray(X_expanded, dtype=np.float64),
- np.ascontiguousarray(y, dtype=np.float64),
- np.ascontiguousarray(Z, dtype=np.float64),
- np.ascontiguousarray(cluster_ids, dtype=np.int32),
+ _prep(X_expanded),
+ _prep(y),
+ _prep(Z),
+ _prep(cluster_ids, np.int32),
K,
q,
- np.ascontiguousarray(target_indices, dtype=np.int32),
+ _prep(target_indices, np.int32),
float(chi2_crit),
float(z_crit),
- np.ascontiguousarray(correction_z_crits, dtype=np.float64),
+ _prep(correction_z_crits),
int(correction_method),
warm_theta_arr,
)
- if len(result) > 0:
- if verbose:
- return {"results": result, "diagnostics": {"solver": "native_general", "q": q}}
- return result # type: ignore[no-any-return]
-
- if verbose:
- return {"results": None, "failure_reason": "C++ general solver returned empty result"}
- return None
+ return _wrap_native_result(result, verbose, "native_general", extra_diag={"q": q})
def _lme_analysis_custom_nested(
@@ -647,8 +644,6 @@ def _lme_analysis_custom_nested(
verbose: bool = False,
) -> Optional[Union[np.ndarray, Dict]]:
"""LME analysis for nested random intercepts via C++ backend."""
- from .lme_solver import compute_lme_critical_values
-
n, p = X_expanded.shape
n_targets = len(target_indices)
@@ -658,35 +653,27 @@ def _lme_analysis_custom_nested(
K_child = re_result.K_child
child_to_parent = re_result.child_to_parent
- if z_crit is None or chi2_crit is None or correction_z_crits is None:
- chi2_crit, z_crit, correction_z_crits = compute_lme_critical_values(alpha, p, n_targets, correction_method)
+ chi2_crit, z_crit, correction_z_crits = _ensure_lme_crits(alpha, p, n_targets, correction_method, chi2_crit, z_crit, correction_z_crits)
from mcpower.backends import mcpower_native as _native # type: ignore[attr-defined]
warm_theta_arr = np.empty(0, dtype=np.float64)
result = _native.lme_analysis_nested(
- np.ascontiguousarray(X_expanded, dtype=np.float64),
- np.ascontiguousarray(y, dtype=np.float64),
- np.ascontiguousarray(parent_ids, dtype=np.int32),
- np.ascontiguousarray(child_ids, dtype=np.int32),
+ _prep(X_expanded),
+ _prep(y),
+ _prep(parent_ids, np.int32),
+ _prep(child_ids, np.int32),
K_parent,
K_child,
- np.ascontiguousarray(child_to_parent, dtype=np.int32),
- np.ascontiguousarray(target_indices, dtype=np.int32),
+ _prep(child_to_parent, np.int32),
+ _prep(target_indices, np.int32),
float(chi2_crit),
float(z_crit),
- np.ascontiguousarray(correction_z_crits, dtype=np.float64),
+ _prep(correction_z_crits),
int(correction_method),
warm_theta_arr,
)
- if len(result) > 0:
- if verbose:
- return {"results": result, "diagnostics": {"solver": "native_nested", "K_parent": K_parent, "K_child": K_child}}
- return result # type: ignore[no-any-return]
-
- if verbose:
- return {"results": None, "failure_reason": "C++ nested solver returned empty result"}
- return None
+ return _wrap_native_result(result, verbose, "native_nested", extra_diag={"K_parent": K_parent, "K_child": K_child})
def reset_warm_start_cache():
diff --git a/mcpower/tables/lookup.py b/mcpower/tables/lookup.py
index b66fefa..5f4d691 100644
--- a/mcpower/tables/lookup.py
+++ b/mcpower/tables/lookup.py
@@ -15,7 +15,7 @@ class LookupTableManager:
"""Manages pre-computed lookup tables for data-generation transforms.
Tables are lazily loaded from disk (``tables/data/*.npz``) on first
- access and generated from scipy if the cache files are missing.
+ access and generated via the C++ native backend if the cache files are missing.
The C++ native backend consumes these tables for distribution
transforms.
@@ -47,47 +47,37 @@ def ensure_data_dir(self) -> None:
"""Ensure data directory exists."""
self.data_dir.mkdir(parents=True, exist_ok=True)
- def load_norm_cdf_table(self) -> np.ndarray:
- """Load (or generate and cache) the normal CDF lookup table.
+ def _load_table(self, key: str, generate_fn) -> np.ndarray:
+ """Load a table from cache, disk, or generate it on the fly.
+
+ Args:
+ key: Cache key and npz array name (e.g. ``"norm_cdf"``).
+ generate_fn: Bound method to generate and cache the table.
Returns:
1-D float64 array of length ``DIST_RESOLUTION``.
"""
- if "norm_cdf" in self._tables:
- return self._tables["norm_cdf"]
-
- cache_file = self.data_dir / "norm_cdf.npz"
+ if key in self._tables:
+ return self._tables[key]
+ cache_file = self.data_dir / f"{key}.npz"
try:
data = np.load(cache_file)
- self._tables["norm_cdf"] = data["norm_cdf"]
- return self._tables["norm_cdf"]
+ self._tables[key] = data[key]
+ return self._tables[key]
except (FileNotFoundError, KeyError):
pass
- self._generate_norm_cdf_table()
- return self._tables["norm_cdf"]
-
- def load_t3_ppf_table(self) -> np.ndarray:
- """Load (or generate and cache) the t(df=3) PPF lookup table.
+ generate_fn()
+ return self._tables[key]
- Returns:
- 1-D float64 array of length ``DIST_RESOLUTION``.
- """
- if "t3_ppf" in self._tables:
- return self._tables["t3_ppf"]
-
- cache_file = self.data_dir / "t3_ppf.npz"
-
- try:
- data = np.load(cache_file)
- self._tables["t3_ppf"] = data["t3_ppf"]
- return self._tables["t3_ppf"]
- except (FileNotFoundError, KeyError):
- pass
+ def load_norm_cdf_table(self) -> np.ndarray:
+ """Load (or generate and cache) the normal CDF lookup table."""
+ return self._load_table("norm_cdf", self._generate_norm_cdf_table)
- self._generate_t3_ppf_table()
- return self._tables["t3_ppf"]
+ def load_t3_ppf_table(self) -> np.ndarray:
+ """Load (or generate and cache) the t(df=3) PPF lookup table."""
+ return self._load_table("t3_ppf", self._generate_t3_ppf_table)
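+    # Illustrative lookup flow for _load_table("norm_cdf", ...): the in-memory cache is
+    # checked first, then tables/data/norm_cdf.npz on disk, then the table is regenerated
+    # via the C++ native backend if both miss.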
def load_all_generation_tables(self) -> Tuple[np.ndarray, np.ndarray]:
"""
@@ -110,6 +100,8 @@ def _generate_norm_cdf_table(self) -> None:
self._tables["norm_cdf"] = norm_cdf
self.ensure_data_dir()
+ # Silently ignore cache write failures (e.g. read-only filesystem,
+ # permission denied). Tables are still usable from memory.
try:
np.savez_compressed(self.data_dir / "norm_cdf.npz", norm_cdf=norm_cdf, x_range=x_norm)
except Exception:
@@ -125,6 +117,8 @@ def _generate_t3_ppf_table(self) -> None:
self._tables["t3_ppf"] = t3_ppf
self.ensure_data_dir()
+ # Silently ignore cache write failures (e.g. read-only filesystem,
+ # permission denied). Tables are still usable from memory.
try:
np.savez_compressed(
self.data_dir / "t3_ppf.npz",
diff --git a/mcpower/utils/formatters.py b/mcpower/utils/formatters.py
index b8bd0b0..8bc91b3 100644
--- a/mcpower/utils/formatters.py
+++ b/mcpower/utils/formatters.py
@@ -6,6 +6,7 @@
"""
import math
+from itertools import combinations
from typing import Any, Dict, List, Optional
import numpy as np
@@ -13,6 +14,11 @@
__all__ = []
+def _is_nan(value) -> bool:
+ """Check if a value is NaN (float type check + math.isnan)."""
+ return isinstance(value, float) and math.isnan(value)
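+# e.g. _is_nan(float("nan")) and _is_nan(np.nan) are True; _is_nan("nan") and _is_nan(None)
+# are False, since only float instances are tested.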
+
+
class _TableFormatter:
"""Static helpers for building fixed-width text tables."""
@@ -25,7 +31,10 @@ def _create_table(
"""Create formatted table with headers and rows."""
if not col_widths:
- col_widths = [max(len(str(h)), max(len(str(row[i])) + 2 for row in rows)) for i, h in enumerate(headers)]
+ if rows:
+ col_widths = [max(len(str(h)), max(len(str(row[i])) + 2 for row in rows)) for i, h in enumerate(headers)]
+ else:
+ col_widths = [len(str(h)) for h in headers]
lines = []
@@ -131,7 +140,7 @@ def _format_short_power(self, data: Dict) -> str:
for test in model["target_tests"]:
power_corr = results["individual_powers_corrected"][test]
- if isinstance(power_corr, float) and math.isnan(power_corr):
+ if _is_nan(power_corr):
rows_corrected.append([test, "-", f"{target:.0f}", "-"])
else:
status = "✓" if power_corr >= target else "✗"
@@ -162,7 +171,7 @@ def _format_long_power(self, data: Dict) -> str:
power = results["individual_powers"][test]
power_corr = results.get("individual_powers_corrected", {}).get(test, power)
target = model.get("target_power", 80.0)
- if isinstance(power_corr, float) and math.isnan(power_corr):
+ if _is_nan(power_corr):
rows.append([test, f"{power:.2f}", "-", f"{target:.1f}", "-"])
else:
achieved = "✓" if power_corr >= target else "✗"
@@ -331,13 +340,15 @@ def _format_scenario_power_short(self, scenarios: Dict, target_tests: List[str],
lines = [f"\n{'=' * 80}", "SCENARIO SUMMARY", f"{'=' * 80}"]
+ scenario_names = list(scenarios.keys())
+ headers = ["Test"] + [name.title() for name in scenario_names]
+ col_widths = [40] + [12] * len(scenario_names)
+
# Uncorrected table
- headers = ["Test", "Optimistic", "Realistic", "Doomer"]
rows = []
-
for test in target_tests:
row = [test]
- for scenario in ["optimistic", "realistic", "doomer"]:
+ for scenario in scenario_names:
if scenario in scenarios and "results" in scenarios[scenario]:
power = scenarios[scenario]["results"]["individual_powers"][test]
row.append(f"{power:.1f}")
@@ -346,17 +357,17 @@ def _format_scenario_power_short(self, scenarios: Dict, target_tests: List[str],
rows.append(row)
lines.append("\nUncorrected Power:")
- lines.append(self._table._create_table(headers, rows, [40, 12, 12, 12]))
+ lines.append(self._table._create_table(headers, rows, col_widths))
# Corrected table if applicable
if correction:
rows_corr = []
for test in target_tests:
row = [test]
- for scenario in ["optimistic", "realistic", "doomer"]:
+ for scenario in scenario_names:
if scenario in scenarios and "results" in scenarios[scenario]:
power_corr = scenarios[scenario]["results"]["individual_powers_corrected"][test]
- if isinstance(power_corr, float) and math.isnan(power_corr):
+ if _is_nan(power_corr):
row.append("-")
else:
row.append(f"{power_corr:.1f}")
@@ -365,7 +376,7 @@ def _format_scenario_power_short(self, scenarios: Dict, target_tests: List[str],
rows_corr.append(row)
lines.append(f"\nCorrected Power ({correction}):")
- lines.append(self._table._create_table(headers, rows_corr, [40, 12, 12, 12]))
+ lines.append(self._table._create_table(headers, rows_corr, col_widths))
lines.append(f"{'=' * 80}")
@@ -395,74 +406,74 @@ def _format_scenario_power_long(
lines.append("DETAILED SCENARIO RESULTS")
lines.append(f"{'=' * 80}")
- for scenario_name in ["optimistic", "realistic", "doomer"]:
- if scenario_name in scenarios:
- lines.append(f"\n{'-' * 80}")
- lines.append(f"{scenario_name.upper()} SCENARIO")
- lines.append(f"{'-' * 80}")
-
- # Use regular power formatter for each scenario
- scenario_data = {
- "model": scenarios[scenario_name]["model"],
- "results": scenarios[scenario_name]["results"],
- }
- lines.append(self._format_long_power(scenario_data))
-
- # 3. Comparison analysis
- lines.append(f"\n{'=' * 80}")
- lines.append("ROBUSTNESS ANALYSIS")
- lines.append(f"{'=' * 80}")
-
- # Power reduction table
- headers = ["Test", "Opt→Real Drop", "Opt→Doom Drop", "Vulnerability"]
- rows = []
- vulnerable_tests = []
- inflated_tests = []
+ for scenario_name in scenarios:
+ lines.append(f"\n{'-' * 80}")
+ lines.append(f"{scenario_name.upper()} SCENARIO")
+ lines.append(f"{'-' * 80}")
+
+ scenario_data = {
+ "model": scenarios[scenario_name]["model"],
+ "results": scenarios[scenario_name]["results"],
+ }
+ lines.append(self._format_long_power(scenario_data))
+
+ # 3. Comparison analysis — compare each non-optimistic scenario to optimistic
+ if "optimistic" in scenarios and len(scenarios) > 1:
+ lines.append(f"\n{'=' * 80}")
+ lines.append("ROBUSTNESS ANALYSIS")
+ lines.append(f"{'=' * 80}")
+
+ other_scenarios = [s for s in scenarios if s != "optimistic"]
+ headers = ["Test"] + [f"Opt→{s.title()} Drop" for s in other_scenarios] + ["Vulnerability"]
+ rows = []
+ vulnerable_tests = []
+ inflated_tests = []
- for test in target_tests:
- opt_power = scenarios["optimistic"]["results"]["individual_powers"][test]
- real_power = scenarios.get("realistic", {}).get("results", {}).get("individual_powers", {}).get(test, opt_power)
- doom_power = scenarios.get("doomer", {}).get("results", {}).get("individual_powers", {}).get(test, opt_power)
-
- real_drop = opt_power - real_power
- doom_drop = opt_power - doom_power
-
- # Format drops with proper signs
- real_drop_str = f"+{abs(real_drop):.1f}%" if real_drop < 0 else f"-{real_drop:.1f}%"
- doom_drop_str = f"+{abs(doom_drop):.1f}%" if doom_drop < 0 else f"-{doom_drop:.1f}%"
-
- # Vulnerability assessment and categorization
- if doom_drop > HIGH_VULNERABILITY_THRESHOLD:
- vulnerability = "HIGH"
- vulnerable_tests.append(test)
- elif doom_drop > MEDIUM_VULNERABILITY_THRESHOLD:
- vulnerability = "MEDIUM"
- elif doom_drop < INFLATED_ERROR_THRESHOLD:
- vulnerability = "INFLATED FALSE POSITIVES"
- inflated_tests.append(test)
- else:
- vulnerability = "LOW"
+ for test in target_tests:
+ opt_power = scenarios["optimistic"]["results"]["individual_powers"][test]
+ row = [test]
+ max_drop = 0.0
+
+ for scenario in other_scenarios:
+ other_power = scenarios.get(scenario, {}).get("results", {}).get("individual_powers", {}).get(test, opt_power)
+ drop = opt_power - other_power
+ max_drop = max(max_drop, drop)
+ drop_str = f"+{abs(drop):.1f}%" if drop < 0 else f"-{drop:.1f}%"
+ row.append(drop_str)
+
+ if max_drop > HIGH_VULNERABILITY_THRESHOLD:
+ vulnerability = "HIGH"
+ vulnerable_tests.append(test)
+ elif max_drop > MEDIUM_VULNERABILITY_THRESHOLD:
+ vulnerability = "MEDIUM"
+ elif max_drop < INFLATED_ERROR_THRESHOLD:
+ vulnerability = "INFLATED FALSE POSITIVES"
+ inflated_tests.append(test)
+ else:
+ vulnerability = "LOW"
- rows.append([test, real_drop_str, doom_drop_str, vulnerability])
+ row.append(vulnerability)
+ rows.append(row)
- lines.append(self._table._create_table(headers, rows))
+ lines.append(self._table._create_table(headers, rows))
# 4. Recommendations
- lines.append(f"\n{'=' * 80}")
- lines.append("RECOMMENDATIONS")
- lines.append(f"{'=' * 80}")
+ if "optimistic" in scenarios and len(scenarios) > 1:
+ lines.append(f"\n{'=' * 80}")
+ lines.append("RECOMMENDATIONS")
+ lines.append(f"{'=' * 80}")
- if vulnerable_tests:
- lines.append(f"• High vulnerability tests: {', '.join(vulnerable_tests)}")
- lines.append("• Consider increasing sample size to maintain power under adverse conditions")
+ if vulnerable_tests:
+ lines.append(f"• High vulnerability tests: {', '.join(vulnerable_tests)}")
+ lines.append("• Consider increasing sample size to maintain power under adverse conditions")
- if inflated_tests:
- lines.append(f"• Inflated false positive risk: {', '.join(inflated_tests)}")
- lines.append("• Be careful about interpretation")
+ if inflated_tests:
+ lines.append(f"• Inflated false positive risk: {', '.join(inflated_tests)}")
+ lines.append("• Be careful about interpretation")
- if not vulnerable_tests and not inflated_tests:
- lines.append("• Power analysis appears robust to assumption violations")
- lines.append("• Original sample size should be sufficient")
+ if not vulnerable_tests and not inflated_tests:
+ lines.append("• Power analysis appears robust to assumption violations")
+ lines.append("• Original sample size should be sufficient")
return "\n".join(lines)
@@ -484,25 +495,22 @@ def _format_scenario_sample_size_short(self, scenarios: Dict, target_tests: List
"""Short scenario sample size summary."""
lines = [f"\n{'=' * 80}", "SCENARIO SUMMARY", f"{'=' * 80}"]
+ scenario_names = list(scenarios.keys())
if correction:
# Combined table with uncorrected and corrected
lines.append("\nSample Size Requirements:")
- headers = [
- "Test",
- "Opt(U)",
- "Opt(C)",
- "Real(U)",
- "Real(C)",
- "Doom(U)",
- "Doom(C)",
- ]
+ headers = ["Test"]
+ for name in scenario_names:
+ abbrev = name[:4].title()
+ headers.extend([f"{abbrev}(U)", f"{abbrev}(C)"])
+ col_widths = [40] + [8] * (len(scenario_names) * 2)
rows = []
for test in target_tests:
- row = [test[:40]] # Truncate to 40 chars
+ row = [test[:40]]
- for scenario in ["optimistic", "realistic", "doomer"]:
+ for scenario in scenario_names:
if scenario in scenarios and "results" in scenarios[scenario]:
n_uncorr = scenarios[scenario]["results"]["first_achieved"][test]
n_corr = scenarios[scenario]["results"]["first_achieved_corrected"][test]
@@ -520,16 +528,17 @@ def _format_scenario_sample_size_short(self, scenarios: Dict, target_tests: List
row.extend(["N/A", "N/A"])
rows.append(row)
- lines.append(self._table._create_table(headers, rows, [40, 8, 8, 8, 8, 8, 8]))
+ lines.append(self._table._create_table(headers, rows, col_widths))
lines.append("Note: (U) = Uncorrected, (C) = Corrected")
else:
# Uncorrected only
- headers = ["Test", "Optimistic", "Realistic", "Doomer"]
+ headers = ["Test"] + [name.title() for name in scenario_names]
+ col_widths = [40] + [12] * len(scenario_names)
rows = []
for test in target_tests:
- row = [test[:40]] # Truncate to 40 chars
- for scenario in ["optimistic", "realistic", "doomer"]:
+ row = [test[:40]]
+ for scenario in scenario_names:
if scenario in scenarios and "results" in scenarios[scenario]:
n_required = scenarios[scenario]["results"]["first_achieved"][test]
if n_required > 0:
@@ -542,7 +551,7 @@ def _format_scenario_sample_size_short(self, scenarios: Dict, target_tests: List
rows.append(row)
lines.append("\nUncorrected Sample Sizes:")
- lines.append(self._table._create_table(headers, rows, [40, 12, 12, 12]))
+ lines.append(self._table._create_table(headers, rows, col_widths))
lines.append(f"{'=' * 80}")
@@ -562,39 +571,33 @@ def _format_scenario_sample_size_long(
# 1. Overall summary
lines.append(self._format_scenario_sample_size_short(scenarios, target_tests, correction))
- # 2. Recommendations
+ # 2. Recommendations — summarize max N per non-optimistic scenario
lines.append(f"\n{'=' * 80}")
lines.append("RECOMMENDATIONS")
lines.append(f"{'=' * 80}")
- # Calculate max required N across scenarios
- max_n_realistic = max(
- (scenarios.get("realistic", {}).get("results", {}).get("first_achieved", {}).get(test, 0) for test in target_tests),
- default=0,
- )
- max_n_doomer = max(
- (scenarios.get("doomer", {}).get("results", {}).get("first_achieved", {}).get(test, 0) for test in target_tests),
- default=0,
- )
-
- max_tested = scenarios.get("realistic", {}).get("model", {}).get("sample_size_range", {}).get("to_size", 200)
-
- if max_n_realistic > 0 and max_n_realistic <= max_tested:
- lines.append(f"• For robust power under realistic conditions: N = {max_n_realistic}")
- elif max_n_realistic <= 0:
- lines.append(f"• For robust power under realistic conditions: N > {max_tested}")
-
- if max_n_doomer > 0 and max_n_doomer <= max_tested:
- lines.append(f"• For power under worst-case conditions: N = {max_n_doomer}")
- elif max_n_doomer <= 0:
- lines.append(f"• For power under worst-case conditions: N > {max_tested}")
-
- # Check if any tests couldn't achieve power
- unachievable = [
- test for test in target_tests if scenarios.get("doomer", {}).get("results", {}).get("first_achieved", {}).get(test, -1) <= 0
- ]
- if unachievable:
- lines.append(f"• Warning: These tests may not achieve target power under adverse conditions: {', '.join(unachievable)}")
+ other_scenarios = [s for s in scenarios if s != "optimistic"]
+ for scenario in other_scenarios:
+ max_n = max(
+ (scenarios.get(scenario, {}).get("results", {}).get("first_achieved", {}).get(test, 0) for test in target_tests),
+ default=0,
+ )
+ max_tested = scenarios.get(scenario, {}).get("model", {}).get("sample_size_range", {}).get("to_size", 200)
+ label = scenario.title()
+
+ if max_n > 0 and max_n <= max_tested:
+ lines.append(f"• For power under {label} conditions: N = {max_n}")
+ elif max_n <= 0:
+ lines.append(f"• For power under {label} conditions: N > {max_tested}")
+
+ # Check unachievable across worst scenario (last non-optimistic)
+ if other_scenarios:
+ worst = other_scenarios[-1]
+ unachievable = [
+ test for test in target_tests if scenarios.get(worst, {}).get("results", {}).get("first_achieved", {}).get(test, -1) <= 0
+ ]
+ if unachievable:
+ lines.append(f"• Warning: These tests may not achieve target power under {worst} conditions: {', '.join(unachievable)}")
# Add cumulative probability analysis
cumulative_lines = self._format_cumulative_recommendations(data, is_scenario=True)
@@ -706,7 +709,7 @@ def _add_cumulative_sample_size_table(
# Filter out tests with NaN power (e.g. non-contrast tests under Tukey correction)
def _has_nan_power(t: str) -> bool:
vals = powers_by_test[t]
- return bool(vals and isinstance(vals[0], float) and math.isnan(vals[0]))
+ return bool(vals and _is_nan(vals[0]))
valid_tests = [t for t in target_tests if not _has_nan_power(t)]
if not valid_tests:
@@ -741,8 +744,6 @@ def _has_nan_power(t: str) -> bool:
else: # ≥k cases
# Approximate using independence assumption
prob_at_least_k = 0.0
- from itertools import combinations
-
# Sum over all ways to choose at least k tests
for num_sig in range(k, n_tests + 1):
for combo in combinations(range(n_tests), num_sig):
@@ -859,6 +860,12 @@ def _format_cumulative_recommendations(self, results: Dict, is_scenario: bool =
if prob >= target_power:
min_n_target = sample_sizes[i]
break
+
+ if min_n_target:
+ lines.append(f"• N={min_n_target} for {target_power:.0f}% chance all tests significant")
+ else:
+ max_tested = sample_sizes[-1]
+ lines.append(f"• >{max_tested} needed for {target_power:.0f}% chance all tests significant")
return lines
diff --git a/mcpower/utils/parsers.py b/mcpower/utils/parsers.py
index e89d140..c14533f 100644
--- a/mcpower/utils/parsers.py
+++ b/mcpower/utils/parsers.py
@@ -105,6 +105,8 @@ def _split_assignments(self, input_string: str) -> List[str]:
paren_count += 1
elif char == ")":
paren_count -= 1
+ if paren_count < 0:
+ raise ValueError("Unbalanced parentheses: unexpected ')'")
current.append(char)
if current:
@@ -424,7 +426,11 @@ def _parse_independent_variables(formula: str) -> Tuple[Dict, Dict]:
"""
from itertools import combinations
- terms = re.split(r"[+\-]", formula)
+ # Check for minus sign (term removal) which is not supported
+    if "-" in formula:
+        raise ValueError("Term removal with '-' is not supported; use only '+' between terms")
+    terms = re.split(r"\+", formula)
+
+
+def _parse_test_formula(test_formula: str, registry) -> Tuple[List[str], List[Dict]]:
+ """Extract effect names from a test formula, matched against the registry.
+
+ Parses the test formula, expands factor variables to their dummies,
+ and returns the list of effect names (in registry order) that belong
+ to the test formula.
+
+ Args:
+ test_formula: Formula string (e.g. ``"y ~ x1 + x2"``).
+ registry: ``VariableRegistry`` instance.
+
+ Returns:
+ Tuple of ``(effect_names, random_effects)`` where *effect_names*
+ are the registry effect names present in the test formula (in
+ registry order), and *random_effects* is the list of parsed
+ random-effect dicts from the test formula.
+ """
+ _dep_var, fixed_formula, random_effects = _parse_equation(test_formula)
+
+ # Parse fixed effects into a set of term names
+ test_terms = _parse_fixed_terms(fixed_formula)
+
+ # Determine which registry effects belong to the test formula
+ cluster_effects = set(registry.cluster_effect_names)
+ test_effects: List[str] = []
+
+ for effect_name in registry._effects:
+ if effect_name in cluster_effects:
+ continue
+
+ effect = registry._effects[effect_name]
+
+ if effect.effect_type == "main":
+ # Direct match (continuous or interaction-less variable)
+ if effect_name in test_terms:
+ test_effects.append(effect_name)
+ elif effect_name in registry._factor_dummies:
+ # Factor dummy -- include if parent factor is in test terms
+ parent_factor = registry._factor_dummies[effect_name]["factor_name"]
+ if parent_factor in test_terms:
+ test_effects.append(effect_name)
+ else:
+ # Interaction -- check if the interaction term is in test terms
+ if effect_name in test_terms:
+ test_effects.append(effect_name)
+
+ return test_effects, random_effects
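+# Illustrative result (hypothetical registry built from "y = x1 + x2 + x3"): a test formula
+# of "y ~ x1 + x2" yields the effect names ["x1", "x2"] and an empty random-effects list.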
+
+
+def _parse_fixed_terms(fixed_formula: str) -> Set[str]:
+ """Parse a fixed-effect formula string into a set of term names.
+
+ Handles ``+`` for additive terms, ``:`` for specific interactions,
+ and ``*`` for full factorial expansion (main effects plus all
+ two-way through n-way interactions).
+
+ Args:
+ fixed_formula: Right-hand side of the equation, spaces already
+ stripped by ``_parse_equation`` (e.g. ``"x1+x2+x1:x2"``).
+
+ Returns:
+ Set of term names (variable names and interaction terms like
+ ``"x1:x2"``).
+ """
+ if not fixed_formula.strip():
+ return set()
+
+ terms: Set[str] = set()
+ raw_terms = re.split(r"\+", fixed_formula)
+
+ for raw in raw_terms:
+ raw = raw.strip()
+ if not raw:
+ continue
+
+ if "*" in raw:
+ # Full factorial: x1*x2 -> x1, x2, x1:x2
+ vars_in_star = [v.strip() for v in raw.split("*") if v.strip()]
+ for v in vars_in_star:
+ terms.add(v)
+ for r in range(2, len(vars_in_star) + 1):
+ for combo in combinations(vars_in_star, r):
+ terms.add(":".join(combo))
+ else:
+ # Plain term (may contain ":" for explicit interaction)
+ terms.add(raw)
+
+ return terms
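+# Illustrative expansions: "x1+x2+x1:x2" -> {"x1", "x2", "x1:x2"};
+# "a*b*c" -> {"a", "b", "c", "a:b", "a:c", "b:c", "a:b:c"}.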
+
+
+def _compute_test_column_indices(
+ all_effect_names: List[str],
+ test_effect_names: List[str],
+) -> np.ndarray:
+ """Compute column indices in X_expanded for test formula effects.
+
+ Args:
+ all_effect_names: All non-cluster effect names in registry order.
+ test_effect_names: Effect names present in the test formula
+ (a subset of *all_effect_names*).
+
+ Returns:
+ Integer array of column indices into X_expanded.
+ """
+ test_set = set(test_effect_names)
+ indices = [i for i, name in enumerate(all_effect_names) if name in test_set]
+ return np.array(indices, dtype=np.int64)
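+# e.g. all_effect_names=["x1", "x2", "x1:x2"] with test_effect_names=["x1", "x1:x2"] -> array([0, 2]).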
+
+
+def _remap_target_indices(
+ original_target_indices: np.ndarray,
+ test_column_indices: np.ndarray,
+) -> np.ndarray:
+ """Remap target indices from full X_expanded space to X_test space.
+
+ Args:
+ original_target_indices: Indices in X_expanded being tested.
+ test_column_indices: Columns of X_expanded included in X_test.
+
+ Returns:
+ Indices remapped to positions within X_test.
+ """
+ # Build mapping: full_index -> position in X_test
+ index_map = {int(full_idx): test_idx for test_idx, full_idx in enumerate(test_column_indices)}
+ return np.array(
+ [index_map[int(idx)] for idx in original_target_indices],
+ dtype=np.int64,
+ )
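+# e.g. test_column_indices=[0, 2, 5] with original_target_indices=[2, 5] -> array([1, 2]),
+# i.e. the positions of those columns within X_test.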
diff --git a/mcpower/utils/updates.py b/mcpower/utils/updates.py
index 8c3c76b..c7f57a5 100644
--- a/mcpower/utils/updates.py
+++ b/mcpower/utils/updates.py
@@ -12,6 +12,8 @@
from datetime import datetime, timedelta
from pathlib import Path
+_already_checked = False
+
def _check_for_updates(current_version):
"""Check PyPI weekly for a newer MCPower version and warn if found.
@@ -20,18 +22,24 @@ def _check_for_updates(current_version):
silently in worker processes (detected via environment variable)
and in frozen (PyInstaller) bundles where pip is unavailable.
"""
+ global _already_checked
# Skip in frozen bundles (PyInstaller) — the GUI has its own update checker
if getattr(sys, "frozen", False):
return
+ # Skip if already checked in this process
+ if _already_checked:
+ return
+
# Skip in worker processes (loky/joblib inherit env vars from parent)
if os.environ.get("_MCPOWER_UPDATE_CHECKED"):
return
os.environ["_MCPOWER_UPDATE_CHECKED"] = "1"
+ _already_checked = True
- cache_path = Path(__file__).parent.parent / ".mcpower_cache.json"
- cache_path.parent.mkdir(exist_ok=True)
+ cache_path = Path.home() / ".cache" / "mcpower" / "update_cache.json"
+ cache_path.parent.mkdir(parents=True, exist_ok=True)
# Load cache
cache = {}
@@ -57,9 +65,8 @@ def _check_for_updates(current_version):
# Show update message only when PyPI version is strictly newer
latest = cache.get("latest_version")
- current = cache.get("current_version")
- if latest and current and _is_newer(latest, current):
- msg = f"\nNEW MCPower VERSION AVAILABLE: {latest} (you have {current})\nUpdate now: pip install --upgrade MCPower\n"
+ if latest and _is_newer(latest, current_version):
+ msg = f"\nNEW MCPower VERSION AVAILABLE: {latest} (you have {current_version})\nUpdate now: pip install --upgrade MCPower\n"
warnings.warn(msg, stacklevel=3)
@@ -77,7 +84,10 @@ def _get_latest_version():
"""Fetch the latest MCPower version string from the PyPI JSON API."""
try:
with urllib.request.urlopen("https://pypi.org/pypi/MCPower/json", timeout=5) as response:
- data = json.loads(response.read())
+ raw = response.read(1_000_000)
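+            # Defensive cap (rationale note): an unexpectedly large payload (>= 1 MB) is
+            # treated as suspect and skipped rather than parsed.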
+ if len(raw) >= 1_000_000:
+ return None
+ data = json.loads(raw)
return data["info"]["version"]
except Exception:
return None
diff --git a/mcpower/utils/validators.py b/mcpower/utils/validators.py
index 5853af6..1c344fd 100644
--- a/mcpower/utils/validators.py
+++ b/mcpower/utils/validators.py
@@ -27,6 +27,11 @@ class _ValidationResult:
errors: List[str]
warnings: List[str]
+ @classmethod
+ def from_errors(cls, errors: List[str], warnings: Optional[List[str]] = None) -> "_ValidationResult":
+ """Create a result from error/warning lists, deriving ``is_valid`` automatically."""
+ return cls(len(errors) == 0, errors, warnings or [])
+
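+    # e.g. from_errors([]) -> is_valid=True; from_errors(["alpha out of range"]) -> is_valid=False
+    # with that message recorded in errors.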
def raise_if_invalid(self):
"""Raise ``ValueError`` if the validation failed."""
if not self.is_valid:
@@ -88,12 +93,12 @@ def _validate_numeric_parameter(
errors.append(range_error)
# Rounding warning for floats when int expected
- if allow_rounding and isinstance(value, float) and (int, float) in expected_types:
+ if allow_rounding and isinstance(value, float) and int in expected_types:
rounded = int(round(value))
if value != rounded:
warnings.append(f"{name} rounded from {value} to {rounded}")
- return _ValidationResult(len(errors) == 0, errors, warnings)
+ return _ValidationResult.from_errors(errors, warnings)
def _validate_power(power: Any) -> _ValidationResult:
@@ -112,6 +117,8 @@ def _validate_simulations(n_simulations: Any) -> Tuple[int, _ValidationResult]:
if result.is_valid:
rounded = int(round(n_simulations))
+        # 800 simulations threshold: at n = 800 the Monte Carlo standard error of an
+        # estimated power near 50% is sqrt(0.5 * 0.5 / 800), roughly 1.8 percentage points,
+        # and it grows quickly below that, reducing result reliability.
if rounded < 800:
result.warnings.append(f"Low simulation count ({rounded}). Consider using at least 1000 for reliable results.")
return rounded, result
@@ -139,7 +146,7 @@ def _validate_sample_size(sample_size: Any) -> _ValidationResult:
f"sample_size too large ({sample_size:,}). Maximum recommended: 100,000. We cannot guarantee stability for such small p-values."
)
- return _ValidationResult(len(errors) == 0, errors, [])
+ return _ValidationResult.from_errors(errors)
def _validate_sample_size_for_model(sample_size: int, n_variables: int) -> _ValidationResult:
@@ -157,6 +164,8 @@ def _validate_sample_size_for_model(sample_size: int, n_variables: int) -> _Vali
_ValidationResult with errors if sample size is insufficient.
"""
errors = []
+    # Rule-of-thumb minimum: N >= 15 + p, where p is the number of predictors
+    # (design matrix columns), as a hard floor for fitting the regression at all.
min_required = 15 + n_variables
if sample_size < min_required:
@@ -165,7 +174,7 @@ def _validate_sample_size_for_model(sample_size: int, n_variables: int) -> _Vali
f"variables. Minimum required: {min_required} (15 + {n_variables} variables)."
)
- return _ValidationResult(len(errors) == 0, errors, [])
+ return _ValidationResult.from_errors(errors)
def _validate_sample_size_range(from_size: Any, to_size: Any, by: Any) -> _ValidationResult:
@@ -193,7 +202,7 @@ def _validate_sample_size_range(from_size: Any, to_size: Any, by: Any) -> _Valid
if n_tests > 100:
warnings.append(f"Large number of sample sizes to test ({n_tests}). This may take significant time.")
- return _ValidationResult(len(errors) == 0, errors, warnings)
+ return _ValidationResult.from_errors(errors, warnings)
def _validate_correlation_matrix(
@@ -226,12 +235,14 @@ def _validate_correlation_matrix(
# Positive semi-definite check
try:
eigenvals = np.linalg.eigvals(corr_matrix)
+ # -1e-8 tolerance for positive semi-definiteness: allows small negative
+ # eigenvalues from floating-point rounding in correlation matrices.
if np.any(eigenvals < -1e-8): # Tolerance for floating point noise
errors.append("Correlation matrix must be positive semi-definite. ")
except np.linalg.LinAlgError:
errors.append("Cannot compute eigenvalues of correlation matrix")
- return _ValidationResult(len(errors) == 0, errors, [])
+ return _ValidationResult.from_errors(errors)
def _validate_correction_method(correction: Optional[str]) -> _ValidationResult:
@@ -285,7 +296,7 @@ def _validate_parallel_settings(enable: Any, n_cores: Optional[int]) -> Tuple[Tu
else:
validated_n_cores = min(n_cores, max_cores)
- return (enable, validated_n_cores), _ValidationResult(len(errors) == 0, errors, [])
+ return (enable, validated_n_cores), _ValidationResult.from_errors(errors)
def _validate_model_ready(model) -> _ValidationResult:
@@ -301,9 +312,10 @@ def _validate_model_ready(model) -> _ValidationResult:
errors: List[str] = []
warnings: List[str] = []
- # Check effect sizes - check if pending effects were set
- has_effects = hasattr(model, "_pending_effects") and model._pending_effects is not None
- if not has_effects:
+ # Check effect sizes — pending (pre-apply) or flagged as set by user
+ has_pending = hasattr(model, "_pending_effects") and model._pending_effects is not None
+ has_set = hasattr(model, "_effects_set") and model._effects_set
+ if not has_pending and not has_set:
if hasattr(model, "_registry"):
available = model._registry.effect_names
errors.append(
@@ -318,7 +330,7 @@ def _validate_model_ready(model) -> _ValidationResult:
if not hasattr(model, attr):
errors.append(f"Model missing required attribute: {attr}")
- return _ValidationResult(len(errors) == 0, errors, warnings)
+ return _ValidationResult.from_errors(errors, warnings)
def _validate_test_formula(test_formula: str, available_variables: List[str]) -> _ValidationResult:
@@ -361,7 +373,7 @@ def _validate_test_formula(test_formula: str, available_variables: List[str]) ->
f"Variables not found in original model: {', '.join(sorted(missing_vars))}. Available: {', '.join(available_variables)}"
)
- return _ValidationResult(len(errors) == 0, errors, [])
+ return _ValidationResult.from_errors(errors)
except Exception as e:
errors.append(f"Error parsing test_formula: {str(e)}")
@@ -399,6 +411,8 @@ def _validate_factor_specification(n_levels: int, proportions: List[float]) -> _
# Check if they sum to approximately 1
if not errors: # Only if no errors with individual proportions
total = sum(proportions)
+ # 1e-6 tolerance: proportions are normalized later, so small deviations
+ # from 1.0 are acceptable and only warrant a warning.
if abs(total - 1.0) > 1e-6:
warnings.append(f"Proportions sum to {total:.4f}, not 1.0 (will be normalized)")
@@ -406,7 +420,7 @@ def _validate_factor_specification(n_levels: int, proportions: List[float]) -> _
if n_levels > 10:
warnings.append(f"Factor has {n_levels} levels. This creates {n_levels - 1} dummy variables, which may require large sample sizes")
- return _ValidationResult(len(errors) == 0, errors, warnings)
+ return _ValidationResult.from_errors(errors, warnings)
def _validate_upload_data(data: np.ndarray) -> _ValidationResult:
@@ -425,7 +439,7 @@ def _validate_upload_data(data: np.ndarray) -> _ValidationResult:
if data.shape[0] < 25:
errors.append(f"Need at least 25 samples for reliable quantile matching, got {data.shape[0]}")
- return _ValidationResult(len(errors) == 0, errors, [])
+ return _ValidationResult.from_errors(errors)
def _validate_cluster_config(
@@ -475,7 +489,7 @@ def _validate_cluster_config(
if not isinstance(cluster_size, int) or cluster_size < 5:
errors.append(f"cluster_size must be an integer >= 5 for reliable mixed model estimation. Got {cluster_size}.")
- return _ValidationResult(len(errors) == 0, errors, warnings)
+ return _ValidationResult.from_errors(errors, warnings)
def _validate_cluster_sample_size(
@@ -517,4 +531,4 @@ def _validate_cluster_sample_size(
f"Small cluster sizes may cause convergence issues or biased variance estimates."
)
- return _ValidationResult(len(errors) == 0, errors, warnings)
+ return _ValidationResult.from_errors(errors, warnings)
diff --git a/mcpower/utils/visualization.py b/mcpower/utils/visualization.py
index 22544a2..ab7eadd 100644
--- a/mcpower/utils/visualization.py
+++ b/mcpower/utils/visualization.py
@@ -18,6 +18,7 @@ def _create_power_plot(
target_tests: List[str],
target_power: float,
title: str,
+ show: bool = True,
):
"""Create a sample-size vs. power line plot with achievement markers.
@@ -58,7 +59,7 @@ def _create_power_plot(
)
# Mark achievement point
- if first_achieved[test] > 0:
+ if first_achieved[test] > 0 and first_achieved[test] in sample_sizes:
achieved_idx = sample_sizes.index(first_achieved[test])
achieved_power = powers[achieved_idx]
ax.plot(
@@ -112,4 +113,6 @@ def _create_power_plot(
color="#888888",
)
plt.tight_layout(rect=(0, 0.03, 1, 1))
- plt.show()
+ if show:
+ plt.show()
+ return fig
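+# Programmatic-use sketch: fig = _create_power_plot(..., show=False); fig.savefig("power.png")
+# keeps the figure without opening a window.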
diff --git a/pyproject.toml b/pyproject.toml
index 983e3d3..ec1f41f 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,14 +1,14 @@
[build-system]
requires = [
- "scikit-build-core>=0.5",
+ "scikit-build-core>=0.10",
"pybind11>=2.11",
- "numpy>=2.0.0",
+ "numpy>=1.26.0",
]
build-backend = "scikit_build_core.build"
[project]
name = "MCPower"
-version = "0.5.4"
+version = "0.6.0"
description = "Monte Carlo Power Analysis for Statistical Models"
readme = "README.md"
license = {text = "GPL-3.0-or-later"}
@@ -31,9 +31,10 @@ classifiers = [
]
requires-python = ">=3.10"
dependencies = [
- "numpy>=2.0.0",
+ "numpy>=1.26.0",
"matplotlib>=3.8.0",
"joblib>=1.3.0",
+ "tqdm>=4.60.0",
]
[project.optional-dependencies]
@@ -41,6 +42,7 @@ lme = ["statsmodels>=0.14.0"]
pandas = ["pandas>=2.0.0"]
dev = [
"pandas>=2.0.0",
+ "statsmodels>=0.14.0",
"pytest>=7.0.0",
"pytest-cov>=4.0.0",
"scipy>=1.11.0",
@@ -52,14 +54,14 @@ dev = [
]
all = [
"pandas>=2.0.0",
- "statsmodels>=0.14.0",
-]
+]
[project.urls]
Homepage = "https://github.com/pawlenartowicz/MCPower"
-Documentation = "https://github.com/pawlenartowicz/MCPower#readme"
+Documentation = "https://github.com/pawlenartowicz/MCPower/wiki"
Repository = "https://github.com/pawlenartowicz/MCPower"
Issues = "https://github.com/pawlenartowicz/MCPower/issues"
+Changelog = "https://github.com/pawlenartowicz/MCPower/blob/main/CHANGELOG.md"
[tool.scikit-build]
wheel.packages = ["mcpower"]
@@ -77,8 +79,6 @@ python_files = ["test_*.py"]
python_classes = ["Test*"]
python_functions = ["test_*"]
markers = [
- "unit: Unit tests",
- "integration: Integration tests",
"lme: LME mixed-effects model tests",
]
addopts = "-v --tb=short --strict-markers"
@@ -86,7 +86,6 @@ filterwarnings = [
"ignore::FutureWarning",
"ignore::DeprecationWarning",
"ignore::UserWarning:statsmodels",
- "ignore:Mixed-effects models are experimental:UserWarning",
]
[tool.ruff]
@@ -107,6 +106,20 @@ known-first-party = ["mcpower"]
python_version = "3.10"
warn_return_any = true
warn_unused_configs = true
-ignore_missing_imports = true
check_untyped_defs = true
exclude = ["build", "dist", "tests"]
+
+[[tool.mypy.overrides]]
+module = [
+ "mcpower_native",
+ "mcpower_native.*",
+ "statsmodels",
+ "statsmodels.*",
+ "tqdm",
+ "tqdm.*",
+ "joblib",
+ "joblib.*",
+ "pandas",
+ "pandas.*",
+]
+ignore_missing_imports = true
diff --git a/tests/config.py b/tests/config.py
index 63c1999..a8b874c 100644
--- a/tests/config.py
+++ b/tests/config.py
@@ -5,9 +5,21 @@
across the test suite.
"""
-# Monte Carlo simulation parameters
-N_SIMS = 5000
-"""Number of Monte Carlo simulations for power analysis tests."""
+# Monte Carlo simulation parameters — 4-tier ladder
+N_SIMS_CHECK = 50
+"""Smoke tests — just verify no crash, structure, API contract."""
+
+N_SIMS_ORDERING = 1000
+"""Ordering tests — monotonicity, correction hierarchy, A < B checks."""
+
+N_SIMS_STANDARD = 1600
+"""Standard tests — null calibration, Type I error, general validation."""
+
+N_SIMS_ACCURACY = 5000
+"""Accuracy tests — comparison against analytical power formulas."""
+
+N_SIMS = N_SIMS_ACCURACY
+"""Backward-compat alias for accuracy-level simulations."""
SEED = 2137
"""Default random seed for reproducibility."""
diff --git a/tests/conftest.py b/tests/conftest.py
index 93c8814..d2100b5 100644
--- a/tests/conftest.py
+++ b/tests/conftest.py
@@ -63,12 +63,6 @@ def correlation_matrix_2x2():
return np.array([[1.0, 0.5], [0.5, 1.0]])
-@pytest.fixture
-def correlation_matrix_3x3():
- """Create a 3x3 correlation matrix."""
- return np.array([[1.0, 0.3, 0.2], [0.3, 1.0, 0.4], [0.2, 0.4, 1.0]])
-
-
@pytest.fixture
def sample_data():
"""Create sample empirical data."""
@@ -80,41 +74,13 @@ def sample_data():
@pytest.fixture
-def suppress_output(capsys):
- """Suppress print output during tests by capturing it."""
- yield
- # Output is automatically captured by capsys
-
-
-BACKENDS = ["c++"]
-
-
-@pytest.fixture(params=BACKENDS)
-def backend(request):
- """
- Force MCPower to run on a specific backend.
-
- Parametrizes tests against C++ (primary backend).
- Automatically resets backend after each test.
- """
- from mcpower.backends import reset_backend, set_backend
-
- set_backend(request.param)
- yield request.param
- reset_backend()
-
-
-@pytest.fixture(autouse=True)
-def reset_backend_after_test():
- """
- Automatically reset backend to default after every test.
-
- Ensures no hidden backend state leaks between tests.
- """
- yield
- from mcpower.backends import reset_backend
-    reset_backend()
+def suppress_output():
+    """Suppress print output during tests."""
+    import contextlib
+    import io
+    with contextlib.redirect_stdout(io.StringIO()):
+        yield
def _statsmodels_available():
diff --git a/tests/helpers/power_helpers.py b/tests/helpers/power_helpers.py
index d509a79..620e995 100644
--- a/tests/helpers/power_helpers.py
+++ b/tests/helpers/power_helpers.py
@@ -40,42 +40,3 @@ def compute_crits(X, target_indices, alpha=DEFAULT_ALPHA, correction_method=0):
return compute_critical_values(alpha, p, dof, n_targets, correction_method)
-def run_with_backend(
- backend_name,
- equation,
- effects_str,
- sample_size,
- n_sims,
- seed,
- target_test="all",
- correction=None,
- correlations_str=None,
- alpha=DEFAULT_ALPHA,
-):
- """Run a full MCPower power analysis with a specific backend forced."""
- import contextlib
- import io
-
- from mcpower import MCPower
- from mcpower.backends import reset_backend, set_backend
-
- set_backend(backend_name)
- try:
- m = MCPower(equation)
- m.set_simulations(n_sims)
- m.set_seed(seed)
- m.set_alpha(alpha)
- m.set_effects(effects_str)
- if correlations_str:
- m.set_correlations(correlations_str)
- with contextlib.redirect_stdout(io.StringIO()):
- result = m.find_power(
- sample_size=sample_size,
- target_test=target_test,
- correction=correction,
- print_results=False,
- return_results=True,
- )
- finally:
- reset_backend()
- return result
diff --git a/tests/integration/test_find_power_api.py b/tests/integration/test_find_power_api.py
index 69f6bc5..fe56de3 100644
--- a/tests/integration/test_find_power_api.py
+++ b/tests/integration/test_find_power_api.py
@@ -222,14 +222,14 @@ def test_all_targets(self, suppress_output):
class TestHeterogeneity:
- """Test heterogeneity settings."""
+ """Test heterogeneity via scenario configs."""
def test_with_heterogeneity(self, suppress_output):
from mcpower import MCPower
model = MCPower("y = x1 + x2")
model.set_effects("x1=0.3, x2=0.2")
- model.set_heterogeneity(0.1)
+ model.set_scenario_configs({"het": {"heterogeneity": 0.1}})
result = model.find_power(100, print_results=False, return_results=True)
assert result is not None
@@ -239,7 +239,7 @@ def test_with_heteroskedasticity(self, suppress_output):
model = MCPower("y = x1 + x2")
model.set_effects("x1=0.3, x2=0.2")
- model.set_heteroskedasticity(0.2)
+ model.set_scenario_configs({"hsked": {"heteroskedasticity": 0.2}})
result = model.find_power(100, print_results=False, return_results=True)
assert result is not None
@@ -249,8 +249,7 @@ def test_combined(self, suppress_output):
model = MCPower("y = x1 + x2")
model.set_effects("x1=0.3, x2=0.2")
- model.set_heterogeneity(0.1)
- model.set_heteroskedasticity(0.2)
+ model.set_scenario_configs({"combo": {"heterogeneity": 0.1, "heteroskedasticity": 0.2}})
result = model.find_power(100, print_results=False, return_results=True)
assert result is not None
@@ -330,7 +329,7 @@ def test_all_features_combined(self, suppress_output):
model.upload_data({"x1": np.random.exponential(2, 100)})
model.set_correlations("(x1,x2)=0.3")
model.set_effects("group[2]=0.4, group[3]=0.3, x1=0.2, x2=0.15, x1:x2=0.1")
- model.set_heterogeneity(0.05)
+ model.set_scenario_configs({"test": {"heterogeneity": 0.05}})
result = model.find_power(200, print_results=False, return_results=True)
assert result is not None
diff --git a/tests/integration/test_model.py b/tests/integration/test_model.py
index da00d33..a57b051 100644
--- a/tests/integration/test_model.py
+++ b/tests/integration/test_model.py
@@ -86,16 +86,6 @@ def test_set_variable_type(self, suppress_output):
assert model._pending_variable_types == "group=(factor,3)"
assert model._applied is False
- def test_set_heterogeneity(self, simple_model):
- simple_model.set_heterogeneity(0.1)
- assert simple_model._pending_heterogeneity == 0.1
- assert simple_model._applied is False
-
- def test_set_heteroskedasticity(self, simple_model):
- simple_model.set_heteroskedasticity(0.2)
- assert simple_model._pending_heteroskedasticity == 0.2
- assert simple_model._applied is False
-
def test_upload_data_dict(self, simple_model, sample_data):
simple_model.upload_data(sample_data)
assert simple_model._pending_data is not None
@@ -131,12 +121,12 @@ class TestApply:
"""Test apply() method."""
def test_apply_sets_flag(self, configured_model):
- configured_model.apply()
+ configured_model._apply()
assert configured_model._applied is True
def test_apply_processes_effects(self, simple_model):
simple_model.set_effects("x1=0.5, x2=0.3")
- simple_model.apply()
+ simple_model._apply()
effect_sizes = simple_model._registry.get_effect_sizes()
assert effect_sizes[0] == 0.5
assert effect_sizes[1] == 0.3
@@ -147,22 +137,17 @@ def test_apply_processes_variable_types(self, suppress_output):
model = MCPower("y = group + x1")
model.set_variable_type("group=(factor,3)")
model.set_effects("group[2]=0.4, group[3]=0.3, x1=0.2")
- model.apply()
+ model._apply()
assert len(model._registry.factor_names) == 1
assert len(model._registry.dummy_names) == 2
def test_apply_processes_correlations(self, simple_model):
simple_model.set_effects("x1=0.3, x2=0.2")
simple_model.set_correlations("(x1,x2)=0.5")
- simple_model.apply()
+ simple_model._apply()
corr = simple_model.correlation_matrix
assert corr[0, 1] == 0.5
- def test_apply_processes_heterogeneity(self, configured_model):
- configured_model.set_heterogeneity(0.15)
- configured_model.apply()
- assert configured_model.heterogeneity == 0.15
-
def test_apply_order_independence(self, suppress_output):
"""Test that set_* methods can be called in any order."""
from mcpower import MCPower
@@ -172,14 +157,14 @@ def test_apply_order_independence(self, suppress_output):
m1.set_effects("group[2]=0.4, group[3]=0.3, x1=0.2, x2=0.1")
m1.set_variable_type("group=(factor,3)")
m1.set_correlations("(x1,x2)=0.5")
- m1.apply()
+ m1._apply()
# Order 2: variable_type, correlations, effects
m2 = MCPower("y = group + x1 + x2")
m2.set_variable_type("group=(factor,3)")
m2.set_correlations("(x1,x2)=0.5")
m2.set_effects("group[2]=0.4, group[3]=0.3, x1=0.2, x2=0.1")
- m2.apply()
+ m2._apply()
# Both should have same effect sizes
assert np.allclose(m1._registry.get_effect_sizes(), m2._registry.get_effect_sizes())
@@ -242,7 +227,7 @@ def test_sample_sizes_tested(self, configured_model):
assert result["results"]["sample_sizes_tested"] == [50, 75, 100]
def test_first_achieved(self, configured_model):
- result = configured_model.find_sample_size(from_size=50, to_size=200, by=25, print_results=False, return_results=True)
+ result = configured_model.find_sample_size(from_size=50, to_size=200, by=50, print_results=False, return_results=True)
assert "first_achieved" in result["results"]
def test_find_sample_size_runs(self, configured_model):
@@ -257,7 +242,7 @@ class TestErrors:
def test_invalid_effect_name(self, simple_model):
simple_model.set_effects("invalid=0.3")
with pytest.raises(ValueError, match="not found"):
- simple_model.apply()
+ simple_model._apply()
def test_missing_effects(self, simple_model):
with pytest.raises(ValueError, match="Effect sizes must be set"):
@@ -290,7 +275,7 @@ def test_basic_named_levels(self):
model = MCPower("y = treatment + x1")
model.set_factor_levels("treatment=placebo,drug_a,drug_b")
model.set_effects("treatment[drug_a]=0.5, treatment[drug_b]=0.8, x1=0.3")
- model.apply()
+ model._apply()
assert "treatment" in model._registry.factor_names
assert "treatment[drug_a]" in model._registry.dummy_names
assert "treatment[drug_b]" in model._registry.dummy_names
@@ -302,7 +287,7 @@ def test_multiple_factors(self):
model = MCPower("y = group + dose")
model.set_factor_levels("group=control,treatment; dose=low,medium,high")
model.set_effects("group[treatment]=0.5, dose[medium]=0.3, dose[high]=0.6")
- model.apply()
+ model._apply()
assert "group[treatment]" in model._registry.dummy_names
assert "dose[medium]" in model._registry.dummy_names
assert "dose[high]" in model._registry.dummy_names
@@ -313,7 +298,7 @@ def test_unknown_variable_raises(self):
model = MCPower("y = x1")
with pytest.raises(ValueError, match="not found"):
model.set_factor_levels("unknown=a,b,c")
- model.apply()
+ model._apply()
def test_single_level_raises(self):
from mcpower import MCPower
@@ -321,7 +306,7 @@ def test_single_level_raises(self):
model = MCPower("y = x1")
with pytest.raises(ValueError, match="at least 2"):
model.set_factor_levels("x1=only_one")
- model.apply()
+ model._apply()
def test_find_power_with_named_levels(self):
"""End-to-end: find_power works with set_factor_levels."""
diff --git a/tests/integration/test_parallel.py b/tests/integration/test_parallel.py
index 682372c..2048d7e 100644
--- a/tests/integration/test_parallel.py
+++ b/tests/integration/test_parallel.py
@@ -4,6 +4,8 @@
import pytest
+from tests.config import N_SIMS_CHECK
+
def _joblib_available():
"""Check if joblib is available."""
@@ -26,6 +28,7 @@ def test_parallel_results_match_sequential(self, suppress_output):
model = MCPower("y = x1 + x2")
model.set_effects("x1=0.3, x2=0.2")
model.set_seed(42)
+ model.set_simulations(N_SIMS_CHECK)
# Run sequential analysis
model.set_parallel(False)
@@ -56,6 +59,7 @@ def test_parallel_with_scenarios(self, suppress_output):
model = MCPower("y = x1 + x2")
model.set_effects("x1=0.3, x2=0.2")
model.set_seed(42)
+ model.set_simulations(N_SIMS_CHECK)
# Run sequential with scenarios
model.set_parallel(False)
@@ -93,6 +97,7 @@ def test_parallel_with_interactions(self, suppress_output):
model = MCPower("y = a + b + a:b")
model.set_effects("a=0.4, b=0.3, a:b=0.2")
model.set_seed(42)
+ model.set_simulations(N_SIMS_CHECK)
# Run sequential
model.set_parallel(False)
@@ -128,6 +133,7 @@ def test_parallel_fallback_on_failure(self, suppress_output, monkeypatch):
model = MCPower("y = x1 + x2")
model.set_effects("x1=0.3, x2=0.2")
model.set_seed(42)
+ model.set_simulations(N_SIMS_CHECK)
model.set_parallel(True, n_cores=2)
# Mock joblib.Parallel to raise an exception
@@ -157,6 +163,7 @@ def test_find_power_ignores_parallel(self, suppress_output):
model = MCPower("y = x1 + x2")
model.set_effects("x1=0.3, x2=0.2")
model.set_seed(42)
+ model.set_simulations(N_SIMS_CHECK)
# Run with parallel=False
model.set_parallel(False)
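Taken together, the `N_SIMS_CHECK` additions above follow one pattern: run the same seeded analysis sequentially and then in parallel, and compare. A condensed sketch assembled from these hunks (`N_SIMS_CHECK` comes from `tests.config` as imported above; the equality checks themselves live in the test bodies not shown in this diff):

```python
from mcpower import MCPower
from tests.config import N_SIMS_CHECK

model = MCPower("y = x1 + x2")
model.set_effects("x1=0.3, x2=0.2")
model.set_seed(42)
model.set_simulations(N_SIMS_CHECK)

# Sequential pass
model.set_parallel(False)
sequential = model.find_power(100, print_results=False, return_results=True)

# Parallel pass with the same seed; the tests expect matching results
model.set_parallel(True, n_cores=2)
parallel = model.find_power(100, print_results=False, return_results=True)
```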
diff --git a/tests/integration/test_posthoc_integration.py b/tests/integration/test_posthoc_integration.py
index 9499388..b6b7f21 100644
--- a/tests/integration/test_posthoc_integration.py
+++ b/tests/integration/test_posthoc_integration.py
@@ -15,7 +15,7 @@ def test_parse_vs_syntax(self, suppress_output):
model = MCPower("y = group + x1")
model.set_variable_type("group=(factor,3)")
model.set_effects("group[2]=0.4, group[3]=0.3, x1=0.2")
- model.apply()
+ model._apply()
tests = model._parse_target_tests("group[1] vs group[2]")
assert "group[1] vs group[2]" in tests
@@ -27,7 +27,7 @@ def test_parse_multiple_vs(self, suppress_output):
model = MCPower("y = group + x1")
model.set_variable_type("group=(factor,3)")
model.set_effects("group[2]=0.4, group[3]=0.3, x1=0.2")
- model.apply()
+ model._apply()
tests = model._parse_target_tests("group[1] vs group[2], group[2] vs group[3]")
assert "group[1] vs group[2]" in tests
@@ -40,7 +40,7 @@ def test_parse_mixed_regular_and_posthoc(self, suppress_output):
model = MCPower("y = group + x1")
model.set_variable_type("group=(factor,3)")
model.set_effects("group[2]=0.4, group[3]=0.3, x1=0.2")
- model.apply()
+ model._apply()
tests = model._parse_target_tests("overall, group[1] vs group[2]")
assert "overall" in tests
@@ -52,7 +52,7 @@ def test_all_does_not_include_posthoc(self, suppress_output):
model = MCPower("y = group + x1")
model.set_variable_type("group=(factor,3)")
model.set_effects("group[2]=0.4, group[3]=0.3, x1=0.2")
- model.apply()
+ model._apply()
tests = model._parse_target_tests("all")
# "all" should NOT include any post-hoc comparisons
@@ -66,7 +66,7 @@ def test_invalid_factor_name(self, suppress_output):
model = MCPower("y = group + x1")
model.set_variable_type("group=(factor,3)")
model.set_effects("group[2]=0.4, group[3]=0.3, x1=0.2")
- model.apply()
+ model._apply()
with pytest.raises(ValueError, match="Factor.*not found"):
model._parse_target_tests("notafactor[1] vs notafactor[2]")
@@ -77,7 +77,7 @@ def test_invalid_level(self, suppress_output):
model = MCPower("y = group + x1")
model.set_variable_type("group=(factor,3)")
model.set_effects("group[2]=0.4, group[3]=0.3, x1=0.2")
- model.apply()
+ model._apply()
with pytest.raises(ValueError, match="out of range"):
model._parse_target_tests("group[0] vs group[5]")
@@ -88,7 +88,7 @@ def test_same_level_comparison_rejected(self, suppress_output):
model = MCPower("y = group + x1")
model.set_variable_type("group=(factor,3)")
model.set_effects("group[2]=0.4, group[3]=0.3, x1=0.2")
- model.apply()
+ model._apply()
with pytest.raises(ValueError, match="Cannot compare a level to itself"):
model._parse_target_tests("group[2] vs group[2]")
@@ -99,7 +99,7 @@ def test_cross_factor_comparison_rejected(self, suppress_output):
model = MCPower("y = a + b")
model.set_variable_type("a=(factor,3), b=(factor,2)")
model.set_effects("a[2]=0.3, a[3]=0.2, b[2]=0.1")
- model.apply()
+ model._apply()
with pytest.raises(ValueError, match="same factor"):
model._parse_target_tests("a[1] vs b[1]")
@@ -396,7 +396,7 @@ def test_all_posthoc_keyword(self, suppress_output):
model = MCPower("y = group + x1")
model.set_variable_type("group=(factor,3)")
model.set_effects("group[2]=0.4, group[3]=0.3, x1=0.2")
- model.apply()
+ model._apply()
tests = model._parse_target_tests("all-posthoc")
# 3-level factor → C(3,2) = 3 pairs
@@ -415,7 +415,7 @@ def test_all_plus_all_posthoc(self, suppress_output):
model = MCPower("y = group + x1")
model.set_variable_type("group=(factor,3)")
model.set_effects("group[2]=0.4, group[3]=0.3, x1=0.2")
- model.apply()
+ model._apply()
tests = model._parse_target_tests("all, all-posthoc")
# "all" → overall + group[2] + group[3] + x1 = 4
@@ -434,7 +434,7 @@ def test_all_posthoc_multiple_factors(self, suppress_output):
model = MCPower("y = a + b")
model.set_variable_type("a=(factor,3), b=(factor,2)")
model.set_effects("a[2]=0.3, a[3]=0.2, b[2]=0.1")
- model.apply()
+ model._apply()
tests = model._parse_target_tests("all-posthoc")
# a: C(3,2)=3, b: C(2,2)=1 → 4 total
@@ -448,7 +448,7 @@ def test_all_posthoc_no_factors_with_all(self, suppress_output):
model = MCPower("y = x1 + x2")
model.set_effects("x1=0.5, x2=0.3")
- model.apply()
+ model._apply()
tests = model._parse_target_tests("all, all-posthoc")
assert "overall" in tests
@@ -461,7 +461,7 @@ def test_all_posthoc_alone_no_factors_raises(self, suppress_output):
model = MCPower("y = x1 + x2")
model.set_effects("x1=0.5, x2=0.3")
- model.apply()
+ model._apply()
with pytest.raises(ValueError, match="no factor variables"):
model._parse_target_tests("all-posthoc")
@@ -472,7 +472,7 @@ def test_exclusion_removes_test(self, suppress_output):
model = MCPower("y = x1 + x2")
model.set_effects("x1=0.5, x2=0.3")
- model.apply()
+ model._apply()
tests = model._parse_target_tests("all, -overall")
assert "overall" not in tests
@@ -486,7 +486,7 @@ def test_exclusion_posthoc(self, suppress_output):
model = MCPower("y = group + x1")
model.set_variable_type("group=(factor,3)")
model.set_effects("group[2]=0.4, group[3]=0.3, x1=0.2")
- model.apply()
+ model._apply()
tests = model._parse_target_tests("all-posthoc, -group[1] vs group[2]")
assert "group[1] vs group[2]" not in tests
@@ -500,7 +500,7 @@ def test_exclusion_invalid_raises(self, suppress_output):
model = MCPower("y = x1 + x2")
model.set_effects("x1=0.5, x2=0.3")
- model.apply()
+ model._apply()
with pytest.raises(ValueError, match="does not match"):
model._parse_target_tests("all, -nonexistent")
@@ -511,7 +511,7 @@ def test_exclusion_all_raises(self, suppress_output):
model = MCPower("y = x1 + x2")
model.set_effects("x1=0.5, x2=0.3")
- model.apply()
+ model._apply()
with pytest.raises(ValueError, match="nothing left"):
model._parse_target_tests("all, -overall, -x1, -x2")
@@ -522,7 +522,7 @@ def test_duplicate_raises(self, suppress_output):
model = MCPower("y = x1 + x2")
model.set_effects("x1=0.5, x2=0.3")
- model.apply()
+ model._apply()
with pytest.raises(ValueError, match="Duplicate"):
model._parse_target_tests("all, x1")
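The post-hoc hunks above all funnel through `_parse_target_tests()`, and the selector strings they exercise form a small grammar. A sketch of that grammar using only strings and behavior that appear in the tests (note the tests call the now-private `_apply()` directly before parsing):

```python
from mcpower import MCPower

model = MCPower("y = group + x1")
model.set_variable_type("group=(factor,3)")
model.set_effects("group[2]=0.4, group[3]=0.3, x1=0.2")
model._apply()

# "all"          -> overall + each effect, no post-hoc pairs
# "all-posthoc"  -> every pairwise level comparison; C(3,2) = 3 for a 3-level factor
# "a vs b"       -> one pairwise comparison, e.g. "group[1] vs group[2]"
# "-test"        -> exclude a test from the selection, e.g. "all, -overall"
tests = model._parse_target_tests("all, all-posthoc")

# Invalid selections raise ValueError: unknown factors, out-of-range levels,
# comparing a level to itself, cross-factor comparisons, excluding everything,
# or duplicate entries.
```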
diff --git a/tests/integration/test_scenarios.py b/tests/integration/test_scenarios.py
index 5172fb3..b0339cc 100644
--- a/tests/integration/test_scenarios.py
+++ b/tests/integration/test_scenarios.py
@@ -5,8 +5,10 @@
from unittest.mock import MagicMock
import numpy as np
+import pytest
from mcpower.core.scenarios import (
+ DEFAULT_SCENARIO_CONFIG,
ScenarioRunner,
apply_per_simulation_perturbations,
)
@@ -82,6 +84,152 @@ def test_create_scenario_plots_early_return(self):
runner._create_scenario_plots({"scenarios": {"optimistic": {}}})
+class TestSetScenarioConfigs:
+ """Test set_scenario_configs() merge behavior and KeyError prevention."""
+
+ # All keys that must exist in every scenario config
+ ALL_KEYS = sorted(DEFAULT_SCENARIO_CONFIG["optimistic"].keys())
+
+ def _make_model(self):
+ from mcpower import MCPower
+
+ m = MCPower("y = x1 + x2")
+ m.set_effects("x1=0.3, x2=0.2")
+ return m
+
+ # ── Merge semantics ──────────────────────────────────────────
+
+ def test_custom_scenario_inherits_all_optimistic_keys(self):
+ """New custom scenario with one key still has every required key."""
+ m = self._make_model()
+ m.set_scenario_configs({"extreme": {"heterogeneity": 0.6}})
+ cfg = m._scenario_configs["extreme"]
+ missing = set(self.ALL_KEYS) - set(cfg.keys())
+ assert not missing, f"Missing keys: {missing}"
+
+ def test_custom_scenario_overrides_value(self):
+ """Provided key overrides the optimistic default."""
+ m = self._make_model()
+ m.set_scenario_configs({"extreme": {"heterogeneity": 0.6}})
+ assert m._scenario_configs["extreme"]["heterogeneity"] == 0.6
+
+ def test_custom_scenario_non_overridden_keys_are_optimistic(self):
+ """Non-overridden keys equal the optimistic baseline."""
+ m = self._make_model()
+ m.set_scenario_configs({"extreme": {"heterogeneity": 0.6}})
+ opt = DEFAULT_SCENARIO_CONFIG["optimistic"]
+ cfg = m._scenario_configs["extreme"]
+ for key in self.ALL_KEYS:
+ if key != "heterogeneity":
+ assert cfg[key] == opt[key], f"Key {key}: {cfg[key]} != {opt[key]}"
+
+ def test_existing_scenario_update_preserves_other_keys(self):
+ """Updating one key on 'realistic' keeps the rest intact."""
+ m = self._make_model()
+ m.set_scenario_configs({"realistic": {"heterogeneity": 0.99}})
+ cfg = m._scenario_configs["realistic"]
+ assert cfg["heterogeneity"] == 0.99
+ # Other keys should match original realistic defaults
+ assert cfg["correlation_noise_sd"] == DEFAULT_SCENARIO_CONFIG["realistic"]["correlation_noise_sd"]
+
+ def test_defaults_still_present_after_adding_custom(self):
+ """Adding a custom scenario doesn't remove optimistic/realistic/doomer."""
+ m = self._make_model()
+ m.set_scenario_configs({"custom": {"heterogeneity": 0.1}})
+ for name in ("optimistic", "realistic", "doomer", "custom"):
+ assert name in m._scenario_configs
+
+ def test_multiple_custom_scenarios(self):
+ """Multiple custom scenarios each inherit independently."""
+ m = self._make_model()
+ m.set_scenario_configs({
+ "mild": {"heterogeneity": 0.05},
+ "severe": {"heterogeneity": 0.8, "heteroskedasticity": 0.5},
+ })
+ assert m._scenario_configs["mild"]["heterogeneity"] == 0.05
+ assert m._scenario_configs["mild"]["heteroskedasticity"] == 0.0 # optimistic default
+ assert m._scenario_configs["severe"]["heterogeneity"] == 0.8
+ assert m._scenario_configs["severe"]["heteroskedasticity"] == 0.5
+
+ def test_empty_custom_scenario_equals_optimistic(self):
+ """An empty custom config is identical to the optimistic baseline."""
+ m = self._make_model()
+ m.set_scenario_configs({"empty": {}})
+ opt = DEFAULT_SCENARIO_CONFIG["optimistic"]
+ for key in self.ALL_KEYS:
+ assert m._scenario_configs["empty"][key] == opt[key]
+
+ # ── Type validation ──────────────────────────────────────────
+
+ def test_non_dict_raises_type_error(self):
+ m = self._make_model()
+ with pytest.raises(TypeError):
+ m.set_scenario_configs("not_a_dict")
+
+ def test_returns_self_for_chaining(self):
+ m = self._make_model()
+ result = m.set_scenario_configs({"custom": {"heterogeneity": 0.1}})
+ assert result is m
+
+ # ── End-to-end: no KeyError during simulation ────────────────
+
+ def test_custom_partial_config_runs_without_error(self):
+ """Custom scenario with only one key runs find_power without KeyError."""
+ m = self._make_model()
+ m.set_scenario_configs({"partial": {"heterogeneity": 0.3}})
+ result = m.find_power(
+ 50, scenarios=True, print_results=False, return_results=True
+ )
+ assert "partial" in result["scenarios"]
+ power = result["scenarios"]["partial"]["results"]["individual_powers"]["overall"]
+ assert 0 <= power <= 100
+
+ def test_custom_residual_only_config_runs(self):
+ """Custom scenario with only residual keys runs without error."""
+ m = self._make_model()
+ m.set_scenario_configs({
+ "residual_test": {
+ "residual_change_prob": 1.0,
+ "residual_dists": ["heavy_tailed"],
+ "residual_df": 5,
+ }
+ })
+ result = m.find_power(
+ 50, scenarios=True, print_results=False, return_results=True
+ )
+ assert "residual_test" in result["scenarios"]
+
+ def test_custom_lme_keys_on_ols_model_ignored(self):
+ """LME-specific keys on an OLS model don't cause errors."""
+ m = self._make_model()
+ m.set_scenario_configs({
+ "lme_on_ols": {
+ "icc_noise_sd": 0.3,
+ "random_effect_dist": "heavy_tailed",
+ "random_effect_df": 3,
+ }
+ })
+ result = m.find_power(
+ 50, scenarios=True, print_results=False, return_results=True
+ )
+ assert "lme_on_ols" in result["scenarios"]
+
+ def test_overriding_all_three_defaults(self):
+ """Overriding optimistic, realistic, and doomer all at once."""
+ m = self._make_model()
+ m.set_scenario_configs({
+ "optimistic": {"heterogeneity": 0.01},
+ "realistic": {"heterogeneity": 0.5},
+ "doomer": {"heterogeneity": 0.9},
+ })
+ assert m._scenario_configs["optimistic"]["heterogeneity"] == 0.01
+ assert m._scenario_configs["realistic"]["heterogeneity"] == 0.5
+ assert m._scenario_configs["doomer"]["heterogeneity"] == 0.9
+ # Other keys preserved from defaults
+ assert m._scenario_configs["realistic"]["correlation_noise_sd"] == DEFAULT_SCENARIO_CONFIG["realistic"]["correlation_noise_sd"]
+ assert m._scenario_configs["doomer"]["correlation_noise_sd"] == DEFAULT_SCENARIO_CONFIG["doomer"]["correlation_noise_sd"]
+
+
class TestApplyPerSimulationPerturbations:
"""Test apply_per_simulation_perturbations function."""
@@ -122,3 +270,152 @@ def test_var_type_perturbation(self):
# All normal (type 0) vars should be changed to right_skewed (type 2)
assert np.all(p_types == 2)
+
+
+class TestScenarioConfigKeysE2E:
+ """End-to-end tests for each individual config key and mixed combinations.
+
+ Each test verifies that setting a single config key (or combination)
+ via set_scenario_configs() runs find_power(scenarios=True) without
+ error and produces valid power values.
+ """
+
+ N_SIMS = 50
+ SAMPLE_SIZE = 80
+
+ def _make_model(self):
+ from mcpower import MCPower
+
+ m = MCPower("y = x1 + x2")
+ m.set_effects("x1=0.3, x2=0.2")
+ m.set_simulations(self.N_SIMS)
+ return m
+
+ def _run(self, model, config, scenario_name="test_scenario"):
+ model.set_scenario_configs({scenario_name: config})
+ result = model.find_power(
+ self.SAMPLE_SIZE,
+ scenarios=True,
+ print_results=False,
+ return_results=True,
+ )
+ power = result["scenarios"][scenario_name]["results"]["individual_powers"]["overall"]
+ assert 0 <= power <= 100, f"Power out of range: {power}"
+ return result
+
+ # ── Individual general keys ───────────────────────────────────
+
+ def test_heterogeneity_only(self):
+ self._run(self._make_model(), {"heterogeneity": 0.3})
+
+ def test_heteroskedasticity_only(self):
+ self._run(self._make_model(), {"heteroskedasticity": 0.2})
+
+ def test_correlation_noise_sd_only(self):
+ m = self._make_model()
+ m.set_correlations("(x1,x2)=0.4")
+ self._run(m, {"correlation_noise_sd": 0.3})
+
+ def test_distribution_change_prob_only(self):
+ self._run(self._make_model(), {"distribution_change_prob": 0.5})
+
+ def test_new_distributions_with_change_prob(self):
+ self._run(self._make_model(), {
+ "distribution_change_prob": 1.0,
+ "new_distributions": ["uniform"],
+ })
+
+ # ── Individual residual keys ──────────────────────────────────
+
+ def test_residual_change_prob_only(self):
+ self._run(self._make_model(), {"residual_change_prob": 0.5})
+
+ def test_residual_df_only(self):
+ self._run(self._make_model(), {
+ "residual_change_prob": 1.0,
+ "residual_df": 3,
+ })
+
+ def test_residual_dists_only(self):
+ self._run(self._make_model(), {
+ "residual_change_prob": 1.0,
+ "residual_dists": ["heavy_tailed"],
+ })
+
+ # ── Mixed general combinations ────────────────────────────────
+
+ def test_heterogeneity_and_correlation_noise(self):
+ m = self._make_model()
+ m.set_correlations("(x1,x2)=0.3")
+ self._run(m, {
+ "heterogeneity": 0.25,
+ "correlation_noise_sd": 0.3,
+ })
+
+ def test_distribution_change_and_heteroskedasticity(self):
+ self._run(self._make_model(), {
+ "distribution_change_prob": 0.5,
+ "heteroskedasticity": 0.15,
+ })
+
+ def test_all_general_keys_together(self):
+ m = self._make_model()
+ m.set_correlations("(x1,x2)=0.3")
+ self._run(m, {
+ "heterogeneity": 0.2,
+ "heteroskedasticity": 0.1,
+ "correlation_noise_sd": 0.2,
+ "distribution_change_prob": 0.3,
+ })
+
+ # ── Mixed general + residual ──────────────────────────────────
+
+ def test_general_plus_residual_keys(self):
+ self._run(self._make_model(), {
+ "heterogeneity": 0.2,
+ "residual_change_prob": 0.5,
+ "residual_df": 5,
+ })
+
+ def test_all_ols_keys_together(self):
+ m = self._make_model()
+ m.set_correlations("(x1,x2)=0.3")
+ self._run(m, {
+ "heterogeneity": 0.3,
+ "heteroskedasticity": 0.15,
+ "correlation_noise_sd": 0.25,
+ "distribution_change_prob": 0.4,
+ "new_distributions": ["right_skewed", "uniform"],
+ "residual_change_prob": 0.5,
+ "residual_dists": ["heavy_tailed", "skewed"],
+ "residual_df": 6,
+ })
+
+ # ── Boundary values ───────────────────────────────────────────
+
+ def test_zero_perturbation_matches_optimistic(self):
+ """A custom scenario with all zeros should match optimistic power."""
+ m = self._make_model()
+ m.set_seed(42)
+ result = self._run(m, {
+ "heterogeneity": 0.0,
+ "heteroskedasticity": 0.0,
+ "correlation_noise_sd": 0.0,
+ "distribution_change_prob": 0.0,
+ "residual_change_prob": 0.0,
+ })
+ opt_power = result["scenarios"]["optimistic"]["results"]["individual_powers"]["overall"]
+ custom_power = result["scenarios"]["test_scenario"]["results"]["individual_powers"]["overall"]
+ # Same seed, same zero config → should be close (not exact due to seed offsets)
+ assert abs(opt_power - custom_power) < 15
+
+ def test_max_perturbation_runs(self):
+ """Extreme perturbation values should not crash."""
+ self._run(self._make_model(), {
+ "heterogeneity": 0.9,
+ "heteroskedasticity": 0.5,
+ "correlation_noise_sd": 0.8,
+ "distribution_change_prob": 1.0,
+ "residual_change_prob": 1.0,
+ "residual_df": 2,
+ })
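The new `TestSetScenarioConfigs` and `TestScenarioConfigKeysE2E` classes above describe the merge behavior in their docstrings; a compact sketch of the same flow (names are taken from the tests, and the subset assertion mirrors `test_custom_scenario_inherits_all_optimistic_keys`):

```python
from mcpower import MCPower
from mcpower.core.scenarios import DEFAULT_SCENARIO_CONFIG

model = MCPower("y = x1 + x2")
model.set_effects("x1=0.3, x2=0.2")

# A partial custom config: keys not supplied are inherited from the
# "optimistic" baseline, so no KeyError can occur during simulation.
model.set_scenario_configs({"extreme": {"heterogeneity": 0.6}})
assert set(DEFAULT_SCENARIO_CONFIG["optimistic"]) <= set(model._scenario_configs["extreme"])

result = model.find_power(50, scenarios=True, print_results=False, return_results=True)
power = result["scenarios"]["extreme"]["results"]["individual_powers"]["overall"]
```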
diff --git a/tests/integration/test_test_formula.py b/tests/integration/test_test_formula.py
new file mode 100644
index 0000000..77b2e6e
--- /dev/null
+++ b/tests/integration/test_test_formula.py
@@ -0,0 +1,388 @@
+"""
+End-to-end integration tests for the test_formula feature.
+
+The test_formula feature generates data using one model formula but fits a
+different (reduced) model for statistical testing, enabling model
+misspecification analysis (e.g. omitted variable bias).
+"""
+
+import numpy as np
+import pandas as pd
+import pytest
+
+from mcpower import MCPower
+
+N_SIMS = 200
+SEED = 42
+
+
+# ---------------------------------------------------------------------------
+# Helpers
+# ---------------------------------------------------------------------------
+
+def _power_model(formula, effects, *, n_sims=N_SIMS, seed=SEED, **kwargs):
+ """Create a configured MCPower model ready for find_power."""
+ model = MCPower(formula)
+
+ # Apply optional configuration before effects
+ if "variable_types" in kwargs:
+ model.set_variable_type(kwargs.pop("variable_types"))
+ if "correlations" in kwargs:
+ model.set_correlations(kwargs.pop("correlations"))
+ if "cluster" in kwargs:
+ cluster_cfg = kwargs.pop("cluster")
+ model.set_cluster(**cluster_cfg)
+ if "max_failed" in kwargs:
+ model.set_max_failed_simulations(kwargs.pop("max_failed"))
+ if "upload_data" in kwargs:
+ model.upload_data(kwargs.pop("upload_data"))
+
+ model.set_effects(effects)
+ model.set_simulations(n_sims)
+ model.set_seed(seed)
+ return model
+
+
+def _run_power(model, sample_size, **kwargs):
+ """Run find_power with standard test defaults."""
+ return model.find_power(
+ sample_size,
+ print_results=False,
+ return_results=True,
+ progress_callback=False,
+ **kwargs,
+ )
+
+
+def _individual_powers(result):
+ """Extract individual_powers dict from a result."""
+ return result["results"]["individual_powers"]
+
+
+# ===========================================================================
+# Class 1: TestOLSSubset
+# ===========================================================================
+
+
+class TestOLSSubset:
+ """Test basic OLS test_formula subsetting scenarios."""
+
+    def test_omitted_variable_excluded_from_results(self):
+        """Omitting x3 from the test formula excludes it from the results."""
+ model = _power_model(
+ "y = x1 + x2 + x3",
+ "x1=0.5, x2=0.3, x3=0.5",
+ )
+ result = _run_power(model, 100, test_formula="y = x1 + x2")
+
+ powers = _individual_powers(result)
+ assert "x1" in powers
+ assert "x2" in powers
+ assert "x3" not in powers
+
+ def test_omitted_interaction(self):
+ """Omitting interaction from test formula excludes it from results."""
+ model = _power_model(
+ "y = x1 + x2 + x1:x2",
+ "x1=0.5, x2=0.3, x1:x2=0.2",
+ )
+ result = _run_power(model, 100, test_formula="y = x1 + x2")
+
+ powers = _individual_powers(result)
+ assert "x1" in powers
+ assert "x2" in powers
+ assert "x1:x2" not in powers
+
+ def test_single_variable_test(self):
+ """Testing only x1 from a 3-variable generation model."""
+ model = _power_model(
+ "y = x1 + x2 + x3",
+ "x1=0.5, x2=0.3, x3=0.2",
+ )
+ result = _run_power(model, 100, test_formula="y = x1")
+
+ powers = _individual_powers(result)
+ assert "x1" in powers
+ assert "overall" in powers
+ assert "x2" not in powers
+ assert "x3" not in powers
+
+ def test_same_formula_matches_no_test_formula(self):
+ """Using test_formula identical to generation gives same powers."""
+        """A test_formula identical to the generation formula gives the same powers."""
+ result_a = _run_power(model_a, 100, test_formula="y = x1 + x2")
+
+ model_b = _power_model("y = x1 + x2", "x1=0.5, x2=0.3")
+ result_b = _run_power(model_b, 100)
+
+ powers_a = _individual_powers(result_a)
+ powers_b = _individual_powers(result_b)
+
+ for key in powers_b:
+ assert abs(powers_a[key] - powers_b[key]) < 0.01, (
+ f"Power mismatch for {key}: {powers_a[key]} vs {powers_b[key]}"
+ )
+
+ def test_empty_test_formula_uses_generation(self):
+ """Empty test_formula string uses the generation formula (default)."""
+ model_a = _power_model("y = x1 + x2", "x1=0.5, x2=0.3")
+ result_a = _run_power(model_a, 100, test_formula="")
+
+ model_b = _power_model("y = x1 + x2", "x1=0.5, x2=0.3")
+ result_b = _run_power(model_b, 100)
+
+ powers_a = _individual_powers(result_a)
+ powers_b = _individual_powers(result_b)
+
+ for key in powers_b:
+ assert abs(powers_a[key] - powers_b[key]) < 0.01, (
+ f"Power mismatch for {key}: {powers_a[key]} vs {powers_b[key]}"
+ )
+
+
+# ===========================================================================
+# Class 2: TestFactorVariables
+# ===========================================================================
+
+
+class TestFactorVariables:
+ """Test test_formula with factor (categorical) variables."""
+
+ def test_omitted_factor(self):
+ """Omitting a factor variable from test formula excludes its dummies."""
+ model = _power_model(
+ "y = x1 + x2",
+ "x1=0.5, x2[2]=0.3, x2[3]=0.4",
+ variable_types="x2=(factor,3)",
+ )
+ result = _run_power(model, 150, test_formula="y = x1")
+
+ powers = _individual_powers(result)
+ assert "x1" in powers
+ # Factor dummies should not be in results
+ assert "x2[2]" not in powers
+ assert "x2[3]" not in powers
+
+ def test_factor_kept_continuous_dropped(self):
+ """Keeping factor but dropping continuous variable."""
+ model = _power_model(
+ "y = x1 + x2",
+ "x1=0.5, x2[2]=0.3, x2[3]=0.4",
+ variable_types="x2=(factor,3)",
+ )
+ result = _run_power(model, 150, test_formula="y = x2")
+
+ powers = _individual_powers(result)
+ # x1 excluded
+ assert "x1" not in powers
+ # Factor dummies should be present
+ assert "x2[2]" in powers
+ assert "x2[3]" in powers
+
+
+# ===========================================================================
+# Class 3: TestCorrelationStructures
+# ===========================================================================
+
+
+class TestCorrelationStructures:
+ """Test test_formula with correlated predictors."""
+
+ def test_correlated_variables_subset(self):
+ """Subsetting correlated variables runs without error."""
+ model = _power_model(
+ "y = x1 + x2",
+ "x1=0.5, x2=0.3",
+ correlations="(x1,x2)=0.5",
+ )
+ result = _run_power(model, 100, test_formula="y = x1")
+
+ assert result is not None
+ powers = _individual_powers(result)
+ assert "x1" in powers
+ assert "x2" not in powers
+
+
+# ===========================================================================
+# Class 4: TestResultsStructure
+# ===========================================================================
+
+
+class TestResultsStructure:
+ """Test that result dict contains correct test_formula metadata."""
+
+ def test_results_contain_both_formulas(self):
+ """Result should have data_formula and test_formula fields."""
+ model = _power_model(
+ "y = x1 + x2 + x3",
+ "x1=0.5, x2=0.3, x3=0.2",
+ )
+ result = _run_power(model, 100, test_formula="y = x1 + x2")
+
+ assert "data_formula" in result["model"]
+ assert "test_formula" in result["model"]
+ # data_formula should be the generation formula
+ assert "x3" in result["model"]["data_formula"]
+ # test_formula should be the reduced formula
+ assert result["model"]["test_formula"] == "y = x1 + x2"
+
+ def test_target_tests_reflect_test_formula(self):
+ """target_tests in results should not contain excluded effects."""
+ model = _power_model(
+ "y = x1 + x2 + x3",
+ "x1=0.5, x2=0.3, x3=0.2",
+ )
+ result = _run_power(model, 100, test_formula="y = x1 + x2")
+
+ target_tests = result["model"]["target_tests"]
+ assert "x1" in target_tests
+ assert "x2" in target_tests
+ assert "x3" not in target_tests
+
+
+# ===========================================================================
+# Class 5: TestValidation
+# ===========================================================================
+
+
+class TestValidation:
+ """Test validation errors for invalid test_formula usage."""
+
+ def test_nonexistent_variable_raises(self):
+ """test_formula with unknown variable raises ValueError."""
+ model = _power_model(
+ "y = x1 + x2",
+ "x1=0.5, x2=0.3",
+ )
+ with pytest.raises(ValueError, match="not found"):
+ _run_power(model, 100, test_formula="y = x1 + x99")
+
+ def test_ols_to_lme_raises(self):
+ """test_formula with random effects on OLS model raises ValueError.
+
+ When the grouping variable (school) is not in the generation model,
+ validation fails with 'not found'. When it is present but has no
+ cluster config, it fails with 'random effects'.
+ """
+ # Case 1: grouping var not in model at all -> "not found"
+ model = _power_model(
+ "y = x1 + x2",
+ "x1=0.5, x2=0.3",
+ )
+ with pytest.raises(ValueError, match="not found"):
+ _run_power(model, 100, test_formula="y = x1 + (1|school)")
+
+ def test_ols_with_cluster_var_but_no_cluster_config_raises(self):
+ """test_formula with random effects when var exists but no cluster config.
+
+ When the generation model knows about 'school' as a variable but has
+ no cluster specification, the random effects check triggers.
+ """
+        # The generation model includes 'school' as a fixed-effect predictor
+        # but never calls set_cluster, so the variable is known to the model
+        # yet carries no cluster configuration.
+ model = _power_model(
+ "y = x1 + school",
+ "x1=0.5, school=0.3",
+ )
+ with pytest.raises(ValueError, match="random effects"):
+ _run_power(model, 100, test_formula="y = x1 + (1|school)")
+
+
+# ===========================================================================
+# Class 6: TestFindSampleSize
+# ===========================================================================
+
+
+class TestFindSampleSize:
+ """Test test_formula with find_sample_size."""
+
+ def test_subset_via_find_sample_size(self):
+ """find_sample_size with test_formula excludes omitted variable."""
+ model = _power_model(
+ "y = x1 + x2 + x3",
+ "x1=0.5, x2=0.3, x3=0.2",
+ )
+ result = model.find_sample_size(
+ target_test="x1",
+ from_size=30,
+ to_size=100,
+ by=10,
+ test_formula="y = x1 + x2",
+ print_results=False,
+ return_results=True,
+ progress_callback=False,
+ )
+
+ assert result is not None
+ powers_by_test = result["results"]["powers_by_test"]
+ assert "x1" in powers_by_test
+ assert "x3" not in powers_by_test
+
+
+# ===========================================================================
+# Class 7: TestMixedModelCross (LME)
+# ===========================================================================
+
+
+@pytest.mark.lme
+class TestMixedModelCross:
+ """Test test_formula across mixed model boundaries."""
+
+ def test_lme_gen_ols_test(self):
+ """Generate with LME, test with OLS (drop random effects)."""
+ model = _power_model(
+ "y ~ x1 + x2 + (1|school)",
+ "x1=0.5, x2=0.3",
+ cluster={"grouping_var": "school", "ICC": 0.2, "n_clusters": 20},
+ max_failed=0.10,
+ )
+ result = _run_power(model, 1000, test_formula="y ~ x1 + x2")
+
+ powers = _individual_powers(result)
+ assert "x1" in powers
+ assert "x2" in powers
+
+ def test_lme_gen_lme_subset(self):
+ """Generate with LME full model, test with LME subset (drop x2)."""
+ model = _power_model(
+ "y ~ x1 + x2 + (1|school)",
+ "x1=0.5, x2=0.3",
+ cluster={"grouping_var": "school", "ICC": 0.2, "n_clusters": 20},
+ max_failed=0.10,
+ )
+ result = _run_power(model, 1000, test_formula="y ~ x1 + (1|school)")
+
+ powers = _individual_powers(result)
+ assert "x1" in powers
+ assert "x2" not in powers
+
+
+# ===========================================================================
+# Class 8: TestUploadedData
+# ===========================================================================
+
+
+class TestUploadedData:
+ """Test test_formula with uploaded empirical data."""
+
+ def test_upload_with_test_formula(self):
+ """Uploaded data with test_formula excludes omitted variable."""
+ np.random.seed(SEED)
+ data = pd.DataFrame({
+ "x1": np.random.normal(0, 1, 50),
+ "x2": np.random.normal(0, 1, 50),
+ "x3": np.random.normal(0, 1, 50),
+ })
+
+ model = _power_model(
+ "y = x1 + x2 + x3",
+ "x1=0.5, x2=0.3, x3=0.2",
+ upload_data=data,
+ )
+ result = _run_power(model, 100, test_formula="y = x1 + x2")
+
+ powers = _individual_powers(result)
+ assert "x1" in powers
+ assert "x2" in powers
+ assert "x3" not in powers
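The new `test_test_formula.py` module above is organized around one usage shape; a minimal standalone sketch reduced from `TestOLSSubset` and `TestResultsStructure` (sample size and effect values are the ones the tests use):

```python
from mcpower import MCPower

# Generate data from the full model, but test the reduced model with x3 omitted.
model = MCPower("y = x1 + x2 + x3")
model.set_effects("x1=0.5, x2=0.3, x3=0.5")

result = model.find_power(
    100,
    test_formula="y = x1 + x2",
    print_results=False,
    return_results=True,
)

powers = result["results"]["individual_powers"]     # contains x1 and x2; no x3 key
assert result["model"]["test_formula"] == "y = x1 + x2"
```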
diff --git a/tests/integration/test_upload_data.py b/tests/integration/test_upload_data.py
index 93a7d8f..05602f0 100644
--- a/tests/integration/test_upload_data.py
+++ b/tests/integration/test_upload_data.py
@@ -71,7 +71,7 @@ def test_binary_auto_detection(self, cars_data):
model = MCPower("mpg = vs + am")
model.upload_data(_select(cars_data, ["vs", "am"]))
model.set_effects("vs=0.3, am=0.4")
- model.apply()
+ model._apply()
# Check that vs and am were detected as uploaded_binary
vs_pred = model._registry.get_predictor("vs")
@@ -85,7 +85,7 @@ def test_factor_auto_detection(self, cars_data):
model = MCPower("mpg = cyl + gear")
model.upload_data(_select(cars_data, ["cyl", "gear"]), preserve_factor_level_names=False)
model.set_effects("cyl[2]=0.3, cyl[3]=0.4, gear[2]=0.2, gear[3]=0.3")
- model.apply()
+ model._apply()
# Check that cyl and gear were detected as factor
# After expansion, check the factor names
@@ -103,7 +103,7 @@ def test_continuous_auto_detection(self, cars_data):
model = MCPower("mpg = hp + wt")
model.upload_data(_select(cars_data, ["hp", "wt"]))
model.set_effects("hp=0.5, wt=0.3")
- model.apply()
+ model._apply()
# Check that hp and wt were detected as continuous (uploaded_data)
hp_pred = model._registry.get_predictor("hp")
@@ -122,14 +122,14 @@ def test_constant_column_dropped(self, cars_data):
# Should raise error because 'constant' will be dropped
with pytest.raises(ValueError, match="All uploaded columns were dropped"):
model.upload_data(_select(data, ["constant"]))
- model.apply()
+ model._apply()
def test_mixed_types_auto_detection(self, cars_data):
"""Test auto-detection with mixed variable types."""
model = MCPower("mpg = vs + cyl + hp")
model.upload_data(_select(cars_data, ["vs", "cyl", "hp"]), preserve_factor_level_names=False)
model.set_effects("vs=0.3, cyl[2]=0.2, cyl[3]=0.4, hp=0.5")
- model.apply()
+ model._apply()
vs_pred = model._registry.get_predictor("vs")
hp_pred = model._registry.get_predictor("hp")
@@ -147,7 +147,7 @@ def test_override_to_continuous(self, cars_data):
model = MCPower("mpg = cyl + hp")
model.upload_data(_select(cars_data, ["cyl", "hp"]), data_types={"cyl": "continuous"})
model.set_effects("cyl=0.4, hp=0.5")
- model.apply()
+ model._apply()
cyl_pred = model._registry.get_predictor("cyl")
# Should be uploaded_data (continuous) instead of factor
@@ -178,7 +178,7 @@ def test_override_to_binary(self, cars_data):
model_binary = MCPower("mpg = hp_binary + wt")
model_binary.upload_data(data, data_types={"hp_binary": "binary"})
model_binary.set_effects("hp_binary=0.4, wt=0.3")
- model_binary.apply()
+ model_binary._apply()
hp_pred = model_binary._registry.get_predictor("hp_binary")
assert hp_pred.var_type == "uploaded_binary"
@@ -206,7 +206,7 @@ def test_no_correlation_from_data(self, cars_data):
model = MCPower("mpg = hp + wt")
model.upload_data(_select(cars_data, ["hp", "wt"]), preserve_correlation="no")
model.set_effects("hp=0.5, wt=0.3")
- model.apply()
+ model._apply()
# Correlation matrix should be identity (or user-specified)
corr = model.correlation_matrix
@@ -219,7 +219,7 @@ def test_binary_uses_standard_generation(self, cars_data):
model = MCPower("mpg = vs + am")
model.upload_data(_select(cars_data, ["vs", "am"]), preserve_correlation="no")
model.set_effects("vs=0.3, am=0.4")
- model.apply()
+ model._apply()
# Should detect proportions from data
vs_pred = model._registry.get_predictor("vs")
@@ -231,7 +231,7 @@ def test_continuous_uses_lookup_tables(self, cars_data):
model = MCPower("mpg = hp + wt")
model.upload_data(_select(cars_data, ["hp", "wt"]), preserve_correlation="no")
model.set_effects("hp=0.5, wt=0.3")
- model.apply()
+ model._apply()
# Should have lookup tables populated
assert model.upload_normal_values.shape[0] > 0
@@ -246,7 +246,7 @@ def test_strict_is_default(self, cars_data):
model = MCPower("mpg = hp + wt")
model.upload_data(_select(cars_data, ["hp", "wt"]))
model.set_effects("hp=0.5, wt=0.3")
- model.apply()
+ model._apply()
assert model._preserve_correlation == "strict"
@@ -255,7 +255,7 @@ def test_correlations_computed_from_data(self, cars_data):
model = MCPower("mpg = hp + wt")
model.upload_data(_select(cars_data, ["hp", "wt"]), preserve_correlation="partial")
model.set_effects("hp=0.5, wt=0.3")
- model.apply()
+ model._apply()
# Correlation should match data correlation
hp_arr = np.array(cars_data["hp"])
@@ -278,7 +278,7 @@ def test_user_can_override_correlations(self, cars_data):
# This tests that user correlations can override data correlations
# For now, the implementation always uses data correlations
# TODO: Implement user override priority
- model.apply()
+ model._apply()
# Just verify it doesn't crash
assert model.correlation_matrix is not None
@@ -292,7 +292,7 @@ def test_strict_mode_sets_metadata(self, cars_data):
model = MCPower("mpg = hp + wt")
model.upload_data(_select(cars_data, ["hp", "wt"]), preserve_correlation="strict")
model.set_effects("hp=0.5, wt=0.3")
- model.apply()
+ model._apply()
assert model._preserve_correlation == "strict"
assert model._uploaded_raw_data is not None
@@ -303,7 +303,7 @@ def test_strict_mode_warns_cross_correlations(self, cars_data, capsys):
model = MCPower("mpg = hp + wt + x1") # x1 is created, hp/wt uploaded
model.upload_data(_select(cars_data, ["hp", "wt"]), preserve_correlation="strict")
model.set_effects("hp=0.5, wt=0.3, x1=0.4")
- model.apply()
+ model._apply()
captured = capsys.readouterr()
# Should warn about cross-correlations
@@ -314,7 +314,7 @@ def test_strict_mode_bootstrap_preserves_relationships(self, cars_data):
model = MCPower("mpg = hp + wt")
model.upload_data(_select(cars_data, ["hp", "wt"]), preserve_correlation="strict")
model.set_effects("hp=0.5, wt=0.3")
- model.apply()
+ model._apply()
# Should be able to run simulation without error
result = model.find_power(sample_size=50, print_results=False, return_results=True)
@@ -356,7 +356,7 @@ def test_strict_mode_with_binary(self, cars_data):
model = MCPower("mpg = vs + am")
model.upload_data(_select(cars_data, ["vs", "am"]), preserve_correlation="strict")
model.set_effects("vs=0.3, am=0.4")
- model.apply()
+ model._apply()
# Check metadata
assert "vs" in model._uploaded_var_metadata
@@ -373,7 +373,7 @@ def test_strict_mode_with_factor(self, cars_data):
preserve_factor_level_names=False,
)
model.set_effects("cyl[2]=0.3, cyl[3]=0.4, gear[2]=0.2, gear[3]=0.3")
- model.apply()
+ model._apply()
# Check metadata
assert "cyl" in model._uploaded_var_metadata
@@ -390,7 +390,7 @@ def test_warning_for_unmatched_columns(self, cars_data, capsys):
model = MCPower("mpg = hp + wt")
model.upload_data(_select(cars_data, ["hp", "wt", "vs"])) # vs not in model
model.set_effects("hp=0.5, wt=0.3")
- model.apply()
+ model._apply()
captured = capsys.readouterr()
assert "Ignoring unmatched columns" in captured.out
@@ -401,7 +401,7 @@ def test_warning_for_large_sample_size(self, cars_data, capsys):
model = MCPower("mpg = hp + wt")
model.upload_data(_select(cars_data, ["hp", "wt"]))
model.set_effects("hp=0.5, wt=0.3")
- model.apply()
+ model._apply()
# 32 samples * 3 = 96, so 100 should trigger warning
model.find_power(sample_size=100, print_results=False)
@@ -424,7 +424,7 @@ def test_warning_for_dropped_constant_columns(self, cars_data, capsys):
# This should raise an error because constant was dropped and no effect was set for it
# But the auto-detection output should show it was dropped
try:
- model.apply()
+ model._apply()
except ValueError:
pass # Expected to fail because constant column missing
@@ -442,7 +442,7 @@ def test_full_dict_with_unmatched_columns(self, cars_data):
model = MCPower("mpg = hp + wt")
model.upload_data(cars_data) # Full dict, not pre-filtered
model.set_effects("hp=0.5, wt=0.3")
- model.apply()
+ model._apply()
hp_pred = model._registry.get_predictor("hp")
wt_pred = model._registry.get_predictor("wt")
@@ -463,7 +463,7 @@ def test_full_dict_with_mixed_var_types(self, cars_data):
model = MCPower("mpg = vs + cyl + hp")
model.upload_data(cars_data, preserve_factor_level_names=False) # Full dict
model.set_effects("vs=0.3, cyl[2]=0.2, cyl[3]=0.4, hp=0.5")
- model.apply()
+ model._apply()
vs_pred = model._registry.get_predictor("vs")
hp_pred = model._registry.get_predictor("hp")
@@ -493,7 +493,7 @@ def test_string_matched_column_auto_detected_as_factor(self):
model = MCPower("y = x")
model.upload_data(data)
model.set_effects("x[b]=0.3, x[c]=0.4")
- model.apply()
+ model._apply()
assert "x" in model._registry.factor_names
assert "x[b]" in model._registry.dummy_names
assert "x[c]" in model._registry.dummy_names
@@ -507,7 +507,7 @@ def test_no_matching_columns_ignores_data(self, cars_data, capsys):
model = MCPower("mpg = x1 + x2")
model.upload_data(_select(cars_data, ["hp", "wt"]))
model.set_effects("x1=0.3, x2=0.4")
- model.apply()
+ model._apply()
captured = capsys.readouterr()
assert "uploaded data ignored" in captured.out.lower()
@@ -557,7 +557,7 @@ def test_dict_format(self, cars_data):
}
model.upload_data(data_dict)
model.set_effects("hp=0.5, wt=0.3")
- model.apply()
+ model._apply()
assert model._applied is True
@@ -566,10 +566,10 @@ def test_sample_size_warning_in_find_sample_size(self, cars_data, capsys):
model = MCPower("mpg = hp + wt")
model.upload_data(_select(cars_data, ["hp", "wt"]))
model.set_effects("hp=0.5, wt=0.3")
- model.apply()
+ model._apply()
- # 32 * 3 = 96, so to_size=150 should trigger warning
- model.find_sample_size(from_size=30, to_size=150, by=20, print_results=False)
+ # 32 * 3 = 96, so size=110 > 96 triggers warning
+ model.find_sample_size(from_size=50, to_size=110, by=30, print_results=False)
captured = capsys.readouterr()
assert "Warning" in captured.out
@@ -583,14 +583,14 @@ def test_string_column_auto_detected_as_factor(self, cars_data):
model = MCPower("mpg = origin + hp")
model.upload_data(_select(cars_data, ["origin", "hp"]))
model.set_effects("origin[Japan]=0.3, origin[USA]=0.4, hp=0.5")
- model.apply()
+ model._apply()
assert "origin" in model._registry.factor_names
def test_string_column_creates_named_dummies(self, cars_data):
model = MCPower("mpg = origin + hp")
model.upload_data(_select(cars_data, ["origin", "hp"]))
model.set_effects("origin[Japan]=0.3, origin[USA]=0.4, hp=0.5")
- model.apply()
+ model._apply()
dummy_names = model._registry.dummy_names
assert "origin[Japan]" in dummy_names
assert "origin[USA]" in dummy_names
@@ -600,7 +600,7 @@ def test_string_column_no_mode(self, cars_data):
model = MCPower("mpg = origin + hp")
model.upload_data(_select(cars_data, ["origin", "hp"]), preserve_correlation="no")
model.set_effects("origin[Japan]=0.3, origin[USA]=0.4, hp=0.5")
- model.apply()
+ model._apply()
assert "origin" in model._registry.factor_names
def test_too_many_string_levels_raises(self):
@@ -611,7 +611,7 @@ def test_too_many_string_levels_raises(self):
model = MCPower("y = name + x1")
with pytest.raises(ValueError, match="too many unique"):
model.upload_data(_select(data, ["name", "x1"]))
- model.apply()
+ model._apply()
class TestPreserveFactorLevelNames:
@@ -621,7 +621,7 @@ def test_numeric_factor_uses_original_values(self, cars_data):
model = MCPower("mpg = cyl + hp")
model.upload_data(_select(cars_data, ["cyl", "hp"]))
model.set_effects("cyl[6]=0.3, cyl[8]=0.4, hp=0.5")
- model.apply()
+ model._apply()
dummy_names = model._registry.dummy_names
assert "cyl[6]" in dummy_names
assert "cyl[8]" in dummy_names
@@ -631,7 +631,7 @@ def test_preserve_false_uses_integer_indices(self, cars_data):
model = MCPower("mpg = cyl + hp")
model.upload_data(_select(cars_data, ["cyl", "hp"]), preserve_factor_level_names=False)
model.set_effects("cyl[2]=0.3, cyl[3]=0.4, hp=0.5")
- model.apply()
+ model._apply()
dummy_names = model._registry.dummy_names
assert "cyl[2]" in dummy_names
assert "cyl[3]" in dummy_names
@@ -640,7 +640,7 @@ def test_custom_reference_via_data_types_tuple(self, cars_data):
model = MCPower("mpg = cyl + hp")
model.upload_data(_select(cars_data, ["cyl", "hp"]), data_types={"cyl": ("factor", 6)})
model.set_effects("cyl[4]=0.3, cyl[8]=0.4, hp=0.5")
- model.apply()
+ model._apply()
dummy_names = model._registry.dummy_names
assert "cyl[4]" in dummy_names
assert "cyl[8]" in dummy_names
@@ -650,7 +650,7 @@ def test_invalid_reference_level_raises(self, cars_data):
model = MCPower("mpg = cyl + hp")
with pytest.raises(ValueError, match="not found in"):
model.upload_data(_select(cars_data, ["cyl", "hp"]), data_types={"cyl": ("factor", 99)})
- model.apply()
+ model._apply()
def test_string_custom_reference(self, cars_data):
model = MCPower("mpg = origin + hp")
@@ -658,7 +658,7 @@ def test_string_custom_reference(self, cars_data):
_select(cars_data, ["origin", "hp"]), data_types={"origin": ("factor", "Japan")}
)
model.set_effects("origin[Europe]=0.3, origin[USA]=0.4, hp=0.5")
- model.apply()
+ model._apply()
dummy_names = model._registry.dummy_names
assert "origin[Europe]" in dummy_names
assert "origin[USA]" in dummy_names
@@ -737,7 +737,7 @@ def test_origin_as_factor(self, cars_data):
model = MCPower("mpg = origin + hp")
model.upload_data(_select(cars_data, ["origin", "hp"]))
model.set_effects("origin[Japan]=0.3, origin[USA]=0.5, hp=0.4")
- model.apply()
+ model._apply()
assert "origin" in model._registry.factor_names
assert "origin[Japan]" in model._registry.dummy_names
@@ -762,7 +762,7 @@ def test_origin_with_cyl_mixed(self, cars_data):
model = MCPower("mpg = origin + cyl")
model.upload_data(_select(cars_data, ["origin", "cyl"]))
model.set_effects("origin[Japan]=0.3, origin[USA]=0.5, cyl[6]=0.2, cyl[8]=0.4")
- model.apply()
+ model._apply()
assert "origin[Japan]" in model._registry.dummy_names
assert "cyl[6]" in model._registry.dummy_names
@@ -822,7 +822,7 @@ def test_dataframe_upload(self):
model = MCPower("mpg = hp + wt")
model.upload_data(df[["hp", "wt"]])
model.set_effects("hp=0.5, wt=0.3")
- model.apply()
+ model._apply()
hp_pred = model._registry.get_predictor("hp")
assert hp_pred.var_type == "uploaded_data"
@@ -833,7 +833,7 @@ def test_dataframe_with_string_index_column(self):
model = MCPower("mpg = hp + wt")
model.upload_data(df)
model.set_effects("hp=0.5, wt=0.3")
- model.apply()
+ model._apply()
hp_pred = model._registry.get_predictor("hp")
assert hp_pred.var_type == "uploaded_data"
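The `upload_data()` hunks above touch several options at once; a short sketch of the factor-reference variant they exercise (the inline data is a stand-in for the `cars_data` fixture, which is not shown in this diff; `preserve_correlation` and `preserve_factor_level_names` keep their defaults here, "strict" and True respectively per the tests):

```python
import numpy as np
from mcpower import MCPower

# Stand-in for the cars_data fixture (hp continuous, cyl a 3-level numeric factor).
data = {
    "hp":  np.random.normal(150.0, 30.0, 32),
    "cyl": np.array([4, 6, 8] * 10 + [4, 6]),
}

model = MCPower("mpg = cyl + hp")
# data_types can force a column's type; a ("factor", ref) tuple also picks
# the reference level, so effects are named against the remaining levels.
model.upload_data(data, data_types={"cyl": ("factor", 6)})
model.set_effects("cyl[4]=0.3, cyl[8]=0.4, hp=0.5")
model._apply()

assert "cyl[4]" in model._registry.dummy_names
assert "cyl[8]" in model._registry.dummy_names
```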
diff --git a/tests/mixed_models/test_cluster_validators.py b/tests/mixed_models/test_cluster_validators.py
index 0cca25e..27540d2 100644
--- a/tests/mixed_models/test_cluster_validators.py
+++ b/tests/mixed_models/test_cluster_validators.py
@@ -102,7 +102,7 @@ def test_sufficient_observations_per_cluster(self):
model.set_cluster("cluster", ICC=0.2, n_clusters=5)
model.set_effects("x=0.5")
model.set_simulations(10)
- model.apply()
+ model._apply()
# 50 / 5 = 10 (above warning band)
result = model.find_power(sample_size=50, return_results=True)
@@ -118,7 +118,7 @@ def test_insufficient_observations_per_cluster_rejected(self):
model.set_cluster("cluster", ICC=0.2, n_clusters=5)
model.set_effects("x=0.5")
model.set_simulations(10)
- model.apply()
+ model._apply()
# 20 / 5 = 4 (below minimum)
with pytest.raises(ValueError, match="Insufficient observations per cluster"):
@@ -134,7 +134,7 @@ def test_validation_message_suggestions(self):
model.set_cluster("cluster", ICC=0.2, n_clusters=10)
model.set_effects("x=0.5")
model.set_simulations(10)
- model.apply()
+ model._apply()
with pytest.raises(ValueError) as exc_info:
model.find_power(sample_size=30) # 30/10 = 3 < 5
@@ -155,7 +155,7 @@ def test_valid_config_runs_successfully(self):
model.set_cluster("cluster", ICC=0.2, n_clusters=5)
model.set_effects("x=0.5")
model.set_simulations(10)
- model.apply()
+ model._apply()
result = model.find_power(sample_size=50, return_results=True) # 10 per cluster
@@ -170,7 +170,7 @@ def test_edge_case_exactly_5_per_cluster(self):
model.set_effects("x=0.5")
model.set_simulations(10)
model.set_max_failed_simulations(0.30) # Allow more failures at edge
- model.apply()
+ model._apply()
result = model.find_power(sample_size=20, return_results=True) # 20/4 = 5
@@ -182,7 +182,7 @@ def test_icc_zero_no_convergence_issues(self):
model.set_cluster("cluster", ICC=0.0, n_clusters=5)
model.set_effects("x=0.5")
model.set_simulations(20)
- model.apply()
+ model._apply()
result = model.find_power(sample_size=250, return_results=True)
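The cluster-validator hunks above only show the configuration calls; a sketch of the full shape, assuming an intercept-only random effect formula like the other mixed-model tests in this diff (the `MCPower(...)` line itself sits above these hunks and is an assumption here), with the 5-observations-per-cluster floor the tests assert:

```python
from mcpower import MCPower

# Assumed formula; the diff hunks begin below the MCPower(...) call.
model = MCPower("y = x + (1|cluster)")
model.set_cluster("cluster", ICC=0.2, n_clusters=5)
model.set_effects("x=0.5")
model.set_simulations(10)

# 50 / 5 clusters = 10 observations per cluster: accepted.
result = model.find_power(sample_size=50, print_results=False, return_results=True)

# 20 / 5 = 4 per cluster falls below the minimum and, per the tests,
# raises ValueError("Insufficient observations per cluster ...").
```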
diff --git a/tests/mixed_models/test_integration_phase2.py b/tests/mixed_models/test_integration_phase2.py
index 42d8621..0d04919 100644
--- a/tests/mixed_models/test_integration_phase2.py
+++ b/tests/mixed_models/test_integration_phase2.py
@@ -23,7 +23,7 @@ def test_slope_model_setup(self):
slope_intercept_corr=0.3,
)
model.set_effects("x1=0.5")
- model.apply()
+ model._apply()
# Verify cluster spec was configured correctly
spec = model._registry._cluster_specs["school"]
@@ -106,7 +106,7 @@ def test_nested_model_setup(self):
model.set_cluster("school", ICC=0.15, n_clusters=10)
model.set_cluster("classroom", ICC=0.10, n_per_parent=3)
model.set_effects("treatment=0.5")
- model.apply()
+ model._apply()
assert "school" in model._registry._cluster_specs
assert "school:classroom" in model._registry._cluster_specs
diff --git a/tests/mixed_models/test_mixed_models.py b/tests/mixed_models/test_mixed_models.py
index de4b422..e0a867c 100644
--- a/tests/mixed_models/test_mixed_models.py
+++ b/tests/mixed_models/test_mixed_models.py
@@ -405,7 +405,6 @@ def test_unknown_backend_raises(self):
np.zeros(10),
np.array([0]),
np.zeros(10, dtype=int),
- [],
0,
0.05,
backend="nonexistent",
diff --git a/tests/mixed_models/test_mixed_models_validation.py b/tests/mixed_models/test_mixed_models_validation.py
index fbe2ecf..5502bfe 100644
--- a/tests/mixed_models/test_mixed_models_validation.py
+++ b/tests/mixed_models/test_mixed_models_validation.py
@@ -90,7 +90,7 @@ def test_icc_recovery_medium(self):
from mcpower.stats.data_generation import _generate_cluster_effects
- sample_size = 500
+ sample_size = 1000
n_clusters = 20
icc_target = ICC_MODERATE_HIGH
@@ -270,7 +270,7 @@ def test_diagnostics_available(self):
y=y,
target_indices=np.array([0]),
cluster_ids=cluster_ids,
- cluster_column_indices=[],
+
correction_method=0,
alpha=0.05,
backend="statsmodels",
diff --git a/tests/mixed_models/test_scenarios_lme.py b/tests/mixed_models/test_scenarios_lme.py
index f3d5ad6..41597d3 100644
--- a/tests/mixed_models/test_scenarios_lme.py
+++ b/tests/mixed_models/test_scenarios_lme.py
@@ -14,7 +14,6 @@
from mcpower.core.scenarios import (
DEFAULT_SCENARIO_CONFIG,
apply_lme_perturbations,
- apply_lme_residual_perturbations,
)
from mcpower.stats.data_generation import (
_generate_cluster_effects,
@@ -35,7 +34,7 @@ class TestDefaultConfig:
"random_effect_dist",
"random_effect_df",
"icc_noise_sd",
- "residual_dist",
+ "residual_dists",
"residual_change_prob",
"residual_df",
]
@@ -49,22 +48,37 @@ def test_doomer_has_lme_keys(self):
assert key in DEFAULT_SCENARIO_CONFIG["doomer"], f"Missing key: {key}"
def test_realistic_values(self):
+ """Realistic scenario has non-zero LME perturbation values."""
cfg = DEFAULT_SCENARIO_CONFIG["realistic"]
assert cfg["random_effect_dist"] == "heavy_tailed"
- assert cfg["random_effect_df"] == 5
- assert cfg["icc_noise_sd"] == 0.15
- assert cfg["residual_dist"] == "heavy_tailed"
- assert cfg["residual_change_prob"] == 0.3
- assert cfg["residual_df"] == 10
+ assert cfg["random_effect_df"] > 0
+ assert cfg["icc_noise_sd"] > 0
+ assert cfg["residual_dists"] == ["heavy_tailed", "skewed"]
+ assert cfg["residual_change_prob"] > 0
+ assert cfg["residual_df"] > 2
def test_doomer_values(self):
- cfg = DEFAULT_SCENARIO_CONFIG["doomer"]
- assert cfg["random_effect_dist"] == "heavy_tailed"
- assert cfg["random_effect_df"] == 3
- assert cfg["icc_noise_sd"] == 0.30
- assert cfg["residual_dist"] == "heavy_tailed"
- assert cfg["residual_change_prob"] == 0.8
- assert cfg["residual_df"] == 5
+ """Doomer scenario has more severe perturbation than realistic."""
+ real = DEFAULT_SCENARIO_CONFIG["realistic"]
+ doom = DEFAULT_SCENARIO_CONFIG["doomer"]
+ assert doom["random_effect_dist"] == "heavy_tailed"
+ assert doom["random_effect_df"] <= real["random_effect_df"]
+ assert doom["icc_noise_sd"] >= real["icc_noise_sd"]
+ assert doom["residual_dists"] == ["heavy_tailed", "skewed"]
+ assert doom["residual_change_prob"] >= real["residual_change_prob"]
+ assert doom["residual_df"] <= real["residual_df"]
+
+ def test_optimistic_has_lme_keys(self):
+ for key in self.LME_KEYS:
+ assert key in DEFAULT_SCENARIO_CONFIG["optimistic"], f"Missing key: {key}"
+
+ def test_optimistic_values_are_zero(self):
+ cfg = DEFAULT_SCENARIO_CONFIG["optimistic"]
+ assert cfg["heterogeneity"] == 0.0
+ assert cfg["heteroskedasticity"] == 0.0
+ assert cfg["residual_change_prob"] == 0.0
+ assert cfg["icc_noise_sd"] == 0.0
+ assert cfg["random_effect_dist"] == "normal"
# ---------------------------------------------------------------------------
@@ -309,73 +323,3 @@ def test_slopes_without_perturbations(self):
assert result.intercept_columns.shape == (1000, 1)
-# ---------------------------------------------------------------------------
-# apply_lme_residual_perturbations
-# ---------------------------------------------------------------------------
-class TestApplyLmeResidualPerturbations:
- """Test apply_lme_residual_perturbations() function."""
-
- def _make_y(self, seed=42):
- """Generate a deterministic y vector with known errors."""
- rng = np.random.RandomState(seed + 2)
- return rng.standard_normal(500)
-
- def test_normal_dist_returns_unchanged(self):
- y = self._make_y()
- config = {"residual_dist": "normal", "residual_change_prob": 1.0, "residual_df": 5}
- result = apply_lme_residual_perturbations(y.copy(), config, 42)
- np.testing.assert_array_equal(result, y)
-
- def test_zero_prob_returns_unchanged(self):
- y = self._make_y()
- config = {"residual_dist": "heavy_tailed", "residual_change_prob": 0.0, "residual_df": 5}
- result = apply_lme_residual_perturbations(y.copy(), config, 42)
- np.testing.assert_array_equal(result, y)
-
- def test_prob_1_always_applies(self):
- y = self._make_y()
- config = {"residual_dist": "heavy_tailed", "residual_change_prob": 1.0, "residual_df": 5}
- result = apply_lme_residual_perturbations(y.copy(), config, 42)
- # Should be different from original
- assert not np.array_equal(result, y)
-
- def test_heavy_tailed_residuals_have_excess_kurtosis(self):
- """When residuals are replaced with t(5), the diff should have heavy tails."""
- y_orig = self._make_y()
- config = {"residual_dist": "heavy_tailed", "residual_change_prob": 1.0, "residual_df": 5}
- y_perturbed = apply_lme_residual_perturbations(y_orig.copy(), config, 42)
- diff = y_perturbed - y_orig
- # The diff = new_errors - original_errors. Both have finite variance,
- # but the new_errors are t(5) which has excess kurtosis.
- # For large enough N, the kurtosis of the difference should be positive.
- sp_stats.kurtosis(diff + y_orig, fisher=True)
- # Just check it ran without error and output differs
- assert not np.array_equal(y_perturbed, y_orig)
-
- def test_skewed_residuals_applied(self):
- y_orig = self._make_y()
- config = {"residual_dist": "skewed", "residual_change_prob": 1.0, "residual_df": 5}
- y_perturbed = apply_lme_residual_perturbations(y_orig.copy(), config, 42)
- assert not np.array_equal(y_perturbed, y_orig)
-
- def test_coin_flip_seed_reproducible(self):
- y = self._make_y()
- config = {"residual_dist": "heavy_tailed", "residual_change_prob": 0.5, "residual_df": 5}
- r1 = apply_lme_residual_perturbations(y.copy(), config, 42)
- r2 = apply_lme_residual_perturbations(y.copy(), config, 42)
- np.testing.assert_array_equal(r1, r2)
-
- def test_coin_flip_prob_respected(self):
- """With prob=0.3, roughly 30% of simulations should be perturbed."""
- config = {"residual_dist": "heavy_tailed", "residual_change_prob": 0.3, "residual_df": 5}
- n_perturbed = 0
- n_trials = 200
- y_template = np.ones(100)
- for i in range(n_trials):
- y = y_template.copy()
- result = apply_lme_residual_perturbations(y, config, i * 100)
- if not np.array_equal(result, y_template):
- n_perturbed += 1
- # Should be roughly 30% ± some tolerance
- pct = n_perturbed / n_trials
- assert 0.10 < pct < 0.55, f"Expected ~30% perturbed, got {pct:.1%}"
diff --git a/tests/specs/test_alpha_levels.py b/tests/specs/test_alpha_levels.py
index 529e9db..c1f0908 100644
--- a/tests/specs/test_alpha_levels.py
+++ b/tests/specs/test_alpha_levels.py
@@ -1,9 +1,8 @@
"""
-Non-default alpha level tests — backend-agnostic.
+Non-default alpha level tests.
Validates that the full alpha pipeline (power accuracy, corrections,
null calibration) works correctly at alpha != 0.05.
-Tests run on ALL available backends via the backend fixture.
"""
import contextlib
@@ -12,7 +11,7 @@
import numpy as np
import pytest
-from tests.config import N_SIMS, SEED
+from tests.config import N_SIMS, N_SIMS_ORDERING, N_SIMS_STANDARD, SEED
from tests.helpers.analytical import analytical_f_power, analytical_t_power
from tests.helpers.mc_margins import mc_accuracy_margin, mc_margin
from tests.helpers.power_helpers import get_power, get_power_corrected, make_null_model
@@ -44,7 +43,7 @@ class TestAlphaAccuracyVsAnalytical:
(0.5, 100),
],
)
- def test_single_predictor_t_test_alpha(self, backend, alpha, beta, n):
+ def test_single_predictor_t_test_alpha(self, alpha, beta, n):
"""t-test power matches analytical non-central t at non-default alpha."""
from mcpower import MCPower
@@ -63,7 +62,7 @@ def test_single_predictor_t_test_alpha(self, backend, alpha, beta, n):
exact_power = analytical_t_power(beta, n, p=1, sigma_eps=1.0, vif_j=1.0, alpha=alpha)
margin = mc_accuracy_margin(exact_power, N_SIMS)
assert abs(mc_power - exact_power) < margin, (
- f"[{backend}] alpha={alpha}, β={beta}, n={n}: MC={mc_power:.2f}%, analytical={exact_power:.2f}% ± {margin:.2f}%"
+ f"alpha={alpha}, β={beta}, n={n}: MC={mc_power:.2f}%, analytical={exact_power:.2f}% ± {margin:.2f}%"
)
@pytest.mark.parametrize("alpha", [0.01, 0.10])
@@ -74,7 +73,7 @@ def test_single_predictor_t_test_alpha(self, backend, alpha, beta, n):
(0.5, 0.3, 80),
],
)
- def test_two_predictors_uncorrelated_alpha(self, backend, alpha, b1, b2, n):
+ def test_two_predictors_uncorrelated_alpha(self, alpha, b1, b2, n):
"""Each t-test and F-test with Σ = I at non-default alpha."""
from mcpower import MCPower
@@ -103,14 +102,14 @@ def test_two_predictors_uncorrelated_alpha(self, backend, alpha, b1, b2, n):
)
margin = mc_accuracy_margin(exact, N_SIMS)
assert abs(mc_power - exact) < margin, (
- f"[{backend}] alpha={alpha}, {var}: MC={mc_power:.2f}%, analytical={exact:.2f}% ± {margin:.2f}%"
+ f"alpha={alpha}, {var}: MC={mc_power:.2f}%, analytical={exact:.2f}% ± {margin:.2f}%"
)
mc_f = get_power(result, "overall")
exact_f = analytical_f_power([b1, b2], n, Sigma, sigma_eps=1.0, alpha=alpha)
margin_f = mc_accuracy_margin(exact_f, N_SIMS)
assert abs(mc_f - exact_f) < margin_f, (
- f"[{backend}] alpha={alpha}, F-test: MC={mc_f:.2f}%, analytical={exact_f:.2f}% ± {margin_f:.2f}%"
+ f"alpha={alpha}, F-test: MC={mc_f:.2f}%, analytical={exact_f:.2f}% ± {margin_f:.2f}%"
)
@pytest.mark.parametrize("alpha", [0.01, 0.10])
@@ -121,7 +120,7 @@ def test_two_predictors_uncorrelated_alpha(self, backend, alpha, b1, b2, n):
(0.5, 0.3, 0.5, 80),
],
)
- def test_two_predictors_correlated_alpha(self, backend, alpha, b1, b2, rho, n):
+ def test_two_predictors_correlated_alpha(self, alpha, b1, b2, rho, n):
"""VIF-corrected t-tests with correlated predictors at non-default alpha."""
from mcpower import MCPower
@@ -154,7 +153,7 @@ def test_two_predictors_correlated_alpha(self, backend, alpha, b1, b2, rho, n):
)
margin = mc_accuracy_margin(exact, N_SIMS)
assert abs(mc_power - exact) < margin, (
- f"[{backend}] alpha={alpha}, rho={rho}, {var}: MC={mc_power:.2f}%, analytical={exact:.2f}% ± {margin:.2f}%"
+ f"alpha={alpha}, rho={rho}, {var}: MC={mc_power:.2f}%, analytical={exact:.2f}% ± {margin:.2f}%"
)
@@ -170,9 +169,9 @@ class TestAlphaCorrectionAccuracy:
@pytest.mark.parametrize("alpha", [0.01, 0.10])
@pytest.mark.parametrize("correction", ["bonferroni", "holm", "fdr"])
- def test_corrected_leq_uncorrected_at_alpha(self, backend, alpha, correction):
+ def test_corrected_leq_uncorrected_at_alpha(self, alpha, correction):
"""Corrected power <= uncorrected power when all effects = 0."""
- m = make_null_model("y = x1 + x2 + x3", n_sims=N_SIMS, alpha=alpha, seed=SEED)
+ m = make_null_model("y = x1 + x2 + x3", n_sims=N_SIMS_ORDERING, alpha=alpha, seed=SEED)
result = m.find_power(
sample_size=100,
target_test="x1, x2, x3",
@@ -184,14 +183,14 @@ def test_corrected_leq_uncorrected_at_alpha(self, backend, alpha, correction):
uncorr = get_power(result, var)
corr = get_power_corrected(result, var)
assert corr <= uncorr + 0.5, (
- f"[{backend}] alpha={alpha}, {correction}: corrected {corr:.2f}% > uncorrected {uncorr:.2f}% for {var}"
+ f"alpha={alpha}, {correction}: corrected {corr:.2f}% > uncorrected {uncorr:.2f}% for {var}"
)
@pytest.mark.parametrize("alpha", [0.01, 0.10])
@pytest.mark.parametrize("correction", ["bonferroni", "holm"])
- def test_fwer_controlled_at_alpha(self, backend, alpha, correction):
+ def test_fwer_controlled_at_alpha(self, alpha, correction):
"""FWER-controlling methods keep per-test rejection below nominal alpha."""
- m = make_null_model("y = x1 + x2 + x3", n_sims=N_SIMS, alpha=alpha, seed=SEED)
+ m = make_null_model("y = x1 + x2 + x3", n_sims=N_SIMS_ORDERING, alpha=alpha, seed=SEED)
result = m.find_power(
sample_size=100,
target_test="x1, x2, x3",
@@ -201,17 +200,17 @@ def test_fwer_controlled_at_alpha(self, backend, alpha, correction):
)
for var in ["x1", "x2", "x3"]:
corr = get_power_corrected(result, var)
- assert corr < alpha * 100 + mc_margin(alpha, N_SIMS), (
- f"[{backend}] alpha={alpha}, {correction} FWER violation for {var}: corrected power = {corr:.2f}%"
+ assert corr < alpha * 100 + mc_margin(alpha, N_SIMS_ORDERING), (
+ f"alpha={alpha}, {correction} FWER violation for {var}: corrected power = {corr:.2f}%"
)
@pytest.mark.parametrize("alpha", [0.01, 0.10])
- def test_bonferroni_more_conservative_than_fdr_at_alpha(self, backend, alpha):
+ def test_bonferroni_more_conservative_than_fdr_at_alpha(self, alpha):
"""Bonferroni should reject <= FDR (BH) under non-null at non-default alpha."""
from mcpower import MCPower
m = MCPower("y = x1 + x2 + x3")
- m.set_simulations(N_SIMS)
+ m.set_simulations(N_SIMS_ORDERING)
m.set_seed(SEED)
m.set_alpha(alpha)
m.set_effects("x1=0.3, x2=0.2, x3=0.1")
@@ -233,7 +232,7 @@ def test_bonferroni_more_conservative_than_fdr_at_alpha(self, backend, alpha):
for var in ["x1", "x2", "x3"]:
bonf = get_power_corrected(result_bonf, var)
fdr = get_power_corrected(result_fdr, var)
- assert bonf <= fdr + 2.0, f"[{backend}] alpha={alpha}: Bonferroni ({bonf:.2f}%) > FDR ({fdr:.2f}%) for {var}"
+ assert bonf <= fdr + 2.0, f"alpha={alpha}: Bonferroni ({bonf:.2f}%) > FDR ({fdr:.2f}%) for {var}"
# ── Class 3: Null calibration at alpha != 0.05 (multi-predictor) ────
@@ -245,29 +244,29 @@ class TestAlphaCalibrationExtended:
to multi-predictor models and corrected rejection under the null.
"""
- @pytest.mark.parametrize("alpha", [0.01, 0.05, 0.10])
- def test_null_rejection_multi_predictor(self, backend, alpha):
+ @pytest.mark.parametrize("alpha", [0.01, 0.10])
+ def test_null_rejection_multi_predictor(self, alpha):
"""Two-predictor null: each t-test and overall F-test reject at ~alpha."""
- m = make_null_model("y = x1 + x2", n_sims=N_SIMS, alpha=alpha, seed=SEED)
+ m = make_null_model("y = x1 + x2", n_sims=N_SIMS_STANDARD, alpha=alpha, seed=SEED)
result = m.find_power(
sample_size=100,
target_test="all",
print_results=False,
return_results=True,
)
- margin = mc_margin(alpha, N_SIMS)
+ margin = mc_margin(alpha, N_SIMS_STANDARD)
expected = alpha * 100
for test_name in ["x1", "x2", "overall"]:
power = get_power(result, test_name)
assert abs(power - expected) < margin, (
- f"[{backend}] alpha={alpha}, {test_name}: observed {power:.2f}%, expected {expected}% ± {margin:.2f}%"
+ f"alpha={alpha}, {test_name}: observed {power:.2f}%, expected {expected}% ± {margin:.2f}%"
)
- @pytest.mark.parametrize("alpha", [0.01, 0.05, 0.10])
+ @pytest.mark.parametrize("alpha", [0.01, 0.10])
@pytest.mark.parametrize("correction", ["bonferroni", "holm"])
- def test_null_rejection_corrected_at_alpha(self, backend, alpha, correction):
+ def test_null_rejection_corrected_at_alpha(self, alpha, correction):
"""Corrected null rejection stays below alpha + MC margin for 3 predictors."""
- m = make_null_model("y = x1 + x2 + x3", n_sims=N_SIMS, alpha=alpha, seed=SEED)
+ m = make_null_model("y = x1 + x2 + x3", n_sims=N_SIMS_STANDARD, alpha=alpha, seed=SEED)
result = m.find_power(
sample_size=100,
target_test="x1, x2, x3",
@@ -275,9 +274,9 @@ def test_null_rejection_corrected_at_alpha(self, backend, alpha, correction):
print_results=False,
return_results=True,
)
- margin = mc_margin(alpha, N_SIMS)
+ margin = mc_margin(alpha, N_SIMS_STANDARD)
for var in ["x1", "x2", "x3"]:
corr = get_power_corrected(result, var)
assert corr < alpha * 100 + margin, (
- f"[{backend}] alpha={alpha}, {correction}, {var}: corrected rejection {corr:.2f}% exceeds {alpha * 100}% + {margin:.2f}%"
+ f"alpha={alpha}, {correction}, {var}: corrected rejection {corr:.2f}% exceeds {alpha * 100}% + {margin:.2f}%"
)
diff --git a/tests/specs/test_corrections.py b/tests/specs/test_corrections.py
index b26b8e1..025aee7 100644
--- a/tests/specs/test_corrections.py
+++ b/tests/specs/test_corrections.py
@@ -1,7 +1,5 @@
"""
-Multiple comparison correction tests — backend-agnostic.
-
-Tests run on ALL available backends via the backend fixture.
+Multiple comparison correction tests.
"""
import contextlib
@@ -9,7 +7,7 @@
import pytest
-from tests.config import N_SIMS, SEED
+from tests.config import N_SIMS_ORDERING as N_SIMS, SEED
from tests.helpers.mc_margins import mc_margin
from tests.helpers.power_helpers import get_power, get_power_corrected, make_null_model
@@ -28,7 +26,7 @@ class TestCorrectionConservativeness:
"""
@pytest.mark.parametrize("correction", ["bonferroni", "holm", "fdr"])
- def test_corrected_leq_uncorrected_under_null(self, backend, correction):
+ def test_corrected_leq_uncorrected_under_null(self, correction):
"""Corrected power ≤ uncorrected power when all effects = 0."""
m = make_null_model("y = x1 + x2 + x3", n_sims=N_SIMS, seed=SEED)
result = m.find_power(
@@ -42,11 +40,11 @@ def test_corrected_leq_uncorrected_under_null(self, backend, correction):
uncorr = get_power(result, var)
corr = get_power_corrected(result, var)
assert corr <= uncorr + 0.5, ( # tiny tolerance for MC noise
- f"[{backend}] {correction}: corrected {corr:.2f}% > uncorrected {uncorr:.2f}% for {var}"
+ f"{correction}: corrected {corr:.2f}% > uncorrected {uncorr:.2f}% for {var}"
)
@pytest.mark.parametrize("correction", ["bonferroni", "holm"])
- def test_fwer_controlled_under_null(self, backend, correction):
+ def test_fwer_controlled_under_null(self, correction):
"""
Family-wise error rate under H0 should be ≤ alpha.
@@ -65,10 +63,10 @@ def test_fwer_controlled_under_null(self, backend, correction):
# Under complete null, FWER-controlling methods should have
# per-test rejection well below the nominal alpha
assert corr < m.alpha * 100 + mc_margin(m.alpha, m.n_simulations), (
- f"[{backend}] {correction} FWER violation for {var}: corrected power = {corr:.2f}%"
+ f"{correction} FWER violation for {var}: corrected power = {corr:.2f}%"
)
- def test_bonferroni_more_conservative_than_fdr(self, backend):
+ def test_bonferroni_more_conservative_than_fdr(self):
"""Bonferroni should reject ≤ FDR (BH) under non-null."""
from mcpower import MCPower
@@ -95,4 +93,4 @@ def test_bonferroni_more_conservative_than_fdr(self, backend):
bonf = get_power_corrected(result_bonf, var)
fdr = get_power_corrected(result_fdr, var)
# Bonferroni ≤ BH-FDR (with MC tolerance)
- assert bonf <= fdr + 2.0, f"[{backend}] Bonferroni ({bonf:.2f}%) > FDR ({fdr:.2f}%) for {var}"
+ assert bonf <= fdr + 2.0, f"Bonferroni ({bonf:.2f}%) > FDR ({fdr:.2f}%) for {var}"
diff --git a/tests/specs/test_monotonicity.py b/tests/specs/test_monotonicity.py
index 4261b99..bac2fe9 100644
--- a/tests/specs/test_monotonicity.py
+++ b/tests/specs/test_monotonicity.py
@@ -1,8 +1,7 @@
"""
-Power monotonicity tests — backend-agnostic.
+Power monotonicity tests.
Power must increase with effect size, sample size, and alpha.
-Tests run on ALL available backends via the backend fixture.
"""
import contextlib
@@ -10,7 +9,7 @@
import pytest
-from tests.config import N_SIMS, SEED
+from tests.config import N_SIMS_ORDERING as N_SIMS, SEED
from tests.helpers.power_helpers import get_power
@@ -24,7 +23,7 @@ def _quiet():
class TestPowerMonotonicity:
"""Power must increase with effect size, sample size, and alpha."""
- def test_power_increases_with_effect_size(self, backend):
+ def test_power_increases_with_effect_size(self):
"""Larger standardised beta → higher power."""
from mcpower import MCPower
@@ -43,9 +42,9 @@ def test_power_increases_with_effect_size(self, backend):
powers.append(get_power(result, "x1"))
for i in range(len(powers) - 1):
- assert powers[i] < powers[i + 1], f"[{backend}] Power not monotonic in effect size: {powers}"
+ assert powers[i] < powers[i + 1], f"Power not monotonic in effect size: {powers}"
- def test_power_increases_with_sample_size(self, backend):
+ def test_power_increases_with_sample_size(self):
"""Larger N → higher power (for non-zero effect)."""
from mcpower import MCPower
@@ -64,9 +63,9 @@ def test_power_increases_with_sample_size(self, backend):
powers.append(get_power(result, "x1"))
for i in range(len(powers) - 1):
- assert powers[i] < powers[i + 1], f"[{backend}] Power not monotonic in N: {powers}"
+ assert powers[i] < powers[i + 1], f"Power not monotonic in N: {powers}"
- def test_power_increases_with_alpha(self, backend):
+ def test_power_increases_with_alpha(self):
"""Less stringent alpha → higher power."""
from mcpower import MCPower
@@ -86,13 +85,13 @@ def test_power_increases_with_alpha(self, backend):
powers.append(get_power(result, "x1"))
for i in range(len(powers) - 1):
- assert powers[i] < powers[i + 1], f"[{backend}] Power not monotonic in alpha: {powers}"
+ assert powers[i] < powers[i + 1], f"Power not monotonic in alpha: {powers}"
class TestPowerConvergence:
"""Power must approach 100% when signal is overwhelming."""
- def test_large_effect_high_power(self, backend):
+ def test_large_effect_high_power(self):
"""Very large effect → power near 100%."""
from mcpower import MCPower
@@ -107,9 +106,9 @@ def test_large_effect_high_power(self, backend):
return_results=True,
)
power = get_power(result, "x1")
- assert power > 99.0, f"[{backend}] Large-effect power should be ~100%, got {power:.2f}%"
+ assert power > 99.0, f"Large-effect power should be ~100%, got {power:.2f}%"
- def test_large_n_moderate_effect(self, backend):
+ def test_large_n_moderate_effect(self):
"""Large N with moderate effect → power near 100%."""
from mcpower import MCPower
@@ -124,4 +123,4 @@ def test_large_n_moderate_effect(self, backend):
return_results=True,
)
power = get_power(result, "x1")
- assert power > 99.0, f"[{backend}] Large-N power should be ~100%, got {power:.2f}%"
+ assert power > 99.0, f"Large-N power should be ~100%, got {power:.2f}%"
diff --git a/tests/specs/test_power_accuracy.py b/tests/specs/test_power_accuracy.py
index 755d0bb..fff7235 100644
--- a/tests/specs/test_power_accuracy.py
+++ b/tests/specs/test_power_accuracy.py
@@ -1,9 +1,8 @@
"""
-Power accuracy tests — backend-agnostic.
+Power accuracy tests.
Compare MC power estimates against exact analytical power from
non-central t / F distributions.
-Tests run on ALL available backends via the backend fixture.
"""
import contextlib
@@ -42,7 +41,7 @@ class TestAccuracyVsAnalytical:
(0.5, 150),
],
)
- def test_single_predictor_t_test(self, backend, beta, n):
+ def test_single_predictor_t_test(self, beta, n):
"""t-test power matches analytical non-central t."""
from mcpower import MCPower
@@ -60,7 +59,7 @@ def test_single_predictor_t_test(self, backend, beta, n):
exact_power = analytical_t_power(beta, n, p=1, sigma_eps=1.0, vif_j=1.0)
margin = mc_accuracy_margin(exact_power, N_SIMS)
assert abs(mc_power - exact_power) < margin, (
- f"[{backend}] β={beta}, n={n}: MC={mc_power:.2f}%, analytical={exact_power:.2f}% ± {margin:.2f}%"
+ f"β={beta}, n={n}: MC={mc_power:.2f}%, analytical={exact_power:.2f}% ± {margin:.2f}%"
)
@pytest.mark.parametrize(
@@ -71,7 +70,7 @@ def test_single_predictor_t_test(self, backend, beta, n):
(0.2, 0.2, 200),
],
)
- def test_two_predictors_uncorrelated(self, backend, b1, b2, n):
+ def test_two_predictors_uncorrelated(self, b1, b2, n):
"""Each t-test and F-test with Σ = I."""
from mcpower import MCPower
@@ -91,12 +90,12 @@ def test_two_predictors_uncorrelated(self, backend, b1, b2, n):
mc_power = get_power(result, var)
exact = analytical_t_power(beta, n, p=2, sigma_eps=1.0, vif_j=1.0)
margin = mc_accuracy_margin(exact, N_SIMS)
- assert abs(mc_power - exact) < margin, f"[{backend}] {var}: MC={mc_power:.2f}%, analytical={exact:.2f}% ± {margin:.2f}%"
+ assert abs(mc_power - exact) < margin, f"{var}: MC={mc_power:.2f}%, analytical={exact:.2f}% ± {margin:.2f}%"
mc_f = get_power(result, "overall")
exact_f = analytical_f_power([b1, b2], n, Sigma, sigma_eps=1.0)
margin_f = mc_accuracy_margin(exact_f, N_SIMS)
- assert abs(mc_f - exact_f) < margin_f, f"[{backend}] F-test: MC={mc_f:.2f}%, analytical={exact_f:.2f}% ± {margin_f:.2f}%"
+ assert abs(mc_f - exact_f) < margin_f, f"F-test: MC={mc_f:.2f}%, analytical={exact_f:.2f}% ± {margin_f:.2f}%"
@pytest.mark.parametrize(
"b1,b2,rho,n",
@@ -107,7 +106,7 @@ def test_two_predictors_uncorrelated(self, backend, b1, b2, n):
(0.5, 0.3, 0.5, 80),
],
)
- def test_two_predictors_correlated_t_tests(self, backend, b1, b2, rho, n):
+ def test_two_predictors_correlated_t_tests(self, b1, b2, rho, n):
"""Individual t-tests with correlated predictors: VIF matters."""
from mcpower import MCPower
@@ -132,5 +131,5 @@ def test_two_predictors_correlated_t_tests(self, backend, b1, b2, rho, n):
exact = analytical_t_power(beta, n, p=2, sigma_eps=1.0, vif_j=vif)
margin = mc_accuracy_margin(exact, N_SIMS)
assert abs(mc_power - exact) < margin, (
- f"[{backend}] rho={rho}, {var}: MC={mc_power:.2f}%, analytical={exact:.2f}% ± {margin:.2f}%"
+ f"rho={rho}, {var}: MC={mc_power:.2f}%, analytical={exact:.2f}% ± {margin:.2f}%"
)
diff --git a/tests/specs/test_type1_error.py b/tests/specs/test_type1_error.py
index c56b4fe..565d80e 100644
--- a/tests/specs/test_type1_error.py
+++ b/tests/specs/test_type1_error.py
@@ -1,8 +1,7 @@
"""
-Type I error control tests — backend-agnostic.
+Type I error control tests.
Under H0 (effect = 0), rejection rate must equal alpha.
-Tests run on ALL available backends via the backend fixture.
"""
import contextlib
@@ -10,7 +9,7 @@
import pytest
-from tests.config import N_SIMS, SEED
+from tests.config import N_SIMS_STANDARD as N_SIMS, SEED
from tests.helpers.mc_margins import mc_margin
from tests.helpers.power_helpers import get_power, make_null_model
@@ -25,7 +24,7 @@ def _quiet():
class TestTypeIErrorControl:
"""Under H0 (effect = 0), rejection rate must equal alpha."""
- def test_single_predictor_null_overall(self, backend):
+ def test_single_predictor_null_overall(self):
"""F-test rejection rate ≈ alpha with one predictor at zero effect."""
m = make_null_model("y = x1", n_sims=N_SIMS, seed=SEED)
result = m.find_power(
@@ -37,9 +36,9 @@ def test_single_predictor_null_overall(self, backend):
power = get_power(result, "overall")
margin = mc_margin(m.alpha, m.n_simulations)
expected = m.alpha * 100
- assert abs(power - expected) < margin, f"[{backend}] F-test power under H0: {power:.2f}%, expected {expected}% ± {margin:.2f}%"
+ assert abs(power - expected) < margin, f"F-test power under H0: {power:.2f}%, expected {expected}% ± {margin:.2f}%"
- def test_single_predictor_null_individual(self, backend):
+ def test_single_predictor_null_individual(self):
"""t-test rejection rate ≈ alpha for a single zero-effect predictor."""
m = make_null_model("y = x1", n_sims=N_SIMS, seed=SEED)
result = m.find_power(
@@ -51,9 +50,9 @@ def test_single_predictor_null_individual(self, backend):
power = get_power(result, "x1")
margin = mc_margin(m.alpha, m.n_simulations)
expected = m.alpha * 100
- assert abs(power - expected) < margin, f"[{backend}] t-test power under H0: {power:.2f}%, expected {expected}% ± {margin:.2f}%"
+ assert abs(power - expected) < margin, f"t-test power under H0: {power:.2f}%, expected {expected}% ± {margin:.2f}%"
- def test_two_predictors_null_each(self, backend):
+ def test_two_predictors_null_each(self):
"""Both predictors at zero → each t-test rejects at ~alpha."""
m = make_null_model("y = x1 + x2", n_sims=N_SIMS, seed=SEED)
result = m.find_power(
@@ -66,9 +65,9 @@ def test_two_predictors_null_each(self, backend):
expected = m.alpha * 100
for var in ["x1", "x2"]:
power = get_power(result, var)
- assert abs(power - expected) < margin, f"[{backend}] {var} power under H0: {power:.2f}%, expected {expected}% ± {margin:.2f}%"
+ assert abs(power - expected) < margin, f"{var} power under H0: {power:.2f}%, expected {expected}% ± {margin:.2f}%"
- def test_large_sample_null(self, backend):
+ def test_large_sample_null(self):
"""
Large N with zero effect must NOT inflate Type I error.
@@ -76,7 +75,7 @@ def test_large_sample_null(self, backend):
"""
m = make_null_model("y = x1", n_sims=N_SIMS, seed=SEED)
result = m.find_power(
- sample_size=1000,
+ sample_size=500,
target_test="x1",
print_results=False,
return_results=True,
@@ -85,7 +84,7 @@ def test_large_sample_null(self, backend):
margin = mc_margin(m.alpha, m.n_simulations)
expected = m.alpha * 100
assert abs(power - expected) < margin, (
- f"[{backend}] Large-N null power: {power:.2f}%, expected {expected}% ± {margin:.2f}% (Type I error inflated with N?)"
+ f"Large-N null power: {power:.2f}%, expected {expected}% ± {margin:.2f}% (Type I error inflated with N?)"
)
@@ -93,7 +92,7 @@ class TestAlphaCalibration:
"""Rejection rate tracks the nominal alpha across levels."""
@pytest.mark.parametrize("alpha", [0.01, 0.05, 0.10])
- def test_null_rejection_matches_alpha(self, backend, alpha):
+ def test_null_rejection_matches_alpha(self, alpha):
m = make_null_model("y = x1", n_sims=N_SIMS, alpha=alpha, seed=SEED)
result = m.find_power(
sample_size=100,
@@ -104,4 +103,4 @@ def test_null_rejection_matches_alpha(self, backend, alpha):
power = get_power(result, "x1")
margin = mc_margin(alpha, m.n_simulations)
expected = alpha * 100
- assert abs(power - expected) < margin, f"[{backend}] alpha={alpha}: observed {power:.2f}%, expected {expected}% ± {margin:.2f}%"
+ assert abs(power - expected) < margin, f"alpha={alpha}: observed {power:.2f}%, expected {expected}% ± {margin:.2f}%"
diff --git a/tests/unit/test_distributions.py b/tests/unit/test_distributions.py
index d3e77f9..693d1ff 100644
--- a/tests/unit/test_distributions.py
+++ b/tests/unit/test_distributions.py
@@ -24,7 +24,6 @@
import pytest
from mcpower.stats.distributions import (
- _BACKEND,
chi2_cdf,
chi2_ppf,
compute_critical_values_lme,
@@ -587,29 +586,6 @@ def test_studentized_range_k_too_large_returns_inf(self):
# ===========================================================================
# 15. Backend detection
-# ===========================================================================
-class TestBackendDetection:
- """Verify the distribution backend is correctly detected."""
-
- def test_backend_is_set(self):
- assert _BACKEND is not None
-
- def test_backend_is_string(self):
- assert isinstance(_BACKEND, str)
-
- def test_backend_is_known_value(self):
- assert _BACKEND in ("native", "scipy")
-
- def test_native_backend_when_compiled(self):
- """When the C++ extension is compiled, backend should be 'native'."""
- try:
- import mcpower.backends.mcpower_native # noqa: F401
-
- assert _BACKEND == "native"
- except ImportError:
- pytest.skip("C++ native backend not compiled")
-
-
# ===========================================================================
# Cross-consistency checks
# ===========================================================================
diff --git a/tests/unit/test_distributions_coverage.py b/tests/unit/test_distributions_coverage.py
new file mode 100644
index 0000000..224c8e6
--- /dev/null
+++ b/tests/unit/test_distributions_coverage.py
@@ -0,0 +1,42 @@
+"""Tests for distributions.py — optimizer functions and edge cases."""
+
+import numpy as np
+import pytest
+
+from mcpower.stats.distributions import minimize_lbfgsb, minimize_scalar_brent
+
+
+class TestOptimizerLBFGSB:
+ """L-BFGS-B optimizer via native backend."""
+
+ def test_finds_correct_minimum(self):
+ # Simple quadratic: f(x) = (x-2)^2
+ result = minimize_lbfgsb(
+ lambda x: float((x[0] - 2) ** 2),
+ x0=np.array([0.0]),
+ bounds=[(-10.0, 10.0)],
+ )
+ assert abs(result.x[0] - 2.0) < 0.01
+ assert result.fun < 0.01
+
+
+class TestOptimizerBrent:
+ """Brent scalar minimizer via native backend."""
+
+ def test_finds_correct_minimum(self):
+ # f(x) = (x - 3)^2
+ result = minimize_scalar_brent(
+ lambda x: (x - 3) ** 2,
+ bounds=(0.0, 10.0),
+ )
+ assert abs(result.x - 3.0) < 0.01
+ assert result.fun < 0.01
+
+ def test_converged_flag(self):
+ result = minimize_scalar_brent(
+ lambda x: (x - 5) ** 2,
+ bounds=(0.0, 10.0),
+ )
+ assert result.converged
+
+
diff --git a/tests/unit/test_formatters_edge.py b/tests/unit/test_formatters_edge.py
new file mode 100644
index 0000000..f52fd25
--- /dev/null
+++ b/tests/unit/test_formatters_edge.py
@@ -0,0 +1,230 @@
+"""Tests for formatter edge cases — scenario sample-size long format, cumulative recs, NaN filtering."""
+
+import math
+
+import pytest
+
+from mcpower.utils.formatters import _ResultFormatter, _is_nan
+
+
+_fmt = _ResultFormatter()
+
+
+def _make_scenario_sample_size_data(
+ target_tests=("x1", "x2"),
+ correction=None,
+ sample_sizes=(50, 100, 150),
+ optimistic_achieved=None,
+ realistic_achieved=None,
+ doomer_achieved=None,
+):
+ """Build a scenario sample_size result dict for formatting tests."""
+ if optimistic_achieved is None:
+ optimistic_achieved = {"x1": 50, "x2": 100}
+ if realistic_achieved is None:
+ realistic_achieved = {"x1": 100, "x2": 150}
+ if doomer_achieved is None:
+ doomer_achieved = {"x1": 0, "x2": 0} # Not achieved
+
+ def _make_scenario(achieved):
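+ # Without a correction, corrected slots carry the -1 sentinel these tests use for "no corrected result" (mirroring 0 = target power never achieved above).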
+ achieved_corr = {t: -1 for t in target_tests} if not correction else achieved
+ return {
+ "model": {
+ "target_tests": list(target_tests),
+ "correction": correction,
+ "sample_size_range": {"from_size": sample_sizes[0], "to_size": sample_sizes[-1]},
+ "target_power": 80.0,
+ },
+ "results": {
+ "first_achieved": achieved,
+ "first_achieved_corrected": achieved_corr,
+ "sample_sizes_tested": list(sample_sizes),
+ "powers_by_test": {
+ t: [30.0 + 25.0 * i for i in range(len(sample_sizes))]
+ for t in target_tests
+ },
+ "powers_by_test_corrected": (
+ {t: [25.0 + 25.0 * i for i in range(len(sample_sizes))] for t in target_tests}
+ if correction
+ else None
+ ),
+ },
+ }
+
+ return {
+ "analysis_type": "sample_size",
+ "scenarios": {
+ "optimistic": _make_scenario(optimistic_achieved),
+ "realistic": _make_scenario(realistic_achieved),
+ "doomer": _make_scenario(doomer_achieved),
+ },
+ "comparison": {},
+ }
+
+
+class TestScenarioSampleSizeLongFormat:
+ """Test _format_scenario_sample_size with summary='long'."""
+
+ def test_recommendations_present(self):
+ data = _make_scenario_sample_size_data()
+ output = _fmt.format("scenario_sample_size", data, "long")
+ assert "RECOMMENDATIONS" in output
+
+ def test_unachievable_tests_warning(self):
+ data = _make_scenario_sample_size_data(
+ doomer_achieved={"x1": 0, "x2": 0},
+ )
+ output = _fmt.format("scenario_sample_size", data, "long")
+ assert "Warning" in output or "may not achieve" in output
+
+ def test_realistic_recommendation_shown(self):
+ data = _make_scenario_sample_size_data(
+ realistic_achieved={"x1": 100, "x2": 150},
+ )
+ output = _fmt.format("scenario_sample_size", data, "long")
+ assert "150" in output # max N for realistic
+
+ def test_short_format_produces_table(self):
+ data = _make_scenario_sample_size_data()
+ output = _fmt.format("scenario_sample_size", data, "short")
+ assert "SCENARIO SUMMARY" in output
+
+ def test_with_correction(self):
+ data = _make_scenario_sample_size_data(correction="bonferroni")
+ output = _fmt.format("scenario_sample_size", data, "short")
+ assert "Opt(U)" in output or "Uncorrected" in output.lower() or "(U)" in output
+
+
+class TestCumulativeRecommendations:
+ """Test _format_cumulative_recommendations paths."""
+
+ def test_non_scenario_target_met(self):
+ data = {
+ "model": {
+ "target_tests": ["x1", "x2"],
+ "target_power": 80.0,
+ },
+ "results": {
+ "sample_sizes_tested": [50, 100, 150],
+ "powers_by_test": {
+ "x1": [60.0, 85.0, 95.0],
+ "x2": [70.0, 90.0, 98.0],
+ },
+ },
+ }
+ lines = _fmt._format_cumulative_recommendations(data, is_scenario=False)
+ joined = "\n".join(lines)
+ assert "N=" in joined # Found a sample size
+
+ def test_non_scenario_target_not_met(self):
+ data = {
+ "model": {
+ "target_tests": ["x1", "x2"],
+ "target_power": 80.0,
+ },
+ "results": {
+ "sample_sizes_tested": [50, 100],
+ "powers_by_test": {
+ "x1": [10.0, 20.0],
+ "x2": [15.0, 25.0],
+ },
+ },
+ }
+ lines = _fmt._format_cumulative_recommendations(data, is_scenario=False)
+ joined = "\n".join(lines)
+ assert ">100" in joined # Exceeded max tested
+
+ def test_scenario_recommendations(self):
+ data = _make_scenario_sample_size_data(
+ sample_sizes=(50, 100, 150, 200),
+ optimistic_achieved={"x1": 100, "x2": 150},
+ )
+ # Override powers so all > 80%
+ for scenario in data["scenarios"].values():
+ scenario["results"]["powers_by_test"] = {
+ "x1": [50.0, 85.0, 92.0, 98.0],
+ "x2": [40.0, 75.0, 88.0, 95.0],
+ }
+ lines = _fmt._format_cumulative_recommendations(data, is_scenario=True)
+ assert len(lines) > 0
+
+ def test_empty_scenarios(self):
+ data = {"scenarios": {}}
+ lines = _fmt._format_cumulative_recommendations(data, is_scenario=True)
+ assert lines == []
+
+ def test_no_results_key(self):
+ data = {}
+ lines = _fmt._format_cumulative_recommendations(data, is_scenario=False)
+ assert lines == []
+
+
+class TestNaNPowerFiltering:
+ """NaN power values in cumulative table should be filtered out."""
+
+ def test_nan_power_filtered_in_cumulative_sample_size_table(self):
+ lines = []
+ _fmt._add_cumulative_sample_size_table(
+ lines,
+ sample_sizes=[50, 100],
+ target_tests=["x1", "x2_nan"],
+ powers_by_test={
+ "x1": [50.0, 80.0],
+ "x2_nan": [float("nan"), float("nan")],
+ },
+ )
+ # Should still produce output for x1 (x2_nan filtered out)
+ output = "\n".join(lines)
+ assert "N=50" in output or "50" in output
+
+ def test_all_nan_produces_no_table(self):
+ lines = []
+ _fmt._add_cumulative_sample_size_table(
+ lines,
+ sample_sizes=[50],
+ target_tests=["x1"],
+ powers_by_test={"x1": [float("nan")]},
+ )
+ # All NaN → no valid tests → no table
+ assert len(lines) == 0
+
+
+class TestIsNan:
+ """Test _is_nan utility."""
+
+ def test_nan_float(self):
+ assert _is_nan(float("nan"))
+
+ def test_regular_float(self):
+ assert not _is_nan(42.0)
+
+ def test_non_float(self):
+ assert not _is_nan("nan")
+ assert not _is_nan(None)
+ assert not _is_nan(42)
+
+
+class TestExtractScenarioMeta:
+ """Test _extract_scenario_meta."""
+
+ def test_no_model_returns_none(self):
+ target_tests, correction = _fmt._extract_scenario_meta({"opt": {"results": {}}})
+ assert target_tests is None
+
+ def test_extracts_from_first_scenario(self):
+ scenarios = {
+ "optimistic": {
+ "model": {"target_tests": ["a", "b"], "correction": "holm"},
+ }
+ }
+ target_tests, correction = _fmt._extract_scenario_meta(scenarios)
+ assert target_tests == ["a", "b"]
+ assert correction == "holm"
+
+
+class TestFormatUnknownType:
+ """Unknown result type should raise."""
+
+ def test_unknown_result_type(self):
+ with pytest.raises(ValueError, match="Unknown result type"):
+ _fmt.format("nonexistent", {})
diff --git a/tests/unit/test_mixed_models_coverage.py b/tests/unit/test_mixed_models_coverage.py
new file mode 100644
index 0000000..bc99974
--- /dev/null
+++ b/tests/unit/test_mixed_models_coverage.py
@@ -0,0 +1,292 @@
+"""Tests for stats/mixed_models.py — statsmodels convergence, corrections, native wrappers.
+
+Uses pytest.mark.lme to skip when statsmodels is not installed.
+"""
+
+import warnings
+from unittest.mock import MagicMock, patch
+
+import numpy as np
+import pytest
+
+from mcpower.stats.mixed_models import (
+ _ensure_lme_crits,
+ _lme_analysis_wrapper,
+ _wrap_native_result,
+ reset_warm_start_cache,
+)
+
+pytestmark = pytest.mark.lme
+
+
+class TestWrapNativeResult:
+ """Test _wrap_native_result helper."""
+
+ def test_non_empty_non_verbose(self):
+ result = np.array([1.0, 0.0, 1.0])
+ wrapped = _wrap_native_result(result, verbose=False, solver_name="native_q1")
+ np.testing.assert_array_equal(wrapped, result)
+
+ def test_non_empty_verbose(self):
+ result = np.array([1.0, 0.0, 1.0])
+ wrapped = _wrap_native_result(result, verbose=True, solver_name="native_q1")
+ assert isinstance(wrapped, dict)
+ assert "results" in wrapped
+ assert "diagnostics" in wrapped
+ assert wrapped["diagnostics"]["solver"] == "native_q1"
+
+ def test_non_empty_verbose_with_extra_diag(self):
+ result = np.array([1.0])
+ wrapped = _wrap_native_result(
+ result, verbose=True, solver_name="native_general",
+ extra_diag={"q": 3},
+ )
+ assert wrapped["diagnostics"]["q"] == 3
+
+ def test_empty_non_verbose_returns_none(self):
+ result = np.array([])
+ assert _wrap_native_result(result, verbose=False, solver_name="native_q1") is None
+
+ def test_empty_verbose_returns_failure_dict(self):
+ result = np.array([])
+ wrapped = _wrap_native_result(result, verbose=True, solver_name="native_q1")
+ assert wrapped["results"] is None
+ assert "failure_reason" in wrapped
+ assert "empty result" in wrapped["failure_reason"]
+
+
+class TestEnsureLMECrits:
+ """Test _ensure_lme_crits computes when None."""
+
+ def test_computes_when_none(self):
+ chi2, z, crits = _ensure_lme_crits(
+ alpha=0.05, p=3, n_targets=2, correction_method=0,
+ chi2_crit=None, z_crit=None, correction_z_crits=None,
+ )
+ assert np.isfinite(chi2)
+ assert np.isfinite(z)
+ assert len(crits) == 2
+
+ def test_passthrough_when_provided(self):
+ chi2, z, crits = _ensure_lme_crits(
+ alpha=0.05, p=3, n_targets=2, correction_method=0,
+ chi2_crit=7.8, z_crit=1.96, correction_z_crits=np.array([1.96, 1.96]),
+ )
+ assert chi2 == 7.8
+ assert z == 1.96
+ assert len(crits) == 2
+
+
+class TestLMEAnalysisWrapperRouting:
+ """Test _lme_analysis_wrapper routes to correct backend."""
+
+ def test_unknown_backend_raises(self):
+ with pytest.raises(ValueError, match="Unknown backend"):
+ _lme_analysis_wrapper(
+ np.eye(10), np.ones(10), np.array([0, 1]),
+ np.zeros(10, dtype=np.int32),
+ correction_method=0, alpha=0.05, backend="nonexistent",
+ )
+
+
+class TestStatsmodelsConvergence:
+ """Test statsmodels fallback path with mocked MixedLM."""
+
+ def _make_mock_result(self, converged=True, params=None, pvalues=None, n_params=3):
+ """Create a mock MixedLM result."""
+ result = MagicMock()
+ result.converged = converged
+ result.params = params if params is not None else np.array([1.0, 0.5, 0.3])
+ result.pvalues = pvalues if pvalues is not None else np.array([0.01, 0.02, 0.04])
+ result.fe_params = result.params
+ result.bse = np.array([0.1, 0.1, 0.1])
+
+ # cov_re: random effects variance (needs .iloc[0, 0])
+ cov_re = MagicMock()
+ cov_re.iloc.__getitem__ = MagicMock(return_value=0.5)
+ result.cov_re = cov_re
+
+ result.scale = 1.0
+ result.llf = -50.0
+
+ # Make cov_params return a proper matrix
+ result.cov_params.return_value = np.eye(n_params) * 0.01
+
+ # model attribute
+ result.model = MagicMock()
+ result.model.exog = MagicMock()
+ result.model.exog.shape = (100, n_params)
+
+ return result
+
+ @patch("statsmodels.regression.mixed_linear_model.MixedLM")
+ def test_warm_start_retry_chain(self, mock_mixedlm_cls):
+ """First fit fails, cold start succeeds."""
+ from mcpower.stats.mixed_models import _lme_analysis_statsmodels, _lme_thread_local
+
+ _lme_thread_local.warm_start_params = np.array([1.0, 0.5, 0.3])
+
+ mock_model = MagicMock()
+ mock_mixedlm_cls.return_value = mock_model
+
+ good_result = self._make_mock_result()
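+ # First fit() call (warm start) raises; the second call (cold start) returns the good result.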
+ mock_model.fit.side_effect = [
+ Exception("warm start diverged"),
+ good_result,
+ ]
+ mock_model.loglike.return_value = -50.0
+
+ result = _lme_analysis_statsmodels(
+ X_expanded=np.random.randn(100, 2),
+ y=np.random.randn(100),
+ target_indices=np.array([0, 1]),
+ cluster_ids=np.repeat(np.arange(10), 10),
+
+ correction_method=0,
+ alpha=0.05,
+ )
+ assert result is not None
+
+ @patch("statsmodels.regression.mixed_linear_model.MixedLM")
+ def test_all_attempts_fail_returns_none(self, mock_mixedlm_cls):
+ from mcpower.stats.mixed_models import _lme_analysis_statsmodels, _lme_thread_local
+
+ _lme_thread_local.warm_start_params = None
+
+ mock_model = MagicMock()
+ mock_mixedlm_cls.return_value = mock_model
+ mock_model.fit.side_effect = Exception("always fails")
+
+ result = _lme_analysis_statsmodels(
+ X_expanded=np.random.randn(100, 2),
+ y=np.random.randn(100),
+ target_indices=np.array([0, 1]),
+ cluster_ids=np.repeat(np.arange(10), 10),
+
+ correction_method=0,
+ alpha=0.05,
+ )
+ assert result is None
+
+ @patch("statsmodels.regression.mixed_linear_model.MixedLM")
+ def test_all_attempts_fail_verbose_returns_dict(self, mock_mixedlm_cls):
+ from mcpower.stats.mixed_models import _lme_analysis_statsmodels, _lme_thread_local
+
+ _lme_thread_local.warm_start_params = None
+
+ mock_model = MagicMock()
+ mock_mixedlm_cls.return_value = mock_model
+ mock_model.fit.side_effect = Exception("always fails")
+
+ result = _lme_analysis_statsmodels(
+ X_expanded=np.random.randn(100, 2),
+ y=np.random.randn(100),
+ target_indices=np.array([0, 1]),
+ cluster_ids=np.repeat(np.arange(10), 10),
+
+ correction_method=0,
+ alpha=0.05,
+ verbose=True,
+ )
+ assert isinstance(result, dict)
+ assert result["results"] is None
+ assert "failure_reason" in result
+
+ @patch("statsmodels.regression.mixed_linear_model.MixedLM")
+ def test_not_converged_returns_none(self, mock_mixedlm_cls):
+ """When result.converged is False for all attempts."""
+ from mcpower.stats.mixed_models import _lme_analysis_statsmodels, _lme_thread_local
+
+ _lme_thread_local.warm_start_params = None
+
+ mock_model = MagicMock()
+ mock_mixedlm_cls.return_value = mock_model
+
+ bad_result = self._make_mock_result(converged=False)
+ mock_model.fit.return_value = bad_result
+
+ result = _lme_analysis_statsmodels(
+ X_expanded=np.random.randn(100, 2),
+ y=np.random.randn(100),
+ target_indices=np.array([0, 1]),
+ cluster_ids=np.repeat(np.arange(10), 10),
+
+ correction_method=0,
+ alpha=0.05,
+ )
+ assert result is None
+
+
+class TestCorrections:
+ """Test statsmodels FDR, Holm, Bonferroni, no-correction paths."""
+
+ def _make_mock_result(self):
+ result = MagicMock()
+ result.converged = True
+ result.params = np.array([1.0, 0.5, 0.3])
+ result.pvalues = np.array([0.001, 0.02, 0.04])
+ result.fe_params = result.params
+ result.bse = np.array([0.1, 0.1, 0.1])
+ result.scale = 1.0
+ result.llf = -50.0
+ result.model = MagicMock()
+ result.model.exog = MagicMock()
+ result.model.exog.shape = (100, 3)
+
+ cov_re_mock = MagicMock()
+ cov_re_mock.iloc.__getitem__ = MagicMock(return_value=0.5)
+ result.cov_re = cov_re_mock
+ result.cov_params.return_value = np.eye(3) * 0.01
+
+ return result
+
+ def _run_with_correction(self, correction_method):
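+ # Integer codes exercised by the tests below: 0 = none, 1 = Bonferroni, 2 = FDR, 3 = Holm.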
+ from mcpower.stats.mixed_models import _lme_analysis_statsmodels, _lme_thread_local
+
+ _lme_thread_local.warm_start_params = None
+
+ mock_result = self._make_mock_result()
+
+ with patch("statsmodels.regression.mixed_linear_model.MixedLM") as mock_cls:
+ mock_model = MagicMock()
+ mock_cls.return_value = mock_model
+ mock_model.fit.return_value = mock_result
+ mock_model.loglike.return_value = -50.0
+
+ out = _lme_analysis_statsmodels(
+ X_expanded=np.random.randn(100, 2),
+ y=np.random.randn(100),
+ target_indices=np.array([0, 1]),
+ cluster_ids=np.repeat(np.arange(10), 10),
+
+ correction_method=correction_method,
+ alpha=0.05,
+ )
+ return out
+
+ def test_no_correction(self):
+ result = self._run_with_correction(0)
+ assert result is not None
+
+ def test_bonferroni(self):
+ result = self._run_with_correction(1)
+ assert result is not None
+
+ def test_fdr(self):
+ result = self._run_with_correction(2)
+ assert result is not None
+
+ def test_holm(self):
+ result = self._run_with_correction(3)
+ assert result is not None
+
+
+class TestResetWarmStartCache:
+ """Test reset_warm_start_cache."""
+
+ def test_clears_params(self):
+ from mcpower.stats.mixed_models import _lme_thread_local
+
+ _lme_thread_local.warm_start_params = np.array([1.0])
+ reset_warm_start_cache()
+ assert _lme_thread_local.warm_start_params is None
diff --git a/tests/unit/test_model_coverage.py b/tests/unit/test_model_coverage.py
new file mode 100644
index 0000000..23d47c9
--- /dev/null
+++ b/tests/unit/test_model_coverage.py
@@ -0,0 +1,117 @@
+"""Tests for model.py — parallel fallback, Tukey validation, NaN under Tukey correction."""
+
+import warnings
+from unittest.mock import patch
+
+import numpy as np
+import pytest
+
+from mcpower import MCPower
+
+
+class TestTukeyWithoutPosthoc:
+ """Tukey correction without posthoc specs should raise ValueError."""
+
+ def test_tukey_without_posthoc_raises(self):
+ model = MCPower("y = x1 + x2")
+ model.set_effects("x1=0.5, x2=0.3")
+
+ with pytest.raises(ValueError, match="Tukey correction requires"):
+ model.find_power(
+ sample_size=100,
+ correction="tukey",
+ print_results=False,
+ )
+
+
+class TestTukeyNaNification:
+ """Non-posthoc tests should be NaN-ified under Tukey correction."""
+
+ def test_non_posthoc_tests_nan_under_tukey(self):
+ model = MCPower("y = group + x1")
+ model.set_variable_type("group=(factor,3)")
+ model.set_effects("group[2]=0.5, group[3]=0.4, x1=0.3")
+ model.n_simulations = 50
+ model.seed = 42
+
+ result = model.find_sample_size(
+ target_test="all, all-posthoc",
+ correction="tukey",
+ from_size=30,
+ to_size=60,
+ by=30,
+ print_results=False,
+ return_results=True,
+ )
+
+ assert result is not None
+ results = result["results"]
+ corrected = results.get("powers_by_test_corrected", {})
+
+ # Post-hoc comparisons should have real power values
+ # Non-posthoc tests (like "x1", "group[2]", "group[3]", "overall")
+ # should have NaN values
+ posthoc_labels = {s.label for s in model._posthoc_specs}
+ for test_name, powers in corrected.items():
+ if test_name not in posthoc_labels:
+ assert all(isinstance(v, float) and np.isnan(v) for v in powers), \
+ f"Expected NaN for non-posthoc test '{test_name}', got {powers}"
+
+ # first_achieved_corrected for non-posthoc should be -1
+ for test_name, n in results.get("first_achieved_corrected", {}).items():
+ if test_name not in posthoc_labels:
+ assert n == -1, f"Expected -1 for '{test_name}', got {n}"
+
+
+class TestParallelFallback:
+ """Parallel execution falls back to sequential on exception."""
+
+ def test_parallel_exception_falls_back(self, capsys):
+ model = MCPower("y = x1 + x2")
+ model.set_effects("x1=0.5, x2=0.3")
+ model.parallel = True
+ model.n_simulations = 50
+ model.seed = 42
+
+ # Parallel is imported inside the function via `from joblib import Parallel`,
+ # so we patch it at the joblib module level.
+ with patch("joblib.Parallel", side_effect=RuntimeError("joblib broken")):
+ # Should still complete via sequential fallback
+ result = model.find_sample_size(
+ from_size=30,
+ to_size=60,
+ by=30,
+ print_results=False,
+ return_results=True,
+ )
+ assert result is not None
+ captured = capsys.readouterr()
+ assert "Falling back to sequential" in captured.out
+
+
+class TestIsParallelEffective:
+ """Test _is_parallel_effective resolution."""
+
+ def test_true_always_parallel(self):
+ model = MCPower("y = x1 + x2")
+ model.parallel = True
+ assert model._is_parallel_effective() is True
+
+ def test_false_never_parallel(self):
+ model = MCPower("y = x1 + x2")
+ model.parallel = False
+ assert model._is_parallel_effective() is False
+
+ def test_mixedmodels_with_clusters(self):
+ model = MCPower("y ~ x1 + (1|school)")
+ model.set_cluster("school", ICC=0.2, n_clusters=20)
+ model.set_effects("x1=0.5")
+ model._apply() # cluster_specs are deferred until apply()
+ model.parallel = "mixedmodels"
+ assert model._is_parallel_effective() is True
+
+ def test_mixedmodels_without_clusters(self):
+ model = MCPower("y = x1 + x2")
+ model.set_effects("x1=0.5, x2=0.3")
+ model.parallel = "mixedmodels"
+ assert model._is_parallel_effective() is False
diff --git a/tests/unit/test_native_backend.py b/tests/unit/test_native_backend.py
new file mode 100644
index 0000000..93b4a3e
--- /dev/null
+++ b/tests/unit/test_native_backend.py
@@ -0,0 +1,60 @@
+"""Tests for mcpower.backends.native — import fallback and _prep utility."""
+
+import numpy as np
+import pytest
+from unittest.mock import patch
+
+from mcpower.backends.native import _prep
+
+
+class TestPrep:
+ """Test _prep array coercion for C++ interop."""
+
+ def test_contiguous_passthrough(self):
+ arr = np.array([1.0, 2.0, 3.0], dtype=np.float64)
+ result = _prep(arr)
+ assert result.flags["C_CONTIGUOUS"]
+ assert result.dtype == np.float64
+
+ def test_non_contiguous_becomes_contiguous(self):
+ arr = np.array([[1.0, 2.0], [3.0, 4.0]], dtype=np.float64)
+ col = arr[:, 1] # non-contiguous column slice
+ assert not col.flags["C_CONTIGUOUS"]
+ result = _prep(col)
+ assert result.flags["C_CONTIGUOUS"]
+ np.testing.assert_array_equal(result, [2.0, 4.0])
+
+ def test_dtype_conversion_float32_to_float64(self):
+ arr = np.array([1.0, 2.0], dtype=np.float32)
+ result = _prep(arr, np.float64)
+ assert result.dtype == np.float64
+
+ def test_dtype_conversion_int64_to_int32(self):
+ arr = np.array([0, 1, 2], dtype=np.int64)
+ result = _prep(arr, np.int32)
+ assert result.dtype == np.int32
+ np.testing.assert_array_equal(result, [0, 1, 2])
+
+ def test_2d_array(self):
+ arr = np.array([[1.0, 2.0], [3.0, 4.0]], dtype=np.float64, order="F")
+ assert not arr.flags["C_CONTIGUOUS"]
+ result = _prep(arr)
+ assert result.flags["C_CONTIGUOUS"]
+ assert result.dtype == np.float64
+
+
+class TestNativeBackendImport:
+ """Test NativeBackend init when C++ extension is unavailable."""
+
+ def test_init_raises_when_unavailable(self):
+ """NativeBackend() should raise ImportError when _NATIVE_AVAILABLE=False."""
+ with patch("mcpower.backends.native._NATIVE_AVAILABLE", False):
+ from mcpower.backends.native import NativeBackend
+ with pytest.raises(ImportError, match="Native C\\+\\+ backend not available"):
+ NativeBackend()
+
+ def test_is_native_available_reflects_module_state(self):
+ from mcpower.backends.native import is_native_available
+ # Just verify it returns a bool
+ result = is_native_available()
+ assert isinstance(result, bool)
diff --git a/tests/unit/test_ols_corrections.py b/tests/unit/test_ols_corrections.py
new file mode 100644
index 0000000..8c6728a
--- /dev/null
+++ b/tests/unit/test_ols_corrections.py
@@ -0,0 +1,251 @@
+"""Tests for OLS post-hoc contrast corrections and edge cases."""
+
+from dataclasses import dataclass
+from typing import Optional
+
+import numpy as np
+import pytest
+
+from mcpower.stats.ols import compute_posthoc_contrasts
+
+
+@dataclass
+class _PostHocSpec:
+ """Minimal PostHocSpec stub for tests."""
+ factor_name: str
+ col_idx_a: Optional[int]
+ col_idx_b: Optional[int]
+ label: str = ""
+ level_a: str = ""
+ level_b: str = ""
+ n_levels: int = 3
+
+
+def _make_ols_data(n=100, p=3, seed=42):
+ """Generate simple OLS data: X, y, and target_indices."""
+ rng = np.random.RandomState(seed)
+ X = rng.randn(n, p)
+ beta = np.array([0.5, 0.3, -0.2])[:p]
+ y = X @ beta + rng.randn(n)
+ return X, y
+
+
+class TestDegenerateDesign:
+ """When dof <= 0, posthoc should return zeros."""
+
+ def test_dof_zero_returns_zeros(self):
+ # n = p+1 → dof = 0
+ n, p = 4, 3
+ rng = np.random.RandomState(42)
+ X = rng.randn(n, p)
+ y = rng.randn(n)
+ specs = [_PostHocSpec("grp", 0, 1)]
+
+ uncorr, corr, override = compute_posthoc_contrasts(
+ X, y, specs, "t-test", 2.0, {}, target_indices=np.array([0, 1, 2]),
+ )
+ assert uncorr.shape == (1,)
+ assert not uncorr[0]
+ assert not corr[0]
+ assert override is None
+
+ def test_singular_contrast_variance_stays_zero(self):
+ """When both col_idx_a and col_idx_b are None, t_abs stays 0."""
+ X, y = _make_ols_data()
+ specs = [_PostHocSpec("grp", None, None)]
+
+ uncorr, corr, _ = compute_posthoc_contrasts(
+ X, y, specs, "t-test", 2.0, {},
+ )
+ assert not uncorr[0]
+ assert not corr[0]
+
+
+class TestCombinedFDR:
+ """FDR (correction_method=2) step-up across regular+posthoc t-stats."""
+
+ def test_fdr_combined_ranking(self):
+ X, y = _make_ols_data(n=200, p=3, seed=10)
+ specs = [
+ _PostHocSpec("grp", 0, 1),
+ _PostHocSpec("grp", 0, 2),
+ ]
+ target_indices = np.array([0, 1, 2])
+ # Create combined crits of length n_regular + n_posthoc = 5
+ # Use very lenient crits so everything passes
+ combined_crits = np.full(5, 0.01)
+
+ uncorr, corr, override = compute_posthoc_contrasts(
+ X, y, specs, "t-test", 0.01, {},
+ target_indices=target_indices,
+ correction_method=2,
+ correction_t_crits_combined=combined_crits,
+ )
+ assert override is not None
+ assert len(override) == 3 # n_regular
+ assert len(corr) == 2 # n_posthoc
+
+ def test_fdr_no_significant(self):
+ """With very strict crits, nothing should be significant."""
+ X, y = _make_ols_data(n=200, p=3, seed=10)
+ specs = [_PostHocSpec("grp", 0, 1)]
+ target_indices = np.array([0, 1, 2])
+ # Very strict thresholds
+ combined_crits = np.full(4, 100.0)
+
+ uncorr, corr, override = compute_posthoc_contrasts(
+ X, y, specs, "t-test", 100.0, {},
+ target_indices=target_indices,
+ correction_method=2,
+ correction_t_crits_combined=combined_crits,
+ )
+ assert not np.any(corr)
+ assert override is not None
+ assert not np.any(override)
+
+
+class TestCombinedHolm:
+ """Holm (correction_method=3) step-down with early termination."""
+
+ def test_holm_combined_ranking(self):
+ X, y = _make_ols_data(n=200, p=3, seed=10)
+ specs = [_PostHocSpec("grp", 0, 1)]
+ target_indices = np.array([0, 1, 2])
+ combined_crits = np.full(4, 0.01) # Very lenient
+
+ uncorr, corr, override = compute_posthoc_contrasts(
+ X, y, specs, "t-test", 0.01, {},
+ target_indices=target_indices,
+ correction_method=3,
+ correction_t_crits_combined=combined_crits,
+ )
+ assert override is not None
+ assert len(override) == 3
+
+ def test_holm_early_termination(self):
+ """If the most significant test doesn't pass, none should."""
+ X, y = _make_ols_data(n=200, p=3, seed=10)
+ specs = [_PostHocSpec("grp", 0, 1)]
+ target_indices = np.array([0, 1, 2])
+ combined_crits = np.full(4, 1000.0) # Impossible threshold
+
+ uncorr, corr, override = compute_posthoc_contrasts(
+ X, y, specs, "t-test", 1000.0, {},
+ target_indices=target_indices,
+ correction_method=3,
+ correction_t_crits_combined=combined_crits,
+ )
+ assert not np.any(corr)
+
+
+class TestFallbackPaths:
+ """Fallback when correction_t_crits_combined is None or wrong length."""
+
+ def test_combined_crits_none_fallback(self):
+ X, y = _make_ols_data(n=200, p=3, seed=10)
+ specs = [_PostHocSpec("grp", 0, 1)]
+ target_indices = np.array([0, 1, 2])
+
+ uncorr, corr, override = compute_posthoc_contrasts(
+ X, y, specs, "t-test", 2.0, {},
+ target_indices=target_indices,
+ correction_method=2,
+ correction_t_crits_combined=None,
+ )
+ # Fallback: corrected = uncorrected copy, no override
+ np.testing.assert_array_equal(corr, uncorr)
+ assert override is None
+
+ def test_combined_crits_wrong_length_fallback(self):
+ X, y = _make_ols_data(n=200, p=3, seed=10)
+ specs = [_PostHocSpec("grp", 0, 1)]
+ target_indices = np.array([0, 1, 2])
+ # Wrong length: should be 4 (3 regular + 1 posthoc)
+ wrong_crits = np.full(2, 2.0)
+
+ uncorr, corr, override = compute_posthoc_contrasts(
+ X, y, specs, "t-test", 2.0, {},
+ target_indices=target_indices,
+ correction_method=2,
+ correction_t_crits_combined=wrong_crits,
+ )
+ np.testing.assert_array_equal(corr, uncorr)
+ assert override is None
+
+
+class TestTukeyMethod:
+ """Tukey post-hoc method path."""
+
+ def test_tukey_uses_factor_crit(self):
+ X, y = _make_ols_data(n=200, p=3, seed=10)
+ specs = [_PostHocSpec("grp", 0, 1, n_levels=3)]
+ tukey_crits = {"grp": 0.01} # Very lenient
+
+ uncorr, corr, override = compute_posthoc_contrasts(
+ X, y, specs, "tukey", 2.0, tukey_crits,
+ )
+ # Tukey correction: uncorrected == corrected
+ np.testing.assert_array_equal(uncorr, corr)
+ assert override is None
+
+ def test_tukey_missing_factor_uses_inf(self):
+ """When factor not in tukey_crits, inf is used → not significant."""
+ X, y = _make_ols_data(n=200, p=3, seed=10)
+ specs = [_PostHocSpec("missing_factor", 0, 1)]
+
+ uncorr, corr, override = compute_posthoc_contrasts(
+ X, y, specs, "tukey", 2.0, {},
+ )
+ assert not uncorr[0]
+ assert not corr[0]
+
+
+class TestBonferroniPosthoc:
+ """Bonferroni correction for posthoc (correction_method=1)."""
+
+ def test_bonferroni_uses_combined_first_crit(self):
+ X, y = _make_ols_data(n=200, p=3, seed=10)
+ specs = [_PostHocSpec("grp", 0, 1)]
+ target_indices = np.array([0, 1, 2])
+ combined_crits = np.full(4, 0.01) # Very lenient
+
+ uncorr, corr, override = compute_posthoc_contrasts(
+ X, y, specs, "t-test", 0.01, {},
+ target_indices=target_indices,
+ correction_method=1,
+ correction_t_crits_combined=combined_crits,
+ )
+ assert override is None # Bonferroni doesn't produce override
+
+
+class TestEmptySpecs:
+ """Empty posthoc specs return empty arrays."""
+
+ def test_no_specs(self):
+ X, y = _make_ols_data()
+ uncorr, corr, override = compute_posthoc_contrasts(
+ X, y, [], "t-test", 2.0, {},
+ )
+ assert len(uncorr) == 0
+ assert len(corr) == 0
+ assert override is None
+
+
+class TestSingleColumnContrasts:
+ """Contrasts where one side is the reference level (None)."""
+
+ def test_col_idx_a_none(self):
+ X, y = _make_ols_data(n=200)
+ specs = [_PostHocSpec("grp", None, 1)]
+ uncorr, corr, _ = compute_posthoc_contrasts(
+ X, y, specs, "t-test", 2.0, {},
+ )
+ assert uncorr.shape == (1,)
+
+ def test_col_idx_b_none(self):
+ X, y = _make_ols_data(n=200)
+ specs = [_PostHocSpec("grp", 0, None)]
+ uncorr, corr, _ = compute_posthoc_contrasts(
+ X, y, specs, "t-test", 2.0, {},
+ )
+ assert uncorr.shape == (1,)
diff --git a/tests/unit/test_parsers_errors.py b/tests/unit/test_parsers_errors.py
new file mode 100644
index 0000000..b3c8c01
--- /dev/null
+++ b/tests/unit/test_parsers_errors.py
@@ -0,0 +1,168 @@
+"""Tests for parser error paths and edge cases."""
+
+import pytest
+
+from mcpower.utils.parsers import _AssignmentParser, _parse_equation
+
+
+_parser = _AssignmentParser()
+
+
+class TestAssignmentParserErrors:
+ """Error paths in _AssignmentParser._parse."""
+
+ def test_missing_equals_sign(self):
+ parsed, errors = _parser._parse("x1 0.5", "effect", ["x1"])
+ assert len(errors) == 1
+ assert "Invalid format" in errors[0]
+
+ def test_unknown_parse_type(self):
+ parsed, errors = _parser._parse("x1=0.5", "unknown_type", ["x1"])
+ assert len(errors) == 1
+ assert "Unknown parse type" in errors[0]
+
+ def test_unavailable_variable(self):
+ parsed, errors = _parser._parse("x_missing=0.5", "effect", ["x1", "x2"])
+ assert len(errors) == 1
+ assert "not found" in errors[0]
+ assert "x_missing" in errors[0]
+
+ def test_invalid_effect_value(self):
+ parsed, errors = _parser._parse("x1=abc", "effect", ["x1"])
+ assert len(errors) == 1
+ assert "Invalid effect size" in errors[0]
+
+ def test_multiple_errors(self):
+ parsed, errors = _parser._parse("x_bad=abc, x_also_bad=xyz", "effect", ["x1"])
+ assert len(errors) == 2
+
+
+class TestCorrelationParserErrors:
+ """Error paths for correlation parsing."""
+
+ def test_invalid_correlation_format(self):
+ parsed, errors = _parser._parse("x1_x2=0.5", "correlation", ["x1", "x2"])
+ assert len(errors) == 1
+ assert "Invalid format" in errors[0] or "Invalid correlation" in errors[0]
+
+ def test_correlation_var_not_found(self):
+ parsed, errors = _parser._parse("corr(x1, x_missing)=0.5", "correlation", ["x1", "x2"])
+ assert len(errors) == 1
+ assert "not found" in errors[0]
+
+ def test_self_correlation(self):
+ parsed, errors = _parser._parse("corr(x1, x1)=0.5", "correlation", ["x1", "x2"])
+ assert len(errors) == 1
+ assert "Cannot correlate variable with itself" in errors[0]
+
+ def test_correlation_value_out_of_range(self):
+ parsed, errors = _parser._parse("corr(x1, x2)=1.5", "correlation", ["x1", "x2"])
+ assert len(errors) == 1
+ assert "between -1 and 1" in errors[0]
+
+ def test_invalid_correlation_value(self):
+ parsed, errors = _parser._parse("corr(x1, x2)=abc", "correlation", ["x1", "x2"])
+ assert len(errors) == 1
+ assert "Invalid correlation value" in errors[0]
+
+
+class TestVariableTypeErrors:
+ """Error paths for variable type parsing."""
+
+ def test_unsupported_type(self):
+ parsed, errors = _parser._parse("x1=crazy_type", "variable_type", ["x1"])
+ assert len(errors) == 1
+ assert "Unsupported type" in errors[0]
+
+ def test_binary_proportion_out_of_range(self):
+ parsed, errors = _parser._parse("x1=(binary,1.5)", "variable_type", ["x1"])
+ assert len(errors) == 1
+ assert "between 0 and 1" in errors[0]
+
+ def test_binary_non_numeric_proportion(self):
+ parsed, errors = _parser._parse("x1=(binary,abc)", "variable_type", ["x1"])
+ assert len(errors) == 1
+ assert "Invalid proportion" in errors[0]
+
+ def test_binary_wrong_param_count(self):
+ parsed, errors = _parser._parse("x1=(binary,0.3,0.4)", "variable_type", ["x1"])
+ assert len(errors) == 1
+ assert "exactly 2 values" in errors[0]
+
+ def test_factor_less_than_2_levels(self):
+ parsed, errors = _parser._parse("x1=(factor,1)", "variable_type", ["x1"])
+ assert len(errors) == 1
+ assert "at least 2 levels" in errors[0]
+
+ def test_factor_more_than_20_levels(self):
+ parsed, errors = _parser._parse("x1=(factor,21)", "variable_type", ["x1"])
+ assert len(errors) == 1
+ assert "more than 20 levels" in errors[0]
+
+ def test_factor_non_integer_levels(self):
+ parsed, errors = _parser._parse("x1=(factor,abc)", "variable_type", ["x1"])
+ assert len(errors) == 1
+ assert "Must be integer" in errors[0]
+
+ def test_factor_proportions_more_than_20(self):
+ props = ",".join(["0.04"] * 21)
+ parsed, errors = _parser._parse(f"x1=(factor,{props})", "variable_type", ["x1"])
+ assert len(errors) == 1
+ assert "more than 20 levels" in errors[0]
+
+ def test_factor_zero_proportion(self):
+ parsed, errors = _parser._parse("x1=(factor,0.5,0.0,0.5)", "variable_type", ["x1"])
+ assert len(errors) == 1
+ assert "positive" in errors[0]
+
+ def test_factor_non_numeric_proportions(self):
+ parsed, errors = _parser._parse("x1=(factor,abc,def)", "variable_type", ["x1"])
+ assert len(errors) == 1
+ assert "numeric" in errors[0]
+
+ def test_tuple_no_comma(self):
+ parsed, errors = _parser._parse("x1=(binary)", "variable_type", ["x1"])
+ assert len(errors) == 1
+ assert "Invalid tuple format" in errors[0]
+
+ def test_tuple_unsupported_type_in_tuple(self):
+ parsed, errors = _parser._parse("x1=(normal,0.5)", "variable_type", ["x1"])
+ assert len(errors) == 1
+ assert "only supported for binary and factor" in errors[0]
+
+
+class TestEquationParsing:
+ """Edge cases in _parse_equation."""
+
+ def test_nested_random_effects(self):
+ dep, formula, ranefs = _parse_equation("y ~ x1 + (1|A/B)")
+ assert dep == "y"
+ assert len(ranefs) == 2
+ group_vars = {r["grouping_var"] for r in ranefs}
+ assert "A" in group_vars
+ assert "A:B" in group_vars
+
+ def test_duplicate_grouping_var_raises(self):
+ with pytest.raises(ValueError, match="Duplicate random effect grouping variable"):
+ _parse_equation("y ~ x1 + (1|school) + (1|school)")
+
+ def test_random_slopes(self):
+ dep, formula, ranefs = _parse_equation("y ~ x1 + (1 + x1|school)")
+ assert len(ranefs) == 1
+ assert ranefs[0]["type"] == "random_slope"
+ assert ranefs[0]["slope_vars"] == ["x1"]
+ assert ranefs[0]["grouping_var"] == "school"
+
+ def test_random_slope_duplicate_grouping_raises(self):
+ with pytest.raises(ValueError, match="Duplicate"):
+ _parse_equation("y ~ x1 + (1|school) + (1 + x1|school)")
+
+ def test_no_separator_uses_default_dep(self):
+ dep, formula, ranefs = _parse_equation("x1+x2")
+ assert dep == "explained_variable"
+ assert "x1" in formula
+ assert "x2" in formula
+
+ def test_nested_duplicate_parent_raises(self):
+ with pytest.raises(ValueError, match="Duplicate"):
+ _parse_equation("y ~ (1|A) + (1|A/B)")
diff --git a/tests/unit/test_progress.py b/tests/unit/test_progress.py
index 329acca..768420a 100644
--- a/tests/unit/test_progress.py
+++ b/tests/unit/test_progress.py
@@ -127,12 +127,6 @@ def test_completion_newline(self):
class TestTqdmReporter:
"""Test TqdmReporter with mock tqdm."""
- def test_tqdm_missing_raises(self):
- reporter = TqdmReporter()
- with patch.dict("sys.modules", {"tqdm": None}):
- with pytest.raises(ImportError, match="tqdm"):
- reporter(0, 100)
-
def test_tqdm_basic_flow(self):
mock_bar = MagicMock()
mock_bar.n = 0
@@ -154,6 +148,51 @@ def test_tqdm_basic_flow(self):
reporter(100, 100) # closes
mock_bar.close.assert_called_once()
+ def test_tqdm_successive_sessions(self):
+ """After close, a new session creates a fresh bar."""
+ mock_bar = MagicMock()
+ mock_bar.n = 0
+ mock_tqdm_cls = MagicMock(return_value=mock_bar)
+ mock_tqdm_module = MagicMock()
+ mock_tqdm_module.tqdm = mock_tqdm_cls
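+ # Fake tqdm module injected via sys.modules, so the test does not require tqdm to be installed.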
+
+ reporter = TqdmReporter()
+
+ with patch.dict("sys.modules", {"tqdm": mock_tqdm_module}):
+ # First session
+ reporter(0, 50)
+ mock_bar.n = 0
+ reporter(50, 50)
+ mock_bar.close.assert_called_once()
+ assert reporter._bar is None
+
+ # Second session — should create a new bar
+ mock_tqdm_cls.reset_mock()
+ mock_bar2 = MagicMock()
+ mock_bar2.n = 0
+ mock_tqdm_cls.return_value = mock_bar2
+
+ reporter(0, 200)
+ assert mock_tqdm_cls.call_count == 1
+ mock_tqdm_cls.assert_called_with(total=200, unit="sim")
+
+ def test_tqdm_no_negative_delta(self):
+ """When current <= bar.n, update should not be called with negative delta."""
+ mock_bar = MagicMock()
+ mock_bar.n = 50
+ mock_tqdm_cls = MagicMock(return_value=mock_bar)
+ mock_tqdm_module = MagicMock()
+ mock_tqdm_module.tqdm = mock_tqdm_cls
+
+ reporter = TqdmReporter()
+
+ with patch.dict("sys.modules", {"tqdm": mock_tqdm_module}):
+ reporter(0, 100) # creates bar
+ mock_bar.n = 50
+ reporter(30, 100) # current < bar.n
+ # update should NOT have been called (delta = 30 - 50 = -20, not > 0)
+ mock_bar.update.assert_not_called()
+
class TestComputeTotalSimulations:
"""Test compute_total_simulations helper."""
diff --git a/tests/unit/test_results.py b/tests/unit/test_results.py
new file mode 100644
index 0000000..31c3082
--- /dev/null
+++ b/tests/unit/test_results.py
@@ -0,0 +1,138 @@
+"""Unit tests for mcpower.core.results — ResultsProcessor and builder functions."""
+
+import numpy as np
+import pytest
+
+from mcpower.core.results import ResultsProcessor, build_power_result, build_sample_size_result
+
+
+class TestCalculatePowers:
+ """Tests for ResultsProcessor.calculate_powers."""
+
+ def test_basic_two_tests(self):
+ """Power calculation with two tests (overall + one predictor)."""
+ proc = ResultsProcessor(target_power=80.0)
+ # 10 simulations, 2 columns: [overall, x1]
+ # overall: 8/10 sig, x1: 6/10 sig
+ results = [np.array([True, True])] * 6 + [
+ np.array([True, False]),
+ np.array([True, False]),
+ np.array([False, False]),
+ np.array([False, False]),
+ ]
+ corrected = results # same for this test
+
+ out = proc.calculate_powers(results, corrected, ["overall", "x1"])
+
+ assert out["individual_powers"]["overall"] == pytest.approx(80.0)
+ assert out["individual_powers"]["x1"] == pytest.approx(60.0)
+ assert out["n_simulations_used"] == 10
+
+ def test_all_significant(self):
+ proc = ResultsProcessor()
+ results = [np.array([True, True])] * 5
+ out = proc.calculate_powers(results, results, ["overall", "x1"])
+ assert out["individual_powers"]["overall"] == pytest.approx(100.0)
+ assert out["individual_powers"]["x1"] == pytest.approx(100.0)
+
+ def test_none_significant(self):
+ proc = ResultsProcessor()
+ results = [np.array([False, False])] * 5
+ out = proc.calculate_powers(results, results, ["overall", "x1"])
+ assert out["individual_powers"]["overall"] == pytest.approx(0.0)
+ assert out["individual_powers"]["x1"] == pytest.approx(0.0)
+
+ def test_combined_probabilities(self):
+ proc = ResultsProcessor()
+ # 4 sims, 2 tests: exactly 0, 1, 2 significant
+ results = [
+ np.array([False, False]), # 0 sig
+ np.array([True, False]), # 1 sig
+ np.array([False, True]), # 1 sig
+ np.array([True, True]), # 2 sig
+ ]
+ out = proc.calculate_powers(results, results, ["overall", "x1"])
+ combined = out["combined_probabilities"]
+ assert combined["exactly_0_significant"] == pytest.approx(25.0)
+ assert combined["exactly_1_significant"] == pytest.approx(50.0)
+ assert combined["exactly_2_significant"] == pytest.approx(25.0)
+
+ def test_cumulative_probabilities(self):
+ proc = ResultsProcessor()
+ results = [
+ np.array([False, False]),
+ np.array([True, True]),
+ np.array([True, True]),
+ np.array([True, True]),
+ ]
+ out = proc.calculate_powers(results, results, ["overall", "x1"])
+ cumulative = out["cumulative_probabilities"]
+ assert cumulative["at_least_0_significant"] == pytest.approx(100.0)
+ assert cumulative["at_least_2_significant"] == pytest.approx(75.0)
+
+
+class TestBuildPowerResult:
+ """Tests for build_power_result."""
+
+ def test_basic_structure(self):
+ power_results = {
+ "individual_powers": {"overall": 80.0},
+ "n_simulations_used": 1000,
+ }
+ result = build_power_result(
+ model_type="OLS",
+ target_tests=["overall"],
+ formula_to_test=None,
+ equation="y = x1",
+ sample_size=100,
+ alpha=0.05,
+ n_simulations=1000,
+ correction=None,
+ target_power=80.0,
+ parallel=False,
+ power_results=power_results,
+ )
+ assert result["model"]["model_type"] == "OLS"
+ assert result["model"]["sample_size"] == 100
+ assert result["model"]["alpha"] == 0.05
+ assert result["results"] is power_results
+
+
+class TestBuildSampleSizeResult:
+ """Tests for build_sample_size_result."""
+
+ def test_basic_structure(self):
+ analysis_results = {"sample_sizes_tested": [50, 100]}
+ result = build_sample_size_result(
+ model_type="OLS",
+ target_tests=["overall"],
+ formula_to_test=None,
+ equation="y = x1",
+ sample_sizes=[50, 100],
+ alpha=0.05,
+ n_simulations=1000,
+ correction=None,
+ target_power=80.0,
+ parallel=False,
+ analysis_results=analysis_results,
+ )
+ assert result["model"]["sample_size_range"]["from_size"] == 50
+ assert result["model"]["sample_size_range"]["to_size"] == 100
+ assert result["model"]["sample_size_range"]["by"] == 50
+ assert result["results"] is analysis_results
+
+ def test_single_sample_size(self):
+ result = build_sample_size_result(
+ model_type="OLS",
+ target_tests=["overall"],
+ formula_to_test=None,
+ equation="y = x1",
+ sample_sizes=[100],
+ alpha=0.05,
+ n_simulations=1000,
+ correction=None,
+ target_power=80.0,
+ parallel=False,
+ analysis_results={},
+ )
+ assert result["model"]["sample_size_range"]["by"] == 1
diff --git a/tests/unit/test_scenarios_coverage.py b/tests/unit/test_scenarios_coverage.py
new file mode 100644
index 0000000..ea0d4f2
--- /dev/null
+++ b/tests/unit/test_scenarios_coverage.py
@@ -0,0 +1,218 @@
+"""Tests for scenario analysis — plot creation, correlation matrix repair, LME perturbations."""
+
+from unittest.mock import MagicMock, patch
+
+import numpy as np
+import pytest
+
+from mcpower.core.scenarios import (
+ ScenarioRunner,
+ apply_lme_perturbations,
+ apply_per_simulation_perturbations,
+)
+
+
+class TestCorrelationMatrixRepair:
+ """Spectral clipping when noise creates negative eigenvalues."""
+
+ def test_negative_eigenvalue_repaired(self):
+ """After heavy noise, result should be positive semi-definite with unit diagonal."""
+ # Create a 3x3 identity correlation matrix
+ corr = np.eye(3)
+ var_types = np.zeros(3, dtype=np.int64) # all normal
+
+ config = {
+ "correlation_noise_sd": 2.0, # Very heavy noise → guaranteed negative eigenvalues
+ "distribution_change_prob": 0.0,
+ "new_distributions": [],
+ }
+
+ perturbed_corr, _ = apply_per_simulation_perturbations(corr, var_types, config, sim_seed=42)
+
+ # Eigenvalues should all be >= 0
+ eigvals = np.linalg.eigvalsh(perturbed_corr)
+ assert np.all(eigvals >= -1e-10)
+
+ # Diagonal should be 1.0
+ np.testing.assert_allclose(np.diag(perturbed_corr), 1.0, atol=1e-10)
+
+ # Should be symmetric
+ np.testing.assert_allclose(perturbed_corr, perturbed_corr.T, atol=1e-10)
+
+ def test_no_repair_needed_when_no_noise(self):
+ corr = np.array([[1.0, 0.3], [0.3, 1.0]])
+ var_types = np.zeros(2, dtype=np.int64)
+
+ config = {
+ "correlation_noise_sd": 0.0,
+ "distribution_change_prob": 0.0,
+ "new_distributions": [],
+ }
+
+ perturbed_corr, _ = apply_per_simulation_perturbations(corr, var_types, config, sim_seed=42)
+ np.testing.assert_array_equal(perturbed_corr, corr)
+
+
+class TestDistributionPerturbation:
+ """Variable type swaps in scenario mode."""
+
+ def test_distribution_swap_occurs(self):
+ var_types = np.zeros(10, dtype=np.int64) # All normal
+ config = {
+ "correlation_noise_sd": 0.0,
+ "distribution_change_prob": 1.0, # Always swap
+ "new_distributions": ["right_skewed"],
+ }
+
+ _, perturbed_types = apply_per_simulation_perturbations(
+ np.eye(10), var_types, config, sim_seed=42,
+ )
+ # All should be swapped from 0 to 2 (right_skewed)
+ assert np.all(perturbed_types == 2)
+
+ def test_non_normal_not_swapped(self):
+ """Binary (1) and uploaded (99) vars should not be swapped."""
+ var_types = np.array([0, 1, 99], dtype=np.int64)
+ config = {
+ "correlation_noise_sd": 0.0,
+ "distribution_change_prob": 1.0,
+ "new_distributions": ["right_skewed"],
+ }
+
+ _, perturbed_types = apply_per_simulation_perturbations(
+ np.eye(3), var_types, config, sim_seed=42,
+ )
+ assert perturbed_types[0] == 2 # normal → right_skewed
+ assert perturbed_types[1] == 1 # binary unchanged
+ assert perturbed_types[2] == 99 # uploaded unchanged
+
+ def test_none_config_passthrough(self):
+ corr = np.eye(2)
+ var_types = np.zeros(2, dtype=np.int64)
+ result_corr, result_types = apply_per_simulation_perturbations(
+ corr, var_types, None, sim_seed=42,
+ )
+ np.testing.assert_array_equal(result_corr, corr)
+ np.testing.assert_array_equal(result_types, var_types)
+
+
+class TestLMEPerturbations:
+ """LME perturbation computation."""
+
+ def test_icc_noise_creates_multipliers(self):
+ cluster_specs = {"school": {"n_clusters": 20, "cluster_size": 10, "icc": 0.2}}
+ config = {
+ "icc_noise_sd": 0.3,
+ "random_effect_dist": "normal",
+ "random_effect_df": 5,
+ }
+
+ result = apply_lme_perturbations(cluster_specs, config, sim_seed=42)
+ assert result is not None
+ assert "tau_squared_multipliers" in result
+ assert "school" in result["tau_squared_multipliers"]
+ # Multiplier should be exp(N(0, 0.3)) — positive, around 1
+ mult = result["tau_squared_multipliers"]["school"]
+ assert mult > 0
+
+ def test_no_perturbation_returns_none(self):
+ cluster_specs = {"school": {"n_clusters": 20, "cluster_size": 10, "icc": 0.2}}
+ config = {
+ "icc_noise_sd": 0.0,
+ "random_effect_dist": "normal",
+ "random_effect_df": 5,
+ }
+ result = apply_lme_perturbations(cluster_specs, config, sim_seed=42)
+ assert result is None
+
+ def test_empty_cluster_specs_returns_none(self):
+ result = apply_lme_perturbations({}, {"icc_noise_sd": 0.5}, sim_seed=42)
+ assert result is None
+
+ def test_heavy_tailed_re_dist(self):
+ cluster_specs = {"school": {"n_clusters": 20, "cluster_size": 10, "icc": 0.2}}
+ config = {
+ "icc_noise_sd": 0.0,
+ "random_effect_dist": "heavy_tailed",
+ "random_effect_df": 3,
+ }
+ result = apply_lme_perturbations(cluster_specs, config, sim_seed=42)
+ assert result is not None
+ assert result["random_effect_dist"] == "heavy_tailed"
+ assert result["random_effect_df"] == 3
+
+
+class TestScenarioRunnerPlots:
+ """Test _create_scenario_plots path."""
+
+ def test_plot_creation_with_mock(self):
+ model = MagicMock()
+ model.power = 80.0
+ runner = ScenarioRunner(model)
+
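+ # Minimal sample-size results payload shaped the way _create_scenario_plots expects it.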
+ results = {
+ "analysis_type": "sample_size",
+ "scenarios": {
+ "optimistic": {
+ "model": {
+ "target_tests": ["x1"],
+ "correction": None,
+ },
+ "results": {
+ "sample_sizes_tested": [50, 100],
+ "powers_by_test": {"x1": [50.0, 85.0]},
+ "first_achieved": {"x1": 100},
+ },
+ },
+ },
+ }
+
+ with patch("mcpower.core.scenarios._create_power_plot") as mock_plot:
+ runner._create_scenario_plots(results)
+ mock_plot.assert_called_once()
+
+ def test_plot_with_correction(self):
+ model = MagicMock()
+ model.power = 80.0
+ runner = ScenarioRunner(model)
+
+ results = {
+ "analysis_type": "sample_size",
+ "scenarios": {
+ "optimistic": {
+ "model": {
+ "target_tests": ["x1"],
+ "correction": "bonferroni",
+ },
+ "results": {
+ "sample_sizes_tested": [50, 100],
+ "powers_by_test": {"x1": [50.0, 85.0]},
+ "powers_by_test_corrected": {"x1": [40.0, 75.0]},
+ "first_achieved": {"x1": 100},
+ "first_achieved_corrected": {"x1": 150},
+ },
+ },
+ },
+ }
+
+ with patch("mcpower.core.scenarios._create_power_plot") as mock_plot:
+ runner._create_scenario_plots(results)
+ # Should be called for both uncorrected and corrected
+ assert mock_plot.call_count == 2
+
+ def test_no_plot_when_missing_sample_sizes(self):
+ model = MagicMock()
+ model.power = 80.0
+ runner = ScenarioRunner(model)
+
+ results = {
+ "scenarios": {
+ "optimistic": {
+ "results": {"powers_by_test": {"x1": [50.0]}},
+ },
+ },
+ }
+
+ with patch("mcpower.core.scenarios._create_power_plot") as mock_plot:
+ runner._create_scenario_plots(results)
+ mock_plot.assert_not_called()
diff --git a/tests/unit/test_simulation_coverage.py b/tests/unit/test_simulation_coverage.py
new file mode 100644
index 0000000..8bae7ff
--- /dev/null
+++ b/tests/unit/test_simulation_coverage.py
@@ -0,0 +1,274 @@
+"""Tests for simulation.py — failure handling, Wald fallback, verbose diagnostics, ICC mismatch."""
+
+import warnings
+from typing import Dict, List, Optional
+from unittest.mock import MagicMock, patch
+
+import numpy as np
+import pytest
+
+from mcpower.core.simulation import SimulationMetadata, SimulationRunner, _warn_icc_mismatch
+
+
+def _make_metadata(
+ n_targets=2,
+ cluster_specs=None,
+ verbose=False,
+ correction_method=0,
+):
+ """Create a minimal SimulationMetadata for testing."""
+ return SimulationMetadata(
+ target_indices=np.arange(n_targets),
+ n_non_factor_vars=n_targets,
+ correlation_matrix=np.eye(n_targets),
+ var_types=np.zeros(n_targets, dtype=np.int64),
+ var_params=np.zeros(n_targets, dtype=np.float64),
+ factor_specs=[],
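+ # Placeholder upload arrays; with all-normal var_types these should stay unused.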
+ upload_normal_values=np.zeros((2, 2), dtype=np.float64),
+ upload_data_values=np.zeros((2, 2), dtype=np.float64),
+ effect_sizes=np.array([0.5] * n_targets),
+ correction_method=correction_method,
+ cluster_specs=cluster_specs or {},
+ verbose=verbose,
+ )
+
+
+def _noop_perturbations(corr, types, config, seed):
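+ """Stand-in perturbation hook: returns the correlation matrix and variable types unchanged."""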
+ return corr, types
+
+
+class TestAllSimulationsFail:
+ """When all simulations return None, RuntimeError should be raised."""
+
+ def test_all_fail_raises(self):
+ runner = SimulationRunner(n_simulations=5, seed=42)
+ metadata = _make_metadata()
+
+ def failing_sim(*args, **kwargs):
+ return None
+
+ with patch.object(runner, "_single_simulation", side_effect=failing_sim):
+ with pytest.raises(RuntimeError, match="All simulations failed"):
+ runner.run_power_simulations(
+ sample_size=100,
+ metadata=metadata,
+ generate_y_func=MagicMock(),
+ analyze_func=MagicMock(),
+ create_X_extended_func=MagicMock(),
+ apply_perturbations_func=_noop_perturbations,
+ )
+
+
+class TestLMEThresholdExceeded:
+ """LME failure rate exceeding threshold raises RuntimeError."""
+
+ def test_high_failure_rate_raises(self):
+ runner = SimulationRunner(n_simulations=10, seed=42, max_failed_simulations=0.05)
+ metadata = _make_metadata(cluster_specs={"school": {"n_clusters": 5, "cluster_size": 10}})
+
+ call_count = [0]
+
+ def sometimes_fail(*args, **kwargs):
+ call_count[0] += 1
+ if call_count[0] <= 5:
+ return None # 5 out of 10 fail = 50%
+ return (np.array([1, 1, 1]), np.array([1, 1, 1]), False)
+
+ with patch.object(runner, "_single_simulation", side_effect=sometimes_fail):
+ with pytest.raises(RuntimeError, match="Too many failed simulations"):
+ runner.run_power_simulations(
+ sample_size=100,
+ metadata=metadata,
+ generate_y_func=MagicMock(),
+ analyze_func=MagicMock(),
+ create_X_extended_func=MagicMock(),
+ apply_perturbations_func=_noop_perturbations,
+ )
+
+
+class TestOLSHighFailureWarns:
+ """OLS high failure rate warns but doesn't raise."""
+
+ def test_ols_warns_above_10_percent(self):
+ runner = SimulationRunner(n_simulations=10, seed=42)
+ metadata = _make_metadata() # No cluster_specs = OLS
+
+ call_count = [0]
+
+ def sometimes_fail(*args, **kwargs):
+ call_count[0] += 1
+ if call_count[0] <= 2:
+ return None # 2 out of 10 fail = 20%
+ return (np.array([1, 1, 1]), np.array([1, 1, 1]))
+
+ with patch.object(runner, "_single_simulation", side_effect=sometimes_fail):
+ with warnings.catch_warnings(record=True) as w:
+ warnings.simplefilter("always")
+ result = runner.run_power_simulations(
+ sample_size=100,
+ metadata=metadata,
+ generate_y_func=MagicMock(),
+ analyze_func=MagicMock(),
+ create_X_extended_func=MagicMock(),
+ apply_perturbations_func=_noop_perturbations,
+ )
+ assert any("failed" in str(warning.message).lower() for warning in w)
+
+
+class TestWaldFallbackWarning:
+ """Warn if >10% iterations use Wald test."""
+
+ def test_wald_warning_above_threshold(self):
+ runner = SimulationRunner(n_simulations=10, seed=42)
+ metadata = _make_metadata()
+
+ call_count = [0]
+
+ def wald_heavy(*args, **kwargs):
+ call_count[0] += 1
+ # All return wald_flag=True
+ return (np.array([1, 1, 1]), np.array([1, 1, 1]), True)
+
+ with patch.object(runner, "_single_simulation", side_effect=wald_heavy):
+ with warnings.catch_warnings(record=True) as w:
+ warnings.simplefilter("always")
+ result = runner.run_power_simulations(
+ sample_size=100,
+ metadata=metadata,
+ generate_y_func=MagicMock(),
+ analyze_func=MagicMock(),
+ create_X_extended_func=MagicMock(),
+ apply_perturbations_func=_noop_perturbations,
+ )
+ assert any("Wald test fallback" in str(warning.message) for warning in w)
+ assert result["n_wald_fallbacks"] == 10
+
+
+class TestVerboseDiagnostics:
+ """Verbose mode collects diagnostics and failure reasons."""
+
+ def test_verbose_success_collects_diagnostics(self):
+ runner = SimulationRunner(n_simulations=3, seed=42)
+ metadata = _make_metadata(verbose=True)
+
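+ # Verbose-mode stub: returns a dict payload with per-simulation diagnostics instead of a bare tuple.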
+ def verbose_result(*args, **kwargs):
+ return {
+ "results": (np.array([1, 1, 1]), np.array([1, 1, 1])),
+ "diagnostics": {"icc_estimated": 0.2},
+ "wald_fallback": False,
+ }
+
+ with patch.object(runner, "_single_simulation", side_effect=verbose_result):
+ result = runner.run_power_simulations(
+ sample_size=100,
+ metadata=metadata,
+ generate_y_func=MagicMock(),
+ analyze_func=MagicMock(),
+ create_X_extended_func=MagicMock(),
+ apply_perturbations_func=_noop_perturbations,
+ )
+ assert "diagnostics" in result
+ assert len(result["diagnostics"]) == 3
+
+ def test_verbose_failure_tracking(self):
+ runner = SimulationRunner(n_simulations=5, seed=42)
+ metadata = _make_metadata(verbose=True)
+
+ call_count = [0]
+
+ def mixed_results(*args, **kwargs):
+ call_count[0] += 1
+ if call_count[0] <= 2:
+ return {"failed": True, "failure_reason": "Convergence failed"}
+ return {
+ "results": (np.array([1, 1, 1]), np.array([1, 1, 1])),
+ "diagnostics": {},
+ "wald_fallback": False,
+ }
+
+ with patch.object(runner, "_single_simulation", side_effect=mixed_results):
+ with warnings.catch_warnings():
+ warnings.simplefilter("ignore", UserWarning)
+ result = runner.run_power_simulations(
+ sample_size=100,
+ metadata=metadata,
+ generate_y_func=MagicMock(),
+ analyze_func=MagicMock(),
+ create_X_extended_func=MagicMock(),
+ apply_perturbations_func=_noop_perturbations,
+ )
+ assert "failure_reasons" in result
+ assert result["failure_reasons"]["Convergence failed"] == 2
+
+ def test_verbose_none_tracking(self):
+ """None results in verbose mode are tracked as unknown failures."""
+ runner = SimulationRunner(n_simulations=3, seed=42)
+ metadata = _make_metadata(verbose=True)
+
+ call_count = [0]
+
+ def mixed(*args, **kwargs):
+ call_count[0] += 1
+ if call_count[0] == 1:
+ return None
+ return {
+ "results": (np.array([1, 1, 1]), np.array([1, 1, 1])),
+ "diagnostics": {},
+ "wald_fallback": False,
+ }
+
+ with patch.object(runner, "_single_simulation", side_effect=mixed):
+ with warnings.catch_warnings():
+ warnings.simplefilter("ignore", UserWarning)
+ result = runner.run_power_simulations(
+ sample_size=100,
+ metadata=metadata,
+ generate_y_func=MagicMock(),
+ analyze_func=MagicMock(),
+ create_X_extended_func=MagicMock(),
+ apply_perturbations_func=_noop_perturbations,
+ )
+ assert "Unknown (returned None)" in result["failure_reasons"]
+
+
+class TestICCMismatchWarning:
+ """ICC mismatch warning when estimated ICC differs by >50%."""
+
+ def test_large_mismatch_warns(self):
+ metadata = _make_metadata(
+ cluster_specs={"school": {"icc": 0.2, "n_clusters": 20, "cluster_size": 10}},
+ )
+ with warnings.catch_warnings(record=True) as w:
+ warnings.simplefilter("always")
+ _warn_icc_mismatch(metadata, mean_estimated_icc=0.05) # 75% deviation
+ assert any("differs from specified" in str(warning.message) for warning in w)
+
+ def test_within_tolerance_no_warning(self):
+ metadata = _make_metadata(
+ cluster_specs={"school": {"icc": 0.2, "n_clusters": 20, "cluster_size": 10}},
+ )
+ with warnings.catch_warnings(record=True) as w:
+ warnings.simplefilter("always")
+ _warn_icc_mismatch(metadata, mean_estimated_icc=0.18) # 10% deviation
+ icc_warnings = [x for x in w if "differs from specified" in str(x.message)]
+ assert len(icc_warnings) == 0
+
+ def test_zero_estimated_icc_no_warning(self):
+ metadata = _make_metadata(
+ cluster_specs={"school": {"icc": 0.2, "n_clusters": 20, "cluster_size": 10}},
+ )
+ with warnings.catch_warnings(record=True) as w:
+ warnings.simplefilter("always")
+ _warn_icc_mismatch(metadata, mean_estimated_icc=0.0)
+ icc_warnings = [x for x in w if "differs from specified" in str(x.message)]
+ assert len(icc_warnings) == 0
+
+ def test_no_icc_in_spec_no_warning(self):
+ metadata = _make_metadata(
+ cluster_specs={"school": {"icc": None, "n_clusters": 20, "cluster_size": 10}},
+ )
+ with warnings.catch_warnings(record=True) as w:
+ warnings.simplefilter("always")
+ _warn_icc_mismatch(metadata, mean_estimated_icc=0.5)
+ icc_warnings = [x for x in w if "differs from specified" in str(x.message)]
+ assert len(icc_warnings) == 0
diff --git a/tests/unit/test_test_formula_utils.py b/tests/unit/test_test_formula_utils.py
new file mode 100644
index 0000000..f3db882
--- /dev/null
+++ b/tests/unit/test_test_formula_utils.py
@@ -0,0 +1,319 @@
+"""Tests for test_formula parsing utilities."""
+
+from collections import OrderedDict
+from unittest.mock import MagicMock
+
+import numpy as np
+
+
+class TestExtractTestFormulaEffects:
+ """Test _extract_test_formula_effects helper."""
+
+ def _make_registry(
+ self,
+ effect_names,
+ factor_names=None,
+ factor_dummies=None,
+ cluster_effect_names=None,
+ ):
+ """Create a minimal mock registry for testing."""
+ reg = MagicMock()
+ reg.effect_names = effect_names
+ reg.factor_names = factor_names or []
+ reg.cluster_effect_names = cluster_effect_names or []
+
+ # Build _effects dict with correct ordering
+ effects = OrderedDict()
+ for name in effect_names:
+ eff = MagicMock()
+ eff.effect_type = "interaction" if ":" in name else "main"
+ effects[name] = eff
+ reg._effects = effects
+
+ # Factor dummies
+ reg._factor_dummies = factor_dummies or {}
+ return reg
+
+ def test_simple_subset(self):
+ """y ~ x1 + x2 from generation y ~ x1 + x2 + x3."""
+ from mcpower.utils.test_formula_utils import _extract_test_formula_effects
+
+ registry = self._make_registry(["x1", "x2", "x3"])
+ effects, random_effects = _extract_test_formula_effects("y ~ x1 + x2", registry)
+ assert effects == ["x1", "x2"]
+ assert random_effects == []
+
+ def test_single_variable(self):
+ """y ~ x1 from generation y ~ x1 + x2 + x3."""
+ from mcpower.utils.test_formula_utils import _extract_test_formula_effects
+
+ registry = self._make_registry(["x1", "x2", "x3"])
+ effects, random_effects = _extract_test_formula_effects("y ~ x1", registry)
+ assert effects == ["x1"]
+
+ def test_with_interaction(self):
+ """y ~ x1 + x2 + x1:x2 from generation y ~ x1 + x2 + x3 + x1:x2."""
+ from mcpower.utils.test_formula_utils import _extract_test_formula_effects
+
+ registry = self._make_registry(["x1", "x2", "x3", "x1:x2"])
+ effects, _ = _extract_test_formula_effects("y ~ x1 + x2 + x1:x2", registry)
+ assert effects == ["x1", "x2", "x1:x2"]
+
+ def test_interaction_omitted(self):
+ """y ~ x1 + x2 from generation y ~ x1 + x2 + x1:x2."""
+ from mcpower.utils.test_formula_utils import _extract_test_formula_effects
+
+ registry = self._make_registry(["x1", "x2", "x1:x2"])
+ effects, _ = _extract_test_formula_effects("y ~ x1 + x2", registry)
+ assert effects == ["x1", "x2"]
+
+ def test_factor_expands_to_dummies(self):
+ """y ~ x1 + gender from generation y ~ x1 + x2 + gender."""
+ from mcpower.utils.test_formula_utils import _extract_test_formula_effects
+
+ registry = self._make_registry(
+ ["x1", "x2", "gender[F]", "gender[Other]"],
+ factor_names=["gender"],
+ factor_dummies={
+ "gender[F]": {"factor_name": "gender", "level": "F"},
+ "gender[Other]": {"factor_name": "gender", "level": "Other"},
+ },
+ )
+ effects, _ = _extract_test_formula_effects("y ~ x1 + gender", registry)
+ assert effects == ["x1", "gender[F]", "gender[Other]"]
+
+ def test_factor_omitted(self):
+ """y ~ x1 from generation y ~ x1 + gender."""
+ from mcpower.utils.test_formula_utils import _extract_test_formula_effects
+
+ registry = self._make_registry(
+ ["x1", "gender[F]", "gender[Other]"],
+ factor_names=["gender"],
+ factor_dummies={
+ "gender[F]": {"factor_name": "gender", "level": "F"},
+ "gender[Other]": {"factor_name": "gender", "level": "Other"},
+ },
+ )
+ effects, _ = _extract_test_formula_effects("y ~ x1", registry)
+ assert effects == ["x1"]
+
+ def test_with_random_effects(self):
+ """y ~ x1 + (1|school) extracts random effects."""
+ from mcpower.utils.test_formula_utils import _extract_test_formula_effects
+
+ registry = self._make_registry(["x1", "x2"])
+ effects, random_effects = _extract_test_formula_effects(
+ "y ~ x1 + (1|school)", registry
+ )
+ assert effects == ["x1"]
+ assert len(random_effects) == 1
+ assert random_effects[0]["grouping_var"] == "school"
+
+ def test_star_operator_expands(self):
+ """y ~ x1*x2 expands to x1 + x2 + x1:x2."""
+ from mcpower.utils.test_formula_utils import _extract_test_formula_effects
+
+ registry = self._make_registry(["x1", "x2", "x3", "x1:x2"])
+ effects, _ = _extract_test_formula_effects("y ~ x1*x2", registry)
+ assert effects == ["x1", "x2", "x1:x2"]
+
+ def test_equals_sign_formula(self):
+ """y = x1 + x2 works same as y ~ x1 + x2."""
+ from mcpower.utils.test_formula_utils import _extract_test_formula_effects
+
+ registry = self._make_registry(["x1", "x2", "x3"])
+ effects, _ = _extract_test_formula_effects("y = x1 + x2", registry)
+ assert effects == ["x1", "x2"]
+
+ def test_preserves_registry_order(self):
+ """Effects returned in registry order, not formula order."""
+ from mcpower.utils.test_formula_utils import _extract_test_formula_effects
+
+ registry = self._make_registry(["x1", "x2", "x3", "x1:x2"])
+ # Formula lists x2 before x1
+ effects, _ = _extract_test_formula_effects("y ~ x2 + x1", registry)
+ assert effects == ["x1", "x2"] # registry order preserved
+
+
+class TestComputeTestColumnIndices:
+ """Test _compute_test_column_indices helper."""
+
+ def test_subset_two_of_three(self):
+ """Selecting 2 of 3 effects gives correct indices."""
+ from mcpower.utils.test_formula_utils import _compute_test_column_indices
+
+ all_effect_names = ["x1", "x2", "x3"]
+ test_effect_names = ["x1", "x2"]
+ result = _compute_test_column_indices(all_effect_names, test_effect_names)
+ assert list(result) == [0, 1]
+
+ def test_skip_middle(self):
+ """Selecting first and last of 3 effects."""
+ from mcpower.utils.test_formula_utils import _compute_test_column_indices
+
+ all_effect_names = ["x1", "x2", "x3"]
+ test_effect_names = ["x1", "x3"]
+ result = _compute_test_column_indices(all_effect_names, test_effect_names)
+ assert list(result) == [0, 2]
+
+ def test_single_effect(self):
+ """Single effect selected."""
+ from mcpower.utils.test_formula_utils import _compute_test_column_indices
+
+ all_effect_names = ["x1", "x2", "x3"]
+ test_effect_names = ["x2"]
+ result = _compute_test_column_indices(all_effect_names, test_effect_names)
+ assert list(result) == [1]
+
+ def test_all_effects_returns_all_indices(self):
+ """Selecting all effects returns full range."""
+ from mcpower.utils.test_formula_utils import _compute_test_column_indices
+
+ all_effect_names = ["x1", "x2", "x3"]
+ test_effect_names = ["x1", "x2", "x3"]
+ result = _compute_test_column_indices(all_effect_names, test_effect_names)
+ assert list(result) == [0, 1, 2]
+
+ def test_with_interactions(self):
+ """Interaction effects have correct indices."""
+ from mcpower.utils.test_formula_utils import _compute_test_column_indices
+
+ all_effect_names = ["x1", "x2", "x3", "x1:x2"]
+ test_effect_names = ["x1", "x2", "x1:x2"]
+ result = _compute_test_column_indices(all_effect_names, test_effect_names)
+ assert list(result) == [0, 1, 3]
+
+
+class TestRemapTargetIndices:
+ """Test _remap_target_indices helper."""
+
+ def test_simple_remap(self):
+ """Target indices remapped to positions within test columns."""
+ from mcpower.utils.test_formula_utils import _remap_target_indices
+
+ # Original target_indices: [0, 1] (x1, x2 in full model)
+ # test_column_indices: [0, 1] (x1, x2 at positions 0, 1 in X_expanded)
+ # In X_test, x1 is at 0, x2 is at 1 -> remapped: [0, 1]
+ original = np.array([0, 1])
+ test_cols = np.array([0, 1])
+ result = _remap_target_indices(original, test_cols)
+ assert list(result) == [0, 1]
+
+ def test_remap_with_gap(self):
+ """Target indices remapped when test columns skip positions."""
+ from mcpower.utils.test_formula_utils import _remap_target_indices
+
+ # Full model: [x1=0, x2=1, x3=2, x1:x2=3]
+ # Test model: [x1=0, x1:x2=3] -> X_test columns at [0, 3]
+ # target_test="x1" -> original target_indices=[0]
+ # In X_test, x1 is at position 0 -> remapped: [0]
+ original = np.array([0])
+ test_cols = np.array([0, 3])
+ result = _remap_target_indices(original, test_cols)
+ assert list(result) == [0]
+
+ def test_remap_target_at_end(self):
+ """Target index that moves to different position in X_test."""
+ from mcpower.utils.test_formula_utils import _remap_target_indices
+
+ # Full model: [x1=0, x2=1, x3=2]
+ # Test model: [x2=1, x3=2] -> test_column_indices=[1, 2]
+ # target_test="x3" -> original target_indices=[2]
+ # In X_test, x3 is at position 1 (second column) -> remapped: [1]
+ original = np.array([2])
+ test_cols = np.array([1, 2])
+ result = _remap_target_indices(original, test_cols)
+ assert list(result) == [1]
+
+
+class TestPrepareMetadataWithTestFormula:
+ """Integration test: prepare_metadata with test_formula_effects."""
+
+ def test_metadata_has_test_indices_when_provided(self):
+ from mcpower import MCPower
+ from mcpower.core.simulation import prepare_metadata
+
+ model = MCPower("y = x1 + x2 + x3")
+ model.set_effects("x1=0.5, x2=0.3, x3=0.2")
+ model._apply()
+
+ metadata = prepare_metadata(model, ["x1", "x2"], test_formula_effects=["x1", "x2"])
+ assert metadata.test_column_indices is not None
+ assert list(metadata.test_column_indices) == [0, 1]
+ assert metadata.test_target_indices is not None
+ assert metadata.test_effect_count == 2
+
+ def test_metadata_no_test_indices_by_default(self):
+ from mcpower import MCPower
+ from mcpower.core.simulation import prepare_metadata
+
+ model = MCPower("y = x1 + x2")
+ model.set_effects("x1=0.5, x2=0.3")
+ model._apply()
+
+ metadata = prepare_metadata(model, ["x1", "x2"])
+ assert metadata.test_column_indices is None
+
+ def test_remap_skips_targets_not_in_test_formula(self):
+ from mcpower import MCPower
+ from mcpower.core.simulation import prepare_metadata
+
+ model = MCPower("y = x1 + x2 + x3")
+ model.set_effects("x1=0.5, x2=0.3, x3=0.2")
+ model._apply()
+
+ # target_tests = all 3, but test formula only has x1, x2
+ metadata = prepare_metadata(model, ["x1", "x2", "x3"], test_formula_effects=["x1", "x2"])
+ # test_target_indices should only have indices for x1 and x2 in X_test
+ assert len(metadata.test_target_indices) == 2
+
+
+class TestParseTargetTestsWithTestFormula:
+ """Test _parse_target_tests limits 'all' when test_formula is active."""
+
+ def test_all_expands_to_test_formula_effects_only(self):
+ from mcpower import MCPower
+
+ model = MCPower("y = x1 + x2 + x3")
+ model.set_effects("x1=0.5, x2=0.3, x3=0.2")
+ model._apply()
+
+ result = model._parse_target_tests("all", test_formula_effects=["x1", "x2"])
+ assert "x3" not in result
+ assert "x1" in result
+ assert "x2" in result
+ assert "overall" in result
+
+ def test_explicit_target_not_in_test_formula_raises(self):
+ import pytest
+
+ from mcpower import MCPower
+
+ model = MCPower("y = x1 + x2 + x3")
+ model.set_effects("x1=0.5, x2=0.3, x3=0.2")
+ model._apply()
+
+ with pytest.raises(ValueError, match="x3"):
+ model._parse_target_tests("x3", test_formula_effects=["x1", "x2"])
+
+ def test_overall_always_allowed(self):
+ from mcpower import MCPower
+
+ model = MCPower("y = x1 + x2 + x3")
+ model.set_effects("x1=0.5, x2=0.3, x3=0.2")
+ model._apply()
+
+ result = model._parse_target_tests("overall", test_formula_effects=["x1", "x2"])
+ assert "overall" in result
+
+ def test_no_test_formula_uses_all_effects(self):
+ from mcpower import MCPower
+
+ model = MCPower("y = x1 + x2 + x3")
+ model.set_effects("x1=0.5, x2=0.3, x3=0.2")
+ model._apply()
+
+ result = model._parse_target_tests("all")
+ assert "x1" in result
+ assert "x2" in result
+ assert "x3" in result
diff --git a/tests/unit/test_updates.py b/tests/unit/test_updates.py
index ee7a2a0..90b3003 100644
--- a/tests/unit/test_updates.py
+++ b/tests/unit/test_updates.py
@@ -101,14 +101,17 @@ def test_shows_warning_when_newer(self, monkeypatch):
"""Show warning when PyPI version is newer."""
monkeypatch.delenv("_MCPOWER_UPDATE_CHECKED", raising=False)
- # Write a cache file at the path the installed module actually reads from
- from datetime import datetime
-
import mcpower.utils.updates as upd_mod
+
+ # Reset the module-level dedup flag
+ upd_mod._already_checked = False
+
+ # Write a cache file at the path the module actually reads from
+ from datetime import datetime
from pathlib import Path
- cache_path = Path(upd_mod.__file__).parent.parent / ".mcpower_cache.json"
- cache_path.parent.mkdir(exist_ok=True)
+ cache_path = Path.home() / ".cache" / "mcpower" / "update_cache.json"
+ cache_path.parent.mkdir(parents=True, exist_ok=True)
cache_data = {
"last_check": datetime.now().isoformat(),
"latest_version": "99.0.0",
@@ -120,5 +123,6 @@ def test_shows_warning_when_newer(self, monkeypatch):
with pytest.warns(match="NEW MCPower VERSION"):
_check_for_updates("1.0.0")
finally:
- # Clean up the cache file
+ # Clean up the cache file and reset flag
cache_path.unlink(missing_ok=True)
+ upd_mod._already_checked = False
diff --git a/tests/unit/test_upload_data_utils.py b/tests/unit/test_upload_data_utils.py
new file mode 100644
index 0000000..c498b6a
--- /dev/null
+++ b/tests/unit/test_upload_data_utils.py
@@ -0,0 +1,62 @@
+"""Unit tests for mcpower.utils.upload_data_utils — normalize_upload_input."""
+
+import numpy as np
+import pytest
+
+from mcpower.utils.upload_data_utils import normalize_upload_input
+
+
+class TestNormalizeUploadInput:
+ """Tests for normalize_upload_input."""
+
+ def test_dict_input(self):
+ data = {"x1": [1.0, 2.0, 3.0], "x2": [4.0, 5.0, 6.0]}
+ arr, cols = normalize_upload_input(data)
+ assert cols == ["x1", "x2"]
+ assert arr.shape == (3, 2)
+ np.testing.assert_array_equal(arr[:, 0], [1.0, 2.0, 3.0])
+
+ def test_dict_with_strings(self):
+ data = {"group": ["a", "b", "a"], "x1": [1.0, 2.0, 3.0]}
+ arr, cols = normalize_upload_input(data)
+ assert arr.dtype == object
+ assert cols == ["group", "x1"]
+
+ def test_list_input(self):
+ data = [1.0, 2.0, 3.0]
+ arr, cols = normalize_upload_input(data)
+ assert arr.shape == (3, 1)
+ assert cols == ["column_1"]
+
+ def test_1d_array(self):
+ data = np.array([1.0, 2.0, 3.0])
+ arr, cols = normalize_upload_input(data)
+ assert arr.shape == (3, 1)
+ assert cols == ["column_1"]
+
+ def test_2d_array(self):
+ data = np.array([[1.0, 2.0], [3.0, 4.0]])
+ arr, cols = normalize_upload_input(data)
+ assert arr.shape == (2, 2)
+ assert cols == ["column_1", "column_2"]
+
+ def test_2d_array_with_columns(self):
+ data = np.array([[1.0, 2.0], [3.0, 4.0]])
+ arr, cols = normalize_upload_input(data, columns=["a", "b"])
+ assert cols == ["a", "b"]
+
+ def test_dataframe_input(self):
+ pd = pytest.importorskip("pandas")
+ df = pd.DataFrame({"x1": [1.0, 2.0], "x2": [3.0, 4.0]})
+ arr, cols = normalize_upload_input(df)
+ assert cols == ["x1", "x2"]
+ assert arr.shape == (2, 2)
+
+ def test_mismatched_columns_raises(self):
+ data = np.array([[1.0, 2.0], [3.0, 4.0]])
+ with pytest.raises(ValueError, match="columns length"):
+ normalize_upload_input(data, columns=["a", "b", "c"])
+
+ def test_unsupported_type_raises(self):
+ with pytest.raises(TypeError, match="data must be"):
+ normalize_upload_input("not valid data")
diff --git a/tests/unit/test_utils_mixed_models.py b/tests/unit/test_utils_mixed_models.py
new file mode 100644
index 0000000..de4d98a
--- /dev/null
+++ b/tests/unit/test_utils_mixed_models.py
@@ -0,0 +1,27 @@
+"""Tests for mcpower.utils.mixed_models backward-compat re-exports."""
+
+import threading
+
+from mcpower.utils.mixed_models import (
+ _lme_analysis_wrapper,
+ _lme_thread_local,
+ reset_warm_start_cache,
+)
+
+
+class TestReExports:
+ """Verify that the backward-compatibility re-exports resolve correctly."""
+
+ def test_lme_analysis_wrapper_is_callable(self):
+ assert callable(_lme_analysis_wrapper)
+
+ def test_lme_thread_local_is_threading_local(self):
+ assert isinstance(_lme_thread_local, threading.local)
+
+ def test_reset_warm_start_cache_is_callable(self):
+ assert callable(reset_warm_start_cache)
+
+ def test_reset_warm_start_cache_clears_params(self):
+ _lme_thread_local.warm_start_params = "dummy"
+ reset_warm_start_cache()
+ assert _lme_thread_local.warm_start_params is None