replace SimpleNamespace configs with typed dataclasses and enums by matthewcornell · Pull Request #24 · reichlab/idmodels

matthewcornell · 2026-02-23T21:04:33Z

This PR is part of Operational models refactoring ideas #42. It is paired with the operational-models PR [replace SimpleNamespace configs with typed dataclasses and enums #43].

This PR replaces SimpleNamespace configs with typed dataclasses and enums. It is the first step in Operational models refactoring ideas #42. Specifically, we introduce a dedicated config.py module with concrete, typed classes that replace the ad-hoc SimpleNamespace objects used for model and run configuration. Key changes:

Add DataSource, Disease, PowerTransform, and PoolingStrategy enums
Add abstract ModelConfig and RunConfig dataclasses with concrete subclasses: SARIXModelConfig, SARIXFourierModelConfig, GBQRModelConfig, SARIXRunConfig, and GBQRRunConfig
Export all config classes and models from __init__.py
Update gbqr.py and sarix.py to consume typed config objects
Update integration tests to construct typed configs instead of SimpleNamespace
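
To make the shape of these changes concrete, here is a minimal sketch of what moving from SimpleNamespace to typed configs can look like. It is not the contents of config.py: the PoolingStrategy string values, the SHARED member, and the model_name field are assumptions made for illustration, and ModelConfig is shown as a plain dataclass rather than an abstract one.

```python
from dataclasses import dataclass, field
from enum import Enum
from types import SimpleNamespace


class Disease(str, Enum):
    FLU = "flu"
    COVID = "covid"


class PoolingStrategy(str, Enum):
    NONE = "none"      # assumed string value
    SHARED = "shared"  # assumed member, based on the "shared" string in the tests


@dataclass
class ModelConfig:
    model_name: str  # hypothetical field, for illustration only


@dataclass
class SARIXModelConfig(ModelConfig):
    theta_pooling: PoolingStrategy = PoolingStrategy.NONE
    sigma_pooling: PoolingStrategy = PoolingStrategy.NONE
    x: list = field(default_factory=list)


# Before: a SimpleNamespace accepts any attribute name and any value.
old_config = SimpleNamespace(theta_poolng="shraed")  # two typos, no error

# After: a misspelled field name is rejected when the config is built...
try:
    SARIXModelConfig(model_name="sarix", theta_poolng=PoolingStrategy.SHARED)
    field_error = None
except TypeError as exc:
    field_error = exc

# ...and a misspelled value is rejected when the enum is looked up.
try:
    PoolingStrategy("shraed")
    value_error = None
except ValueError as exc:
    value_error = exc

config = SARIXModelConfig(model_name="sarix", theta_pooling=PoolingStrategy.SHARED)
print(config.theta_pooling.value, field_error is not None, value_error is not None)
```

Because the enums mix in str, existing string comparisons such as `config.theta_pooling == "shared"` keep working while callers migrate.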

We also:

Bump version to 1.3.0
Update pyproject.toml to pin pandas to 2.* (pandas 3.* introduced breaking changes)
Update uv.lock to use latest iddata commit


lshandross

I had a few comments, questions, and places where I think we should get Nick's input, but this looks so much better, Matt!

lshandross · 2026-02-24T16:34:06Z

src/idmodels/config.py

+class Disease(str, Enum):
+    FLU = "flu"
+    COVID = "covid"
+


RSV should be added

lshandross · 2026-02-24T16:35:50Z

src/idmodels/config.py

+
+class PowerTransform(str, Enum):
+    FOURTH_ROOT = "4rt"
+


Agreed, we should have a NONE option as well.
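
As a rough illustration of why a NONE member is useful, the sketch below shows a consumer that branches on the enum and treats NONE as the identity. The "none" string value and the apply_power_transform helper are assumptions, not code from gbqr.py or sarix.py, and the real transform may involve more than a plain fourth root.

```python
from enum import Enum


class PowerTransform(str, Enum):
    FOURTH_ROOT = "4rt"
    NONE = "none"  # assumed string value for the proposed member


def apply_power_transform(values, transform):
    """Hypothetical helper: transform a list of non-negative numbers."""
    if transform is PowerTransform.FOURTH_ROOT:
        return [v ** 0.25 for v in values]
    if transform is PowerTransform.NONE:
        return list(values)  # identity: return an unmodified copy
    raise ValueError(f"unsupported power transform: {transform!r}")


raw = [0.0, 16.0, 81.0]
print(apply_power_transform(raw, PowerTransform.NONE))
print(apply_power_transform(raw, PowerTransform.FOURTH_ROOT))
```

With an explicit NONE member, "no transform" becomes a value the type checker and readers can see, instead of a missing or empty string.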

lshandross · 2026-02-24T16:42:01Z

src/idmodels/config.py

+    theta_pooling: PoolingStrategy = PoolingStrategy.NONE
+    sigma_pooling: PoolingStrategy = PoolingStrategy.NONE
+    x: list = field(default_factory=list)
+


I think we should check with @nickreich whether these are indeed the defaults we want for the SARIX model class parameters (and we should do the same for the GBQR model class).


lshandross · 2026-02-24T16:44:14Z

src/idmodels/config.py

+
+
+@dataclass
+class SARIXRunConfig(RunConfig):


We discussed whether it made sense to move these parameters to the model class and decided to get @nickreich's opinion. I think it could make sense to eliminate the SARIX and GBQR run config classes if possible, since they store only a few parameters unique to the particular model.

Agreed! We could have documentation on some of the enums within the classes for users?

lshandross · 2026-02-24T16:52:10Z

tests/integration/test_sarix.py

    date = datetime.date.fromisoformat("2024-01-06")
    fips_codes = ["US", "01", "02", "04", "05"] # fewer locs for faster testing
-    # model_config = create_test_sarix_model_config(main_source=["nhsn"], theta_pooling="shared", sigma_pooling="none")
+    # model_config = create_test_sarix_model_config(main_source=[DataSource.NHSN], theta_pooling="shared", sigma_pooling="none")


I think that I had left this commented code in by mistake. We can get rid of it

lshandross · 2026-02-24T16:53:26Z

tests/integration/test_sarix.py

    date = datetime.date.fromisoformat("2024-01-06")
    fips_codes = ["US", "01", "02", "04", "05"] # fewer locs for faster testing
-    # model_config = create_test_sarix_model_config(main_source=["nhsn"], theta_pooling="shared", sigma_pooling="none")
+    # model_config = create_test_sarix_model_config(main_source=[DataSource.NHSN], theta_pooling="shared", sigma_pooling="none")


Again, I think we can get rid of this commented code

trobacker · 2026-02-24T20:09:37Z

src/idmodels/config.py

I like this! One thing to keep in mind is that if model development goes through many iterations, we may end up with a lot of enums in the configs. I think we should keep some kind of doc describing their purpose and intent, e.g. what "Pooling" means to a user of the model (even if it may be obvious).
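
One lightweight way to act on this is to put the user-facing explanation in each enum's docstring and add a small check that no member is left undocumented. The sketch below is only an illustration: the member descriptions are placeholders to be confirmed by the model authors, the "none" and "shared" values are assumed, and undocumented_members is a hypothetical helper.

```python
from enum import Enum


class PoolingStrategy(str, Enum):
    """How a model parameter is pooled across locations.

    The descriptions below are placeholders to be confirmed by the model authors.

    NONE: each location gets its own value of the parameter.
    SHARED: all locations share a single value of the parameter.
    """

    NONE = "none"      # assumed string value
    SHARED = "shared"  # assumed member


def undocumented_members(enum_cls):
    """Return names of members that the class docstring never mentions."""
    doc = enum_cls.__doc__ or ""
    return [member.name for member in enum_cls if f"{member.name}:" not in doc]


print(undocumented_members(PoolingStrategy))
```

A unit test could call undocumented_members on every config enum, so that adding a member without describing it fails CI.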

trobacker

This is great! I left a comment about documenting the enums for users of the code and supported some of @lshandross's comments. I'll leave approval until after one more review, by Nick.

nickreich

A few small comments

nickreich · 2026-03-04T03:09:10Z

src/idmodels/config.py

+    theta_pooling: PoolingStrategy = PoolingStrategy.NONE
+    sigma_pooling: PoolingStrategy = PoolingStrategy.NONE
+    x: list = field(default_factory=list)
+


I don't feel that strongly about what the defaults are/should be. Defaulting to NONE seems reasonable to me.
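
For reference, the defaults in the hunk above behave as follows in a small sketch (the "none" string value is assumed, and the class is cut down to the three fields shown). The list field uses default_factory because dataclasses reject a bare mutable default.

```python
from dataclasses import dataclass, field
from enum import Enum


class PoolingStrategy(str, Enum):
    NONE = "none"  # assumed string value


@dataclass
class SARIXModelConfig:
    theta_pooling: PoolingStrategy = PoolingStrategy.NONE
    sigma_pooling: PoolingStrategy = PoolingStrategy.NONE
    x: list = field(default_factory=list)


a = SARIXModelConfig()
b = SARIXModelConfig()
a.x.append("some_covariate")  # hypothetical covariate name
print(a.x, b.x)  # each instance owns its own list

# Writing `x: list = []` instead is refused when the class is created.
try:
    @dataclass
    class Broken:
        x: list = []
    mutable_default_error = None
except ValueError as exc:
    mutable_default_error = exc
```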

nickreich · 2026-03-04T03:45:38Z

Re: whether to keep RunConfig subclasses or move inference params to ModelConfig

I think the inference/fitting parameters (num_warmup, num_samples, num_chains for SARIX; num_bags, bag_frac_samples for GBQR) belong on ModelConfig, not a model-specific RunConfig. A few reasons:

They're model-specific. SARIX has warmup/samples/chains, GBQR has bags/bag_frac. These are different knobs for different estimation procedures — they don't belong in a shared RunConfig base class, and having model-specific RunConfig subclasses just to hold them feels like the wrong abstraction.
The current code is already inconsistent. In operational-models, short_run mutates model_config.num_bags for GBQR but run_config.num_warmup for SARIX. Putting them all on ModelConfig resolves this.
It simplifies RunConfig. If inference params move to ModelConfig, the remaining RunConfig fields (disease, ref_date, locations, quantiles, output paths) are truly model-agnostic. You can drop SARIXRunConfig and GBQRRunConfig entirely and have a single RunConfig class.
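
A minimal sketch of the end state these points describe: one concrete RunConfig holding only model-agnostic fields, paired with whichever model config is in use. The RunConfig field names and the describe_run helper are hypothetical; only the kinds of fields (disease, ref_date, locations, quantiles, output paths) come from the comment.

```python
import datetime
from dataclasses import dataclass
from enum import Enum
from pathlib import Path


class Disease(str, Enum):
    FLU = "flu"
    COVID = "covid"


@dataclass
class RunConfig:
    """Model-agnostic settings; field names are hypothetical."""
    disease: Disease
    ref_date: datetime.date
    locations: list
    q_levels: list
    output_root: Path


@dataclass
class SARIXModelConfig:
    num_warmup: int = 2000
    num_samples: int = 2000
    num_chains: int = 1


@dataclass
class GBQRModelConfig:
    num_bags: int = 100
    bag_frac_samples: float = 0.7


def describe_run(model_config, run_config):
    """Hypothetical entry point: any model config pairs with the one RunConfig."""
    name = type(model_config).__name__
    return f"{name} for {run_config.disease.value} on {run_config.ref_date}"


run_config = RunConfig(
    disease=Disease.FLU,
    ref_date=datetime.date.fromisoformat("2024-01-06"),
    locations=["US", "01"],
    q_levels=[0.25, 0.5, 0.75],
    output_root=Path("output"),
)
print(describe_run(SARIXModelConfig(), run_config))
print(describe_run(GBQRModelConfig(), run_config))
```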

On formalizing short_run: Short runs are a core workflow (used in testing, CI, and development), so it's worth making them a documented part of the model config rather than leaving them as ad-hoc mutations in caller scripts. A lightweight way to do this: put production defaults on the ModelConfig fields, and add a short_run() method that each model subclass defines:

import dataclasses
from dataclasses import dataclass

@dataclass
class SARIXModelConfig(ModelConfig):
    # ... structural params ...
    num_warmup: int = 2000
    num_samples: int = 2000
    num_chains: int = 1

    def short_run(self):
        """Return a copy with reduced inference intensity for testing."""
        return dataclasses.replace(self, num_warmup=100, num_samples=100)

@dataclass
class GBQRModelConfig(ModelConfig):
    # ... structural params ...
    num_bags: int = 100
    bag_frac_samples: float = 0.7

    def short_run(self):
        """Return a copy with fewer bags for testing."""
        return dataclasses.replace(self, num_bags=10)

This way each model defines what "short run" means for itself, it's discoverable/documented, and the caller just does model_config.short_run() without needing to know which knobs to turn. The q_levels reduction for short runs would stay in the caller since that's an output format concern, not a model intensity concern.
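
On the caller side, that split might look like the sketch below. The GBQR defaults are copied from the suggestion above; the quantile lists and the configure helper are hypothetical stand-ins for whatever the caller scripts in operational-models actually do.

```python
import dataclasses
from dataclasses import dataclass


@dataclass
class GBQRModelConfig:
    # Defaults copied from the suggestion above; structural params omitted.
    num_bags: int = 100
    bag_frac_samples: float = 0.7

    def short_run(self):
        """Return a copy with reduced inference intensity for testing."""
        return dataclasses.replace(self, num_bags=10)


# Hypothetical quantile levels; the real lists live in the caller scripts.
FULL_Q_LEVELS = [0.025, 0.25, 0.5, 0.75, 0.975]
SHORT_Q_LEVELS = [0.25, 0.5, 0.75]


def configure(model_config, short_run):
    """Hypothetical caller-side helper: pick the model config and q_levels."""
    if short_run:
        return model_config.short_run(), SHORT_Q_LEVELS
    return model_config, FULL_Q_LEVELS


production = GBQRModelConfig()
quick, q_levels = configure(production, short_run=True)
print(production.num_bags, quick.num_bags, q_levels)
```

Since dataclasses.replace returns a new object, the production config is left untouched, which avoids the in-place mutation the current scripts rely on.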

nickreich · 2026-03-04T03:47:03Z

The above was drafted (obviously, I think) with Claude, but with substantial steering from me, especially toward treating the short_run flag as a formal part of how a model is defined.

matthewcornell · 2026-03-05T22:05:38Z

Thanks, @nickreich ! I've integrated your suggestions into the plan at this comment.

matthewcornell · 2026-03-10T22:22:23Z

@nickreich Re: whether to keep RunConfig subclasses or move inference params to ModelConfig:

I think the inference/fitting parameters (num_warmup, num_samples, num_chains for SARIX; num_bags, bag_frac_samples for GBQR) belong on ModelConfig, not a model-specific RunConfig.

num_warmup, num_samples, and num_chains are currently in SARIXRunConfig, and I agree they should be moved to SARIXModelConfig, after which SARIXRunConfig can be deleted, as you outlined. However, GBQR's num_bags and bag_frac_samples are already in GBQRModelConfig. I wonder if you meant that save_feat_importance should be moved from GBQRRunConfig to GBQRModelConfig, and GBQRRunConfig then deleted?

nickreich · 2026-03-11T01:35:27Z

I mean, I think we need to move it to the GBQRModelConfig if we want to remove the RunConfig objects. It doesn't feel like the save_feat_importance "fits" there as well as the others since it's not really a model parameter, but I guess it's just ok and we should make the move.

matthewcornell · 2026-03-11T18:00:57Z

I mean, I think we need to move [save_feat_importance] to the GBQRModelConfig if we want to remove the RunConfig objects. It doesn't feel like the save_feat_importance "fits" there as well as the others since it's not really a model parameter, but I guess it's just ok and we should make the move.

@lshandross , @trobacker What are your thoughts about save_feat_importance? I'm a little concerned that Nick's not seeing an obvious place for it to go.

lshandross · 2026-03-11T20:23:23Z

Another suggestion would be to make it an optional argument (I don't know if that's the correct terminology) in the general run config that only gets used if the model is a GBQR. But I don't know if I have a strong preference between my suggestion and Nick's.
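
For comparison, this alternative could be sketched as an optional field on the shared RunConfig that only the GBQR code path reads. Everything here besides the save_feat_importance name is hypothetical, including the ref_date field type and the helper function.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunConfig:
    ref_date: str  # hypothetical stand-in for the model-agnostic fields
    # Only meaningful for GBQR; None means "not specified".
    save_feat_importance: Optional[bool] = None


def should_save_feat_importance(model_name, run_config):
    """Hypothetical helper: the flag is honored only for the GBQR model."""
    if model_name != "gbqr":
        return False
    return bool(run_config.save_feat_importance)


run_config = RunConfig(ref_date="2024-01-06", save_feat_importance=True)
print(should_save_feat_importance("gbqr", run_config))
print(should_save_feat_importance("sarix", run_config))
```

The trade-off is that RunConfig then carries one field that is silently ignored by other models, whereas placing it on GBQRModelConfig keeps RunConfig fully model-agnostic.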


matthewcornell · 2026-03-12T20:15:38Z

I've updated the code to include the suggested changes. Please review d74fe38.

lshandross

I have one comment but it's not blocking. Thank you for making the changes, Matt!

lshandross · 2026-03-13T16:44:29Z

tests/integration/test_sarix.py


-def create_test_sarix_run_config(ref_date, states, hsas, num, tmp_path):
-    run_config = SARIXRunConfig(
+def create_test_sarix_run_config(ref_date, states, hsas, tmp_path):


I wonder if this helper function should be moved out of this script into a separate module (the helpers that create the SARIX and GBQR model configs could move there too) and renamed to just create_test_run_config, so that it can be reused by both the SARIX and GBQR tests. If others agree, maybe we can file an issue for it, since it doesn't really belong in this PR.
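
A rough sketch of what such a shared helper could look like, keeping the argument list from the diff above. The RunConfig fields and the "model-output" directory name are assumptions; in the real tests tmp_path would be pytest's fixture rather than a tempfile directory.

```python
import datetime
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class RunConfig:
    """Cut-down stand-in for the real RunConfig; field names are hypothetical."""
    ref_date: datetime.date
    states: list
    hsas: list
    output_root: Path


def create_test_run_config(ref_date, states, hsas, tmp_path):
    """Build a RunConfig usable by both the SARIX and GBQR integration tests."""
    return RunConfig(
        ref_date=ref_date,
        states=list(states),
        hsas=list(hsas),
        output_root=Path(tmp_path) / "model-output",  # hypothetical layout
    )


with tempfile.TemporaryDirectory() as tmp_dir:
    run_config = create_test_run_config(
        datetime.date.fromisoformat("2024-01-06"),
        states=["US", "01", "02", "04", "05"],
        hsas=[],
        tmp_path=tmp_dir,
    )
    print(run_config.output_root.name, len(run_config.states))
```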

matthewcornell added 3 commits February 23, 2026 16:00

replace SimpleNamespace configs with typed dataclasses and enums. upd…

0d0698a

…ate pyproject.toml to pin pandas to 2.* (pandas 3.* introduced breaking changes). update uv.lock to use latest iddata commit

fix pyproject.toml @ git syntax (caused CI run-checks to fail)

5b3ca1c

fix ruff lint errors (Q000 quotes, I001 import order)

de6be08

matthewcornell mentioned this pull request Feb 24, 2026

replace SimpleNamespace configs with typed dataclasses and enums reichlab/operational-models#43

Open

lshandross requested changes Feb 24, 2026


trobacker reviewed Feb 24, 2026


nickreich reviewed Mar 4, 2026


removed SARIXRunConfig and GBQRRunConfig, moving instance variables t…

d74fe38

…o corresponding *ModelConfig classes. made RunConfig concrete. added Disease.RSV, PowerTransform.NONE, updated gbqr.py to handle Disease.RSV and PowerTransform.NONE

lshandross approved these changes Mar 13, 2026

