Tests like test_conditional_prob_inf_given_vl_dist are flaky because they're non-deterministic: the Monte Carlo simulations used in the baseline_exposure_model fixture and in test_conditional_prob_inf_given_vl_dist are probabilistic, so they occasionally fail by chance. For modelling this is great - but for unit tests it's not 😅 It's a nice feeling to have a 🟢 CI.
All you'd have to do to make this test pass reliably is set the random seed, e.g. np.random.seed(42). However, that would be misleading if the goal (as I suspect) is to ensure the model's accuracy within a certain tolerance.
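For reference, seeding could look like this - a minimal sketch; the newer default_rng style keeps the seed local to one test instead of mutating global state:

```python
import numpy as np

# Legacy style: seeds NumPy's global generator, which affects every test
# in the process that uses np.random.
np.random.seed(42)

# Preferred modern style: a local Generator with an explicit seed, which
# can be passed into the model/fixture without leaking into other tests.
rng = np.random.default_rng(42)
samples = rng.normal(size=1000)  # same draws on every run
```

Either way the test becomes deterministic, but it then only checks one fixed trajectory of the simulation.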
If the goal is to measure the model's accuracy, what if you removed the 3x retry and instead ran the model 10 times, gathering the results each time? Then you could do a statistical analysis against the mean/median/standard deviation/percentiles, etc.
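A sketch of that repeat-and-aggregate idea, with a stand-in run_model - hypothetical, since I'm not tied to the real model's API here:

```python
import numpy as np

def run_model(rng):
    # Stand-in for the real Monte Carlo model: a noisy estimate of a
    # quantity whose true value (0.5) we know, so the check is meaningful.
    return rng.uniform(size=10_000).mean()

rng = np.random.default_rng(0)
results = np.array([run_model(rng) for _ in range(10)])

mean = results.mean()
std = results.std(ddof=1)

# Assert against the aggregate of 10 runs, not a single noisy run.
# The margin here is deliberately generous (many standard errors wide),
# so a failure signals a real regression rather than bad luck.
assert abs(mean - 0.5) < 0.01
```

The same pattern would replace the 3x retry: one deterministic pass/fail decision based on the distribution of results.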
Another alternative is to change the fixed 0.002 tolerance. What if you calculated the tolerance from the number of runs? Could an absolute tolerance of 0.002 be wishful thinking?
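If the per-run spread were known, the tolerance could be derived from the standard error of the mean - a sketch with made-up numbers, not measurements from the actual model:

```python
import numpy as np

def required_tolerance(per_run_std, n_runs, z=3.0):
    # The standard error of the mean shrinks with sqrt(n), so a z-sigma
    # tolerance is: tol = z * sigma / sqrt(n).
    return z * per_run_std / np.sqrt(n_runs)

# Example: with a per-run standard deviation of 0.01, holding a 0.002
# absolute tolerance at the 3-sigma level needs
# (3 * 0.01 / 0.002)^2 = 225 runs.
tol = required_tolerance(0.01, 225)
```

Working this backwards from the model's observed variance would show whether 0.002 is achievable with the current number of samples or is indeed wishful thinking.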
It could also help to log model deviations and store them with timestamps. Then you could monitor how the deviations drift over time.
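A minimal sketch of that logging idea - the file name and CSV schema are made up, not part of the project:

```python
import csv
import datetime
import pathlib

def log_deviation(observed, expected, path="model_deviations.csv"):
    # Append one timestamped row per test run so deviations can be
    # monitored over time (e.g. plotted, or checked for drift in CI).
    row = [
        datetime.datetime.now(datetime.timezone.utc).isoformat(),
        observed,
        expected,
        observed - expected,
    ]
    is_new_file = not pathlib.Path(path).exists()
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        if is_new_file:
            writer.writerow(["timestamp", "observed", "expected", "deviation"])
        writer.writerow(row)

log_deviation(0.1523, 0.1500)
```

In CI this file could be uploaded as a build artifact, giving a free history of how close the model lands to the expected value on every run.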
I'm happy to fix this test (and get that CI ✅). It's a cool project. Just let me know what the expected behavior is & how I can help out.