
Flaky Tests Question #8

@jordan-gillard

Description


Flaky tests such as test_conditional_prob_inf_given_vl_dist fail because of their non-deterministic nature. Monte Carlo simulations, like those used in the baseline_exposure_model fixture and in test_conditional_prob_inf_given_vl_dist, are probabilistic by design. For modelling this is great - but for unit tests it's not 😅 It's a nice feeling to have a 🟢 CI.

All you'd have to do to fix this test is set the random seed, e.g. np.random.seed(42). However, this would be misleading if the goal (as I suspect) is to verify the model's accuracy within a certain tolerance.

If the goal is to measure the model's accuracy, what if you removed the 3x retry and instead ran the model 10 times, gathering the results each time? Then you could do a statistical analysis against the mean/median/standard deviation/percentiles, etc.
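A minimal sketch of that idea, using a hypothetical run_model() as a stand-in for the Monte Carlo model (the real test would call the code behind baseline_exposure_model instead, and the expected value here is purely illustrative):

```python
import numpy as np

def run_model(rng=None):
    # Hypothetical stand-in for the Monte Carlo model under test.
    if rng is None:
        rng = np.random.default_rng()
    return float(rng.normal(loc=0.1, scale=0.01))

def test_conditional_prob_many_runs():
    # Run the model several times and assert on the aggregate,
    # not on a single noisy draw.
    results = np.array([run_model() for _ in range(10)])
    expected = 0.1  # illustrative reference value
    # The mean of n runs should sit within a few standard errors
    # of the expected value.
    stderr = results.std(ddof=1) / np.sqrt(len(results))
    assert abs(results.mean() - expected) < 5 * stderr
```

This trades the retry loop for a single assertion on a statistic whose spread shrinks as the number of runs grows.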

Another option is to change the fixed 0.002 tolerance. What if you calculated the tolerance based on the number of runs? Could an absolute tolerance of 0.002 be wishful thinking?
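One way to derive such a tolerance, sketched with illustrative numbers: for the mean of n i.i.d. Monte Carlo draws with standard deviation sigma, the standard error is sigma / sqrt(n), so a z-sigma band gives a tolerance that only fails by chance a known, small fraction of the time.

```python
import numpy as np

def monte_carlo_tolerance(sigma: float, n_samples: int, z: float = 3.0) -> float:
    # Absolute tolerance for the mean of n_samples i.i.d. draws with
    # standard deviation sigma; a 3-sigma band fails by chance ~0.3%
    # of the time. sigma could be estimated from a pilot run.
    return z * sigma / np.sqrt(n_samples)

# Illustrative: with sigma=0.01 and 2000 samples, the tolerance is ~0.00067,
# which shows whether a fixed 0.002 is generous or wishful for a given n.
tol = monte_carlo_tolerance(sigma=0.01, n_samples=2000)
```

The point is that the tolerance then scales with the simulation size instead of being a magic constant.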

It could also help to log model deviations and store them with a timestamp, so you can monitor how the deviations drift over time.
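A small sketch of that logging, appending one timestamped row per test run to a CSV (the file path and column names are just placeholders):

```python
import csv
import datetime
import pathlib

def log_deviation(observed: float, expected: float,
                  path: str = "model_deviations.csv") -> None:
    # Append one timestamped row per run so deviations can be
    # monitored over time; path and schema are illustrative.
    file = pathlib.Path(path)
    is_new = not file.exists()
    with file.open("a", newline="") as f:
        writer = csv.writer(f)
        if is_new:
            writer.writerow(["timestamp", "observed", "expected", "deviation"])
        writer.writerow([
            datetime.datetime.now(datetime.timezone.utc).isoformat(),
            observed,
            expected,
            observed - expected,
        ])
```

Over enough CI runs this gives you an empirical distribution of deviations, which also feeds back into choosing a realistic tolerance.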

I'm happy to fix this test (and get that CI ✅). It's a cool project. Just let me know what the expected behavior is and how I can help out.
