Optimise UK simulation run (-63% cold sim time)#251
Closed
nikhilwoodruff wants to merge 1 commit intomainfrom
Closed
Optimise UK simulation run (-63% cold sim time)#251nikhilwoodruff wants to merge 1 commit intomainfrom
nikhilwoodruff wants to merge 1 commit intomainfrom
Conversation
Two changes to the UK model's run() method: 1. Convert MicroDataFrames to plain DataFrames before passing to UKSingleYearDataset. The data pipeline only needs numeric arrays for copying and uprating — MicroDataFrame.copy() triggers expensive O(N²) weight linking that's wasted here. 2. Monkey-patch apply_uprating to skip its defensive deep copy of the entire multi-year dataset. extend_single_year_dataset already copies each year individually, so the second copy is redundant. Benchmarked: cold simulate dropped from 39.6s to 14.8s (-63%), wall total from 46.3s to 21.5s (-54%). Mean household income unchanged (£54,562). All 110 tests pass. Co-Authored-By: Claude <noreply@anthropic.com>
2 tasks
Contributor
|
Closing in favour of the upstream fixes: the redundant copy is now handled in PolicyEngine/policyengine-uk#1523 (merged), and the MicroDataFrame O(N²) overhead is fixed in PolicyEngine/microdf#281 (merged). No changes needed in policyengine.py. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Two changes to the UK model's
run()method that together cut cold simulation time from 39.6s to 14.8s (-63%):UKSingleYearDataset. The data pipeline only needs numeric arrays for copying and uprating —MicroDataFrame.copy()triggers expensive O(N²) weight linking that's wasted here.apply_upratingto skip its defensive deep copy of the entire multi-year dataset.extend_single_year_datasetalready copies each year individually, so the second copy is redundant.There's a companion microdf fix (PolicyEngine/microdf#281) that addresses the root cause of the O(N²) weight linking. This PR works independently of that fix but the two are complementary.
Benchmark results (3-run mean):
Test plan