Open
Labels: enhancement (New feature or request)
Description
Summary
Meta-issue tracking 13 performance optimization opportunities identified in a code audit.
Performance Items
| ID | Location | Issue | Expected Gain |
|---|---|---|---|
| P1 | cansim.R:365-379 | fold_in_metadata repeated left_joins | 50-70% |
| P2 | cansim_metadata.R:98-111 | parse_metadata nested loops | 60-80% |
| P3 | cansim_parquet.R:675-715 | cached_tables repeated reads | 65-85% |
| P4 | cansim_metadata.R:127-145 | hierarchy O(n) cycle detection | 40-60% |
| P5 | cansim.R:156-162 | gsub loop in factor conversion | 30-50% |
| P6 | cansim_parquet.R:254-263 | field cache read miss | 70-90% |
| P7 | cansim_parquet.R:219-232 | csv2sqlite transform copies | 25-40% |
| P8 | cansim_vectors.R:20-24 | lapply to vapply | 30-45% |
| P9 | Multiple files | French string constants | 20-35% |
| P10 | cansim.R:64 | unnecessary as_tibble | 5-15% |
| P11 | cansim_metadata.R:123-124 | hash lookup for parents | 80-95% |
| P12 | cansim_vectors.R:244-251 | coordinate metadata loop | 35-50% |
| P13 | cansim.R:885,889,898 | lapply unlist chains | 20-30% |
Proposed Implementation Plan
- PR 4: Hot Paths (P1, P2, P5, P13)
- PR 5: Caching & I/O (P3, P6, P7, P10)
- PR 6: Lookups & Vectorization (P4, P8, P9, P11, P12)
All performance PRs will include microbenchmark results.
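As a rough illustration of what such a benchmark could look like, here is a minimal sketch for a P8-style change (`lapply` to `vapply`). The two helper functions and the `vectors` input are hypothetical stand-ins, not code from the cansim source, and the sketch assumes the `microbenchmark` package is the intended tool; timing is skipped if it is not installed.

```r
# Hypothetical "before" and "after" implementations; neither is taken
# from cansim_vectors.R. Both strip a leading "v" from vector names.
strip_prefix_lapply <- function(x) {
  unlist(lapply(x, function(v) sub("^v", "", v)))
}
strip_prefix_vapply <- function(x) {
  vapply(x, function(v) sub("^v", "", v), character(1), USE.NAMES = FALSE)
}

# Made-up input for illustration only.
vectors <- paste0("v", seq_len(1000))

# Check that both versions agree before comparing their timings.
stopifnot(identical(strip_prefix_lapply(vectors), strip_prefix_vapply(vectors)))

# Only run the timing comparison when microbenchmark is available.
if (requireNamespace("microbenchmark", quietly = TRUE)) {
  print(microbenchmark::microbenchmark(
    lapply = strip_prefix_lapply(vectors),
    vapply = strip_prefix_vapply(vectors),
    times = 50
  ))
}
```

Checking equivalence first keeps a reported gain from reflecting a change in behavior instead of a change in speed.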
The code audit estimates a potential overall throughput improvement of 35-45%.