perf(engine): cache compiled routines in Thread.loadNext() #250
Open
killerdevildog wants to merge 1 commit into pmgl:master from
Bottleneck pmgl#1: loadNext() receives strings like "update()" and "draw()" and, on every single frame, creates a new Parser, parses, creates a new Compiler, and compiles. At 60fps this is 120+ full parse→compile cycles/sec for code that never changes. Each cycle allocates a Tokenizer (with lookup tables), Parser (with api_reserved), AST nodes, Compiler, opcodes arrays, and label maps, all of which are immediately garbage collected.

Fix: Add a Map<string, Routine> cache (call_cache) to Thread. On first encounter of a call string, parse and compile as before, then store the compiled Routine in the cache. On subsequent frames, return the cached Routine directly via Map.get(), bypassing the entire pipeline.

The cached routines are semantically safe to reuse: they compile to "call by name" instructions that resolve the actual function body at runtime via context.global, so source code changes are picked up without cache invalidation.

Benchmark results (Vitest/Tinybench):
- Before: 375,660 ops/sec (full frame parse+compile update()+draw())
- After: 17,222,238 ops/sec (cached Map.get)
- Speedup: ~46x faster, GC pressure eliminated

Files changed:
- runner.coffee: source of truth (CoffeeScript)
- runner.js: compiled output (ES6 class syntax)
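The caching scheme and the "call by name" safety argument from the commit message can be sketched as follows. This is a minimal illustration, not the code in runner.js: `compileCall` is a hypothetical stand-in for the real Parser and Compiler pipeline, the routine object shape is invented for the example, and only `Thread`, `loadNext`, `call_cache`, and `context.global` are names taken from the description.

```javascript
// Hypothetical stand-in for the real Parser + Compiler pipeline.
// It turns a call string such as "update()" into a routine that looks the
// function up by name on context.global every time the routine runs.
let compileCount = 0;

function compileCall(source) {
  compileCount += 1;
  const name = source.replace(/\(\)\s*$/, ""); // "update()" -> "update"
  return {
    source,
    run(context) {
      const fn = context.global[name]; // resolved at run time, not at compile time
      return typeof fn === "function" ? fn() : undefined;
    },
  };
}

class Thread {
  constructor() {
    this.call_cache = new Map(); // call string -> compiled routine
  }

  loadNext(source) {
    let routine = this.call_cache.get(source);
    if (routine === undefined) {
      routine = compileCall(source); // slow path, first encounter only
      this.call_cache.set(source, routine);
    }
    return routine;
  }
}

const thread = new Thread();
const context = { global: { update: () => "v1" } };

const first = thread.loadNext("update()").run(context);

// Simulate the user editing their code: the global is replaced,
// while the cache is left untouched.
context.global.update = () => "v2";
const second = thread.loadNext("update()").run(context);
```

In this sketch the second call reuses the cached routine yet still runs the new function body, because the lookup on `context.global` happens inside `run` rather than at compile time. A new `Thread` starts with an empty `Map`, which mirrors the per-session cache lifetime described in the PR.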
perf(engine): cache compiled routines in Thread.loadNext()
Summary
Fixes a critical per-frame performance bottleneck in the microScript v2 runtime engine. Thread.loadNext() was re-parsing and re-compiling call strings like "update()" and "draw()" every single frame, even though the strings never change.

Problem
At 60fps, loadNext() runs 120+ full parse→compile cycles per second for static call strings. Each cycle allocates:
- Tokenizer with lookup tables
- Parser with api_reserved arrays
- Compiler with opcodes arrays and label maps

All of these objects are immediately garbage collected, creating unnecessary GC pressure and wasting CPU time.
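The 120 cycles per second figure follows from two call strings per frame at 60 frames per second. A rough sketch of that arithmetic, using hypothetical placeholder objects instead of the real Tokenizer, Parser, and Compiler, and counting only the three top-level objects per cycle:

```javascript
// Assumed values: 60 frames per second, two static call strings per frame.
const FPS = 60;
const CALLS_PER_FRAME = ["update()", "draw()"];

let pipelineRuns = 0;
let objectsAllocated = 0;

// Hypothetical uncached path: every call rebuilds the whole pipeline.
function parseAndCompile(source) {
  pipelineRuns += 1;
  const tokenizer = { source, table: {} };           // stands in for Tokenizer
  const parser = { tokenizer, api_reserved: [] };    // stands in for Parser
  const compiler = { parser, opcodes: [], labels: {} }; // stands in for Compiler
  objectsAllocated += 3; // top-level objects only; nested tables and arrays are extra
  return compiler.opcodes;
}

// One second of frames.
for (let frame = 0; frame < FPS; frame++) {
  for (const call of CALLS_PER_FRAME) {
    parseAndCompile(call);
  }
}
```

After the loop, `pipelineRuns` is 120 and `objectsAllocated` is 360, all of it short-lived garbage. The real pipeline allocates more per cycle (AST nodes, lookup tables, label maps), so this count is a lower bound under the stated assumptions.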
Solution
Added a Map<string, Routine> cache (call_cache) to Thread. On the first encounter of a call string, the parse→compile path runs as before, then stores the compiled Routine in the cache. On subsequent frames, the cached Routine is returned directly via Map.get(), bypassing the entire pipeline.

Why this is safe
The cached routines compile to "call by name" instructions: they resolve the actual function body at runtime via context.global. When user source code changes, context.global.update (etc.) is updated by Runner.run(), so the cached routine automatically picks up the new function body without cache invalidation.

The cache is scoped to the Thread lifetime: each new game session creates a new MicroVM → Runner → Thread, and with it a fresh cache.

Benchmark Results (Vitest/Tinybench)
[Benchmark table not recovered; only the cell "update()" survives.]

Files Changed
- static/js/languages/microscript/v2/runner.coffee: CoffeeScript source
- static/js/languages/microscript/v2/runner.js: compiled JS output

Testing
- runner.coffee compiles cleanly with CoffeeScript
- [...] .js file logic

Benchmark Suite (patch included)
A full Vitest/Tinybench benchmark suite is available as a separate patch file (benchmarks.patch). It includes:
- [...] (benchmark_baseline.md, benchmark_postfix.md)
- [...] (possible_bottlenecks.md)
- [...] vm.Script

To apply:
benchmarks.patch