# Conversation
Add `--moi-cache <dir>` flag to cache pre-compiled dependency scopes to disk, enabling incremental type-checking in `--check` mode. When a dependency's source hash and transitive dependency fingerprints match the cached entry, the type-checker loads the cached `Scope.t` directly instead of re-parsing and re-checking the file.

Implementation:
- Custom binary serialization for `Scope.t` and `Type.t`, handling cyclic type constructors via a create-then-fixup pattern
- Merkle fingerprinting (SHA-256 of source hash + sorted dep fingerprints) for transitive cache invalidation
- Compiler version (`Source_id.id`) embedded in cache headers to reject stale caches from different compiler builds
- Atomic file writes (write to `.tmp`, then rename) for crash safety
- Mixin libraries correctly skipped (not cacheable)

Currently restricted to `--check` mode. Compile-mode IR caching is planned as a follow-up (see `.cursor/plans/`).

Made-with: Cursor
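The Merkle fingerprinting described above can be sketched as follows. This is a minimal illustration, not the actual implementation: the real code uses SHA-256, while OCaml's stdlib `Digest` (MD5) stands in here so the sketch is self-contained, and the `fingerprint` function name is hypothetical.

```ocaml
(* Sketch of Merkle fingerprinting: a library's fingerprint hashes its own
   source hash together with the sorted fingerprints of its direct deps.
   Since each dep fingerprint already covers *its* deps, a change anywhere
   in the transitive closure changes the root fingerprint.
   NOTE: stdlib Digest (MD5) stands in for the SHA-256 used by the real code. *)
let fingerprint ~(source : string) ~(dep_fingerprints : string list) : string =
  let source_hash = Digest.to_hex (Digest.string source) in
  let deps = List.sort compare dep_fingerprints in
  Digest.to_hex (Digest.string (String.concat "\n" (source_hash :: deps)))

let () =
  let lib_a = fingerprint ~source:"module A = { ... }" ~dep_fingerprints:[] in
  let lib_b = fingerprint ~source:"import A ..." ~dep_fingerprints:[ lib_a ] in
  (* Editing A changes A's fingerprint, which transitively changes B's. *)
  let lib_a' = fingerprint ~source:"module A = { (* edited *) }" ~dep_fingerprints:[] in
  let lib_b' = fingerprint ~source:"import A ..." ~dep_fingerprints:[ lib_a' ] in
  assert (lib_a <> lib_a');
  assert (lib_b <> lib_b');
  print_endline "transitive invalidation ok"
```

Sorting the dependency fingerprints makes the key insensitive to import order, so only actual content changes invalidate the cache.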
Add detailed plan for caching post-IR-pass library decs to skip parsing, type-checking, lowering, and all 7 IR passes for unchanged dependencies during compilation. Expected ~74% faster compiles. Made-with: Cursor
Cache post-IR-pass library declarations to skip lowering, IR passes, and part of the codegen work for unchanged dependencies. On a warm cache, only the main program is lowered and passed through the IR transforms, then linked with the cached library IR before codegen.

Benchmarked on motoko-core Map.test.mo:
- Baseline: 1.56s → Warm cache: 0.68s (2.3x speedup)

Key changes:
- `ir_cache.ml`: Marshal-based serialization of `Ir.dec list` + id_stamps with binary hash validation and atomic writes
- `pipeline.ml`: split compile path into cached/uncached, with `compile_combined_prog` handling link + codegen
- `const.ml`: accept `known_const` parameter for fragment analysis
- `cons.ml`: `bump_stamps_past` to prevent stamp collisions after deserialization
- `construct.ml`: `get`/`set_id_stamps` for fresh-name counter continuity

Made-with: Cursor
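The "write to `.tmp`, then rename" pattern used for crash safety can be sketched as below. This is an illustrative sketch, not the `ir_cache.ml` API; the `write_atomically` name is hypothetical.

```ocaml
(* Sketch of crash-safe cache writes: write the payload to a temporary
   sibling file, then rename it over the destination.  rename(2) is atomic
   on POSIX filesystems, so readers never observe a half-written cache file. *)
let write_atomically (path : string) (data : string) : unit =
  let tmp = path ^ ".tmp" in
  let oc = open_out_bin tmp in
  output_string oc data;
  close_out oc;          (* close (and flush) before the rename *)
  Sys.rename tmp path    (* atomic replacement of the destination *)

let () =
  let path = Filename.temp_file "moi-cache" ".moic" in
  write_atomically path "cached payload";
  let ic = open_in_bin path in
  let contents = really_input_string ic (in_channel_length ic) in
  close_in ic;
  Sys.remove path;
  assert (contents = "cached payload");
  print_endline "atomic write ok"
```

If the process crashes mid-write, only the `.tmp` file is corrupt; the previous cache entry (or its absence) remains intact.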
No, it was a quick POC measuring the speedup potential by caching the typing env and the IR.
Something is a bit fishy, since I don't think we currently run IR passes on library code at all; we just run them on the combined IR. Won't this produce more and redundant code, like multiple versions of the same …
Yes, it would. I was just experimenting with this idea first, trying to measure how much faster compilation would get if all library IR were cached.
# Experiment: Incremental Compilation
Experimental incremental compilation support for `moc`: two caching strategies to reduce compile times when dependencies (packages like `mo:base`, `mo:core`) stay the same. Both use the `--moi-cache <dir>` flag.

## Experiment 1: Scope caching (`.moi`) — `--check` mode
.moi) —--checkmodeGoal: Speed up repeated type-checking by caching each library's type-checked scope.
**Approach:** Serialize `Scope.t` per library to `.moi` files using a custom binary format, handling cyclic type constructors. The cache key is a Merkle hash of the library's source content and its transitive dependencies. On a cache hit, type-checking for that library is skipped entirely.

**Results** (Map.test.mo, motoko-core, `$(mops sources)`, 78 dependency files): compared `--check` with no cache, with a cold `.moi` cache, and with a warm `.moi` cache. The scope cache saves ~58ms in check mode (~20% of the check work).
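The cache-hit path can be sketched as below. This is a hypothetical sketch: `scope` stands in for `Scope.t`, `Marshal` stands in for the custom binary format, and the `check_library` name and its parameters are illustrative.

```ocaml
(* Sketch: look up a library's scope by its Merkle fingerprint.  Because the
   fingerprint already covers the source and all transitive dependencies, a
   file that exists under that name is valid by construction -- no further
   validation is needed on a hit.
   NOTE: `scope` and Marshal stand in for Scope.t and the custom format. *)
type scope = { exports : (string * string) list }

let check_library ~cache_dir ~fingerprint ~(type_check : unit -> scope) : scope =
  let path = Filename.concat cache_dir (fingerprint ^ ".moi") in
  if Sys.file_exists path then begin
    (* Cache hit: load the scope, skipping parsing and type-checking. *)
    let ic = open_in_bin path in
    let (s : scope) = Marshal.from_channel ic in
    close_in ic; s
  end else begin
    (* Cache miss: type-check, then populate the cache for next time. *)
    let s = type_check () in
    let oc = open_out_bin path in
    Marshal.to_channel oc s [];
    close_out oc; s
  end

let () =
  Random.self_init ();
  let dir = Filename.get_temp_dir_name () in
  let fp = Digest.to_hex (Digest.string (string_of_int (Random.bits ()))) in
  let runs = ref 0 in
  let tc () = incr runs; { exports = [ ("Map", "module") ] } in
  let s1 = check_library ~cache_dir:dir ~fingerprint:fp ~type_check:tc in
  let s2 = check_library ~cache_dir:dir ~fingerprint:fp ~type_check:tc in
  assert (s1 = s2);
  assert (!runs = 1);  (* the second call was a cache hit *)
  print_endline "cache hit ok"
```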
**Conclusion:** Modest speedup. Type-checking is already fast, so caching it saves little in absolute terms.
## Profiling: where does compile time go?
This experiment also revealed the full compile-time breakdown (under `-c`), motivating experiment 2: the seven whole-program IR passes (`erase_typ_field`, `show`, `eq`, `await`, `async`, `tailcall`, `const`) dominate compile time at 73%.
erase_typ_field,show,eq,await,async,tailcall,const) dominate compile time at 73%.Experiment 2: IR caching (
.moic) —-ccompile modeGoal: Speed up repeated compilation by caching post-IR-pass library declarations.
**Approach:** After lowering and running all seven IR passes on library code, serialize the resulting `Ir.dec list` to `.moic` files using `Marshal`. On a warm cache, the compiler skips lowering and all IR passes for unchanged libraries; only the main program goes through the full pipeline, then gets linked with the cached library IR before codegen.

**Results** (Map.test.mo, motoko-core, `$(mops sources)`): baseline `-c` compile 1.56s vs. 0.68s with a warm cache (2.3x speedup). Cold and warm runs produce byte-identical Wasm, and cache invalidation works correctly.
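Since `Marshal` output is only valid for the exact compiler build that produced it, each cache file carries a header identifying that build (the commit messages mention `Source_id.id` for this). A hypothetical sketch, with the compiler hash reduced to a digest of a version string and `string list` standing in for `Ir.dec list`:

```ocaml
(* Sketch of .moic (de)serialization: a build-identifying digest is written
   before the Marshal payload, and mismatching headers are rejected so a
   different compiler build never deserializes an incompatible payload.
   NOTE: the "compiler hash" here is a stand-in; the real code ties the
   header to the actual compiler build. *)
let compiler_hash = Digest.string "moc-demo-build-1"

let save path (decs : string list) =
  let oc = open_out_bin path in
  Digest.output oc compiler_hash;   (* header: identifies the producing build *)
  Marshal.to_channel oc decs [];
  close_out oc

let load path : string list option =
  let ic = open_in_bin path in
  let h = Digest.input ic in
  let r =
    if h = compiler_hash then Some (Marshal.from_channel ic : string list)
    else None                       (* stale cache: fall back to recompiling *)
  in
  close_in ic; r

let () =
  let path = Filename.temp_file "lib" ".moic" in
  save path [ "dec1"; "dec2" ];
  assert (load path = Some [ "dec1"; "dec2" ]);
  Sys.remove path;
  print_endline "moic roundtrip ok"
```

A `None` result is treated like a cache miss, so a stale cache degrades to a cold compile rather than an error.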
Where the remaining time goes (warm, 0.68s): codegen is now the bottleneck, since it still walks all declarations (cached + main). Eliminating this would require separate Wasm compilation + linking, which is a fundamentally larger change.
## Overall conclusion
~350 lines of new code for a 2.3x compile-mode speedup. The IR caching approach works but hits diminishing returns: the next meaningful improvement requires Wasm-level separate compilation, not more caching. This POC uses `Marshal` (tied to the exact compiler binary) and has no cache eviction, so it is not production-ready as-is.

## Test plan
- `test/run/moi-cache.mo` — roundtrip: counter, types with generics, recursive tree types
- `test/fail/moi-cache-error.mo` — type errors still reported correctly with cache