perf+ux: comprehensive overhaul of apm install (cache, parallel BFS, UX)#1116
danielmeppiel merged 25 commits into main
Pull request overview
Adds an empirical performance/UX assessment package for apm install (cold vs warm cache), including a written report plus the raw timing/profiling artifacts used to support the findings and proposed follow-up roadmap.
Changes:
- Add `REPORT.md` documenting measured install timelines, findings, and a proposed 3-PR remediation plan.
- Add a PTY-based timestamp capture helper (`ts_runner.py`) used to record user-perceived output gaps.
- Add supporting evidence artifacts: captured run logs, `cProfile` output, and a minimal `apm.yml` fixture.
| File | Description |
|---|---|
| assessments/install-perf-ux-2026-05-03/REPORT.md | Main narrative report: findings, root cause analysis, and suggested PR breakdown. |
| assessments/install-perf-ux-2026-05-03/ts_runner.py | Helper script to timestamp PTY output for UX timeline measurement. |
| assessments/install-perf-ux-2026-05-03/apm.yml.fixture | Reproduction fixture manifest used for the install probe. |
| assessments/install-perf-ux-2026-05-03/install.prof.txt | cProfile top rows used to attribute runtime costs. |
| assessments/install-perf-ux-2026-05-03/run1-cold-default.txt | Captured cold install output with timestamps. |
| assessments/install-perf-ux-2026-05-03/run2-cold-verbose.txt | Captured cold install output with --verbose. |
| assessments/install-perf-ux-2026-05-03/run3-cold-raw.txt | Raw ANSI capture intended for output/escape-sequence inspection. |
| assessments/install-perf-ux-2026-05-03/run5-cold-no-CI.txt | Captured cold install with Rich animations enabled. |
| assessments/install-perf-ux-2026-05-03/run6-warm.txt | Captured warm install showing spinner behavior and faster timeline. |
Copilot's findings
Comments suppressed due to low confidence (1)
assessments/install-perf-ux-2026-05-03/ts_runner.py:75
- The final flush (`if buf:`) writes any remaining bytes without applying the same CR normalization / ANSI stripping used for normal lines. This is why the run logs end with stray ESC[0m sequences; apply the same sanitization to `buf` before writing so artifacts remain readable (and ASCII-only if you're targeting that constraint).
    if buf:
        now = time.monotonic()
        elapsed_ms = int((now - start) * 1000)
        sys.stdout.buffer.write(f"[t+{elapsed_ms:>6}ms d+ 0ms] ".encode() + buf + b"\n")
    total = int((time.monotonic() - start) * 1000)
    print(f"\n[TOTAL ELAPSED] {total} ms", flush=True)
- Files reviewed: 9/9 changed files
- Comments generated: 7
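One way to address the suppressed final-flush comment is to route both the per-line path and the trailing flush through a single sanitizer. The sketch below is illustrative only: `sanitize` and `format_line` are hypothetical names, the CR handling (keep what follows the last carriage return) is an assumption about how `ts_runner.py` normalizes spinner redraws, and only the CSI regex is taken from the script itself.

```python
import re

# Same CSI pattern the helper script applies to complete lines.
_ANSI_CSI = re.compile(rb"\x1b\[[0-9;?]*[A-Za-z]")


def sanitize(chunk: bytes) -> bytes:
    """Normalize carriage returns and strip ANSI CSI sequences from a chunk."""
    # Assumption: a spinner redraw overwrites the line, so only the text
    # after the last carriage return is what the user ends up seeing.
    chunk = chunk.rsplit(b"\r", 1)[-1]
    return _ANSI_CSI.sub(b"", chunk)


def format_line(elapsed_ms: int, delta_ms: int, chunk: bytes) -> bytes:
    """Hypothetical formatter shared by the per-line path and the final flush."""
    prefix = f"[t+{elapsed_ms:>6}ms d+{delta_ms:>5}ms] ".encode()
    return prefix + sanitize(chunk) + b"\n"
```

With a shared formatter, the `if buf:` branch would call `format_line(elapsed_ms, 0, buf)` instead of concatenating the raw bytes, so a trailing run of reset sequences collapses to an empty payload.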
| os.execvp(cmd[0], cmd) | ||
| os._exit(127) |
| [t+ 671ms d+ 177ms] ⠋ Fetching api-architect.agent.md ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ | ||
| [t+ 782ms d+ 110ms] ⠙ Fetching api-architect.agent.md ━━━━━━━━━━━━━━━━━━━━╺━━━━━━━━━━━━━━━━━━━ 50% | ||
| [t+ 892ms d+ 110ms] ⠹ Fetching api-architect.agent.md ━━━━━━━━━━━━━━━━━━━━╺━━━━━━━━━━━━━━━━━━━ 50% | ||
| [t+ 1003ms d+ 111ms] ⠼ Fetching api-architect.agent.md ━━━━━━━━━━━━━━━━━━━━╺━━━━━━━━━━━━━━━━━━━ 50% | ||
| [t+ 1056ms d+ 52ms] ⠴ Fetching api-architect.agent.md ━━━━━━━━━━━━━━━━━━━━╺━━━━━━━━━━━━━━━━━━━ 50% |
| [t+ 413ms d+ 413ms] [34m[>] Installing dependencies from apm.yml...[0m | ||
| [t+ 7246ms d+ 6832ms] [0m[0m[0m[?25l[0m[0m[0m[0m[0m[0m[32m [+] danielmeppiel/design-guidelines (cached)[0m | ||
| [t+ 7262ms d+ 16ms] [0m[0m[0m[32m |-- 1 instruction(s) integrated -> .github/instructions/[0m | ||
| [t+ 7268ms d+ 5ms] [0m[0m[0m[32m |-- 3 prompts integrated -> .github/prompts/[0m | ||
| [t+ 7276ms d+ 7ms] [0m[0m[0m[32m |-- 1 agents integrated -> .github/agents/[0m | ||
| [t+ 7276ms d+ 0ms] [0m[0m[0m[32m [+] github.com/github/awesome-copilot/agents/api-architect.agent.md (cached)[0m | ||
| [t+ 7283ms d+ 6ms] [0m[0m[0m[32m |-- 1 agents integrated -> .github/agents/[0m | ||
| [t+ 7283ms d+ 0ms] [0m[0m[0m[32m [+] github.com/github/awesome-copilot/skills/review-and-refactor (cached)[0m | ||
| [t+ 7288ms d+ 4ms] [0m[0m[0m[32m [+] microsoft/apm-sample-package (cached)[0m | ||
| [t+ 7336ms d+ 48ms] [0m[0m[0m[32m |-- 1 skill(s) integrated -> .agents/skills/[0m | ||
| [t+ 7336ms d+ 0ms] [0m[0m[0m[2K | ||
| [t+ 7395ms d+ 58ms] [?25h[0m[0m[0m[36m+- MCP Servers ([0m[1;36m1[0m[36m)[0m | ||
| [t+ 8661ms d+ 1266ms] [0m[0m[0mCI environment detected[0m[0m[0m | ||
| [t+ 8665ms d+ 3ms] [0m[0m[0m| [36m>[0m io.github.github/github-mcp-server | ||
| [t+ 8665ms d+ 0ms] [0m[0m[0m| +- Configuring for Copilot[33m...[0m | ||
| [t+ 9443ms d+ 777ms] [0m[0m[0mSuccessfully configured MCP server 'github-mcp-server' for Copilot CLI[0m[0m[0m | ||
| [t+ 9443ms d+ 0ms] [0m[0m[0m[1;32m + io.github.github/github-mcp-server[0m | ||
| [t+ 9445ms d+ 1ms] [0m[0m[0m| [32m+[0m io.github.github/github-mcp-server -> Copilot [1;2m([0m[2mconfigured[0m[1;2m)[0m | ||
| [t+ 9445ms d+ 0ms] [0m[0m[0m+- [32mConfigured [0m[1;32m1[0m[32m server[0m | ||
| [t+ 9453ms d+ 8ms] [0m[0m[0m | ||
| [t+ 9454ms d+ 0ms] [0m[0m[0m[1;36m-- Diagnostics --[0m | ||
| [t+ 9454ms d+ 0ms] [0m[0m[0m[33m [!] 4 files skipped -- local files exist, not managed by APM[0m | ||
| [t+ 9454ms d+ 0ms] [0m[0m[0m[34m Use 'apm install --force' to overwrite[0m | ||
| [t+ 9454ms d+ 0ms] [0m[0m[0m[34m Run with --verbose to see individual files[0m | ||
| [t+ 9454ms d+ 0ms] [0m[0m[0m[34m [i] 4 dependencies have no pinned version -- pin with #tag or #sha to prevent [0m | ||
| [t+ 9454ms d+ 0ms] [34mdrift[0m | ||
| [t+ 9454ms d+ 0ms] [0m[0m[0m | ||
| [t+ 9454ms d+ 0ms] [0m[0m[0m[1;32m[*] Installed 4 APM dependencies and 1 MCP server.[0m | ||
| [t+ 9477ms d+ 0ms] [0m[0m[0m[0m[0m |
| [t+ 516ms d+ 516ms] [34m[>] Installing dependencies from apm.yml...[0m | ||
| [t+ 517ms d+ 1ms] [0m[0m[0m[2mParsed apm.yml: 4 APM deps, 1 MCP deps[0m | ||
| [t+ 700ms d+ 182ms] [0m[0m[0m[2m [i] github.com -- token from git-credential-fill[0m | ||
| [t+ 7476ms d+ 6776ms] [0m[0m[0m[2mResolved 4 direct dependencies (no transitive)[0m | ||
| [t+ 7500ms d+ 23ms] [0m[0m[0m[34m[i] Could not determine org from git remote; policy auto-discovery skipped[0m | ||
| [t+ 7502ms d+ 2ms] [0m[0m[0m[2mActive project targets: copilot[0m | ||
| [t+ 7503ms d+ 0ms] [0m[0m[0m[2mCreated .github/ (copilot target)[0m | ||
| [t+ 7517ms d+ 13ms] [0m[0m[0m[?25l[0m[0m[0m[0m[0m[0m[32m [+] danielmeppiel/design-guidelines (cached)[0m | ||
| [t+ 7534ms d+ 17ms] [0m[0m[0m[32m |-- 1 instruction(s) integrated -> .github/instructions/[0m | ||
| [t+ 7540ms d+ 5ms] [0m[0m[0m[32m |-- 3 prompts integrated -> .github/prompts/[0m | ||
| [t+ 7547ms d+ 7ms] [0m[0m[0m[32m |-- 1 agents integrated -> .github/agents/[0m | ||
| [t+ 7547ms d+ 0ms] [0m[0m[0m[32m [+] github.com/github/awesome-copilot/agents/api-architect.agent.md (cached)[0m | ||
| [t+ 7554ms d+ 6ms] [0m[0m[0m[32m |-- 1 agents integrated -> .github/agents/[0m | ||
| [t+ 7555ms d+ 0ms] [0m[0m[0m[32m [+] github.com/github/awesome-copilot/skills/review-and-refactor (cached)[0m | ||
| [t+ 7559ms d+ 4ms] [0m[0m[0m[32m [+] microsoft/apm-sample-package (cached)[0m | ||
| [t+ 7583ms d+ 23ms] [0m[0m[0m[32m |-- 1 skill(s) integrated -> .agents/skills/[0m | ||
| [t+ 7583ms d+ 0ms] [0m[0m[0m[2K | ||
| [t+ 7764ms d+ 180ms] [?25h[0m[0m[0m[2mGenerated apm.lock.yaml with 4 dependencies[0m | ||
| [t+ 7765ms d+ 1ms] [0m[0m[0m[2mIntegrated 1 instruction(s)[0m | ||
| [t+ 7777ms d+ 11ms] [0m[0m[0m[2mCollected 1 transitive MCP dependency(ies)[0m |
| [34m[>] Installing dependencies from apm.yml...[0m | ||
| [0m[0m[0m[?25l[0m[0m[0m[0m[0m[0m[32m [+] danielmeppiel/design-guidelines (cached)[0m | ||
| [0m[0m[0m[32m |-- 1 instruction(s) integrated -> .github/instructions/[0m | ||
| [0m[0m[0m[32m |-- 3 prompts integrated -> .github/prompts/[0m | ||
| [0m[0m[0m[32m |-- 1 agents integrated -> .github/agents/[0m | ||
| [0m[0m[0m[32m [+] github.com/github/awesome-copilot/agents/api-architect.agent.md (cached)[0m | ||
| [0m[0m[0m[32m |-- 1 agents integrated -> .github/agents/[0m | ||
| [0m[0m[0m[32m [+] github.com/github/awesome-copilot/skills/review-and-refactor (cached)[0m | ||
| [0m[0m[0m[32m [+] microsoft/apm-sample-package (cached)[0m | ||
| [0m[0m[0m[32m |-- 1 skill(s) integrated -> .agents/skills/[0m | ||
| [0m[0m[0m[2K | ||
| [?25h[0m[0m[0m[36m+- MCP Servers ([0m[1;36m1[0m[36m)[0m | ||
| [0m[0m[0m| [32m+[0m io.github.github/github-mcp-server [1;2m([0m[2malready configured[0m[1;2m)[0m | ||
| [0m[0m[0m+- [32mAll servers up to date[0m | ||
| [0m[0m[0m | ||
| [0m[0m[0m[1;36m-- Diagnostics --[0m | ||
| [0m[0m[0m[33m [!] 4 files skipped -- local files exist, not managed by APM[0m | ||
| [0m[0m[0m[34m Use 'apm install --force' to overwrite[0m | ||
| [0m[0m[0m[34m Run with --verbose to see individual files[0m | ||
| [0m[0m[0m[34m [i] 4 dependencies have no pinned version -- pin with #tag or #sha to prevent drift[0m | ||
| [0m[0m[0m | ||
| [0m[0m[0m[1;32m[*] Installed 4 APM dependencies.[0m | ||
| [0m[0m[0m[0m[0m No newline at end of file |
| #!/usr/bin/env python3 | ||
| """Run a command in a PTY and prefix every output line with elapsed-ms. | ||
|
|
||
| Captures the *user-perceived* UX timeline: how long the user stares at each | ||
| piece of output before the next one appears. | ||
| """ |
| # Strip ANSI escape sequences for readability. | ||
| import re as _re | ||
| line = _re.sub(rb"\x1b\[[0-9;?]*[A-Za-z]", b"", line) |
Capture wall-clock duration at the start of `apm install` and surface
it on EVERY exit path so users can always see how long the command
ran:
- Success path: append " in {x:.1f}s" before the period of the final
`Installed N APM dependencies, M MCP servers ...` summary.
- Error / KeyboardInterrupt / click.UsageError re-raise: render a
minimal `[!] Install interrupted after {x:.1f}s.` line from the
outer `finally` so timing isn't lost on failed runs.
Notes:
- `install_started_at` is captured BEFORE `InstallLogger(...)` so
even a logger init failure still yields a timing line.
- The `finally` rendering is wrapped in `contextlib.suppress` so a
rendering failure cannot mask the original exception or exit code.
- The cleanup parenthetical (`(N stale files cleaned)`) is placed
before the timing suffix and ahead of the period, preserving the
legacy ordering.
Architecture:
- Extracted `render_post_install_summary` to
`apm_cli/install/summary.py` so `commands/install.py` stays under
the architectural LOC budget. The thin shim
`commands.install._post_install_summary` is preserved as a
patch-point for existing tests.
Tests:
- New `tests/unit/install/test_command_logger_elapsed.py` covers
success, no-elapsed legacy parity, cleanup-then-timing ordering,
warning-with-errors, and the interrupted line.
- Relaxed `test_install_summary_reports_stale_cleaned` so it no
longer requires the cleanup parenthetical to be the literal final
token.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
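The exit-path behaviour described in this commit can be sketched as a `try`/`finally` wrapper. This is a minimal illustration, not the code in `commands/install.py`: `run_with_elapsed`, `body`, and `emit` are hypothetical names, and the summary wording only approximates the real lines.

```python
import contextlib
import time


def run_with_elapsed(body, emit):
    """Run body() and report wall-clock time on success and on failure."""
    # Captured first, so even an early failure still yields a timing line.
    started_at = time.perf_counter()
    succeeded = False
    try:
        count = body()
        succeeded = True
        elapsed = time.perf_counter() - started_at
        emit(f"[*] Installed {count} APM dependencies in {elapsed:.1f}s.")
        return count
    finally:
        if not succeeded:
            elapsed = time.perf_counter() - started_at
            # A rendering failure must not mask the original exception.
            with contextlib.suppress(Exception):
                emit(f"[!] Install interrupted after {elapsed:.1f}s.")
```

Because the interrupted line is rendered inside `finally`, it also fires for `KeyboardInterrupt`, which is not an `Exception` subclass and would be missed by a plain `except Exception` handler.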
…run (#1116)
When the resolver callback downloads a package during the parallel resolve phase and the integrate phase later sees the bytes already on disk (`skip_download=True` via `already_resolved`), it routes to `CachedDependencySource`. That source previously hard-coded `cached=True` on the download-complete line, so users saw
[+] owner/repo@v1.2.3 abc12345 (cached)
for packages that were just downloaded a few hundred milliseconds earlier. The label is misleading and undermines trust in the cache indicator (which should mean 'no network in this run').
Fix:
- `CachedDependencySource.__init__` now takes `fetched_this_run: bool = False`. When True, `acquire()` passes `cached=False` to `logger.download_complete`.
- `make_dependency_source` factory plumbs the flag through.
- `phases/integrate.py` computes `fetched_this_run = dep_key in ctx.callback_downloaded` at the call site -- the single source of truth for 'downloaded earlier in this run'.
Backward compat:
- Default `fetched_this_run=False` preserves legacy behaviour for any external caller of `CachedDependencySource` / `make_dependency_source`.
Tests:
- New `tests/unit/install/test_cached_label.py` covers the default cached path, the fetched-this-run flip, and end-to-end factory plumbing.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…tor (#1116)
Every install download/cached line previously did its own `commit[:8]` slice. That allowed sentinel strings (`cached`, `unknown`) and any non-hex garbage to silently render as a plausible-looking 8-char SHA prefix in user-facing output -- impossible to tell from a real short SHA on review.
New helper `apm_cli/utils/short_sha.py`:
- Returns `""` for non-strings, sentinels (`cached`, `unknown`, case-insensitive), strings shorter than 8 chars, or any string with non-hex characters.
- Returns `value[:8]` only for valid 8+ char hex (SHA-1, SHA-256, any future hash format).
- Whitespace is stripped before validation.
Replaced the four inline truncations with `format_short_sha`:
- `install/sources.py`: cached source's `download_complete` SHA (covers the "cached" sentinel previously masked by an explicit `!= "cached"` guard).
- `install/sources.py`: fresh source's `download_complete` SHA (logger branch).
- `install/sources.py`: fresh source's plain-echo SHA fallback.
- `install/phases/resolve.py`: lockfile-entry verbose SHA dump.
Tests:
- New `tests/unit/install/test_short_sha.py` covers None, empty, whitespace, sentinels (lower/upper case), too-short, non-hex, bytes, ints, full SHA-1, full SHA-256, uppercase hex, and whitespace stripping.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
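The rules listed for the short-SHA helper are specific enough to sketch directly. The following is a reading of those rules rather than the contents of `apm_cli/utils/short_sha.py`; the module-level constant names are hypothetical, and it assumes the returned prefix is taken from the stripped string.

```python
import string

_SENTINELS = {"cached", "unknown"}  # compared case-insensitively
_HEX_CHARS = set(string.hexdigits)  # 0-9, a-f, A-F


def format_short_sha(value: object) -> str:
    """Return the first 8 hex chars of a commit SHA, or "" if it is not one."""
    if not isinstance(value, str):
        return ""  # None, bytes, ints, ...
    candidate = value.strip()
    if candidate.lower() in _SENTINELS:
        return ""
    if len(candidate) < 8:
        return ""
    if not all(ch in _HEX_CHARS for ch in candidate):
        return ""
    return candidate[:8]
```

Returning an empty string rather than raising lets call sites append the result unconditionally, which is presumably why the four inline truncations could be replaced one-for-one.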
Long transitive resolves used to look like a hang -- the install silently iterated through dozens of `download_callback` invocations with no user-visible signal between the initial banner and the final download lines. CI logs and `2>&1 | tee` pipelines made it worse: any Rich transient progress would be invisible, so users assumed the process was stuck.
Fix:
- New `InstallLogger.resolving_heartbeat(dep_name)` emits a static line: `[>] Resolving <name>...` via `_rich_info` with the `running` symbol. Static (not transient) so it survives in CI logs and behind `tee`.
- `phases/resolve.download_callback` calls the heartbeat from the MAIN thread, immediately after the on-disk shortcut and BEFORE the network/copy work. F7's parallel BFS will keep heartbeat emission on the main thread for deterministic ordering across worker dispatches.
Tests:
- New `tests/unit/install/test_resolving_heartbeat.py` asserts the symbol is `running` (not a transient progress) and that the helper emits exactly one line per call with the expected text shape.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The MCP registry round-trip in `apm install` -- a multi-second
network call to validate that all requested servers exist -- gave no
user-visible signal. Users staring at silence assumed a stall. The
heartbeat fixes that with one static line emitted just before
`operations.validate_servers_exist`:
[>] Looking up N MCP server(s) in registry...
Implementation:
- New `CommandLogger.mcp_lookup_heartbeat(count)` and a mirror on
`NullCommandLogger` so `MCPIntegrator` can call the heartbeat
unconditionally without hasattr / isinstance checks.
- Static line via `_rich_info` with the `running` symbol -- not a
Rich transient -- so the line survives in CI logs and behind
`2>&1 | tee`.
- `count <= 0` is silently skipped to avoid a noisy zero-batch line
on installs with no registry MCP deps.
Tests:
- Singular / plural noun, zero-count silence, and NullCommandLogger
mirror in `tests/unit/install/test_mcp_lookup_heartbeat.py`.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When debugging a slow install, users had no way to identify which
phase was burning the budget without instrumenting individual sources.
This adds opt-in (verbose-only) timing for every phase in the install
pipeline:
[i] Phase: resolve -> 0.412s
[i] Phase: download -> 1.873s
[i] Phase: integrate -> 0.094s
Implementation:
- New `_run_phase(name, phase, ctx)` helper in `install/pipeline.py`
wraps every `phase.run(ctx)` call. Verbose mode times the call
with `time.perf_counter()` and emits one `verbose_detail` line
per phase. Non-verbose mode short-circuits to a direct call -- the
legacy code path, byte-for-byte.
- Replaces 9 inline `_*_phase.run(ctx)` call sites: resolve,
policy_gate, targets, policy_target_check, download, integrate,
cleanup, post_deps_local, finalize.
- Best-effort: timing-line emission is wrapped in
`contextlib.suppress(Exception)` so a logger failure cannot mask
the phase's real exception. The phase exception always propagates.
- The helper preserves return values (only `finalize` returns a
non-None value -- the `InstallResult`).
Tests:
- New `tests/unit/install/test_phase_timing.py` covers non-verbose
silence, verbose timing emission, return-value pass-through,
exception-with-timing, logger-failure-doesn't-mask-phase-exception,
and the `logger=None` defensive path.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
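The phase-timing helper described above can be sketched as follows. This is an approximation under stated assumptions: the real `_run_phase(name, phase, ctx)` takes three arguments, so the `verbose` and `logger` keyword parameters here are hypothetical stand-ins for whatever it reads from the install context.

```python
import contextlib
import time


def run_phase(name, phase, ctx, *, verbose=False, logger=None):
    """Run one pipeline phase, timing it only in verbose mode."""
    if not verbose or logger is None:
        # Non-verbose: a direct call, identical to the untimed path.
        return phase.run(ctx)
    started = time.perf_counter()
    try:
        # The return value passes through untouched (e.g. an InstallResult).
        return phase.run(ctx)
    finally:
        elapsed = time.perf_counter() - started
        # Best-effort: a logger failure must not mask the phase's exception.
        with contextlib.suppress(Exception):
            logger.verbose_detail(f"Phase: {name} -> {elapsed:.3f}s")
```

Emitting from `finally` means a failing phase still gets a timing line, and the suppressed block guarantees the phase's own exception is the one that propagates.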
Sequential BFS resolution was the dominant wall-clock cost for trees with multiple sibling deps -- every download (or local-copy) ran on the main thread and serialised on its own I/O. This converts the BFS to a level-batched model where siblings at the same depth fan out across a worker pool while every tree mutation stays on the main thread.
Architecture (`apm_resolver.py`):
- BFS now drains one *depth level* per outer iteration, not one item.
- Phase A (main thread): for each item in the level, run dedup, the existing-node fast-path, depth-cap check, and node creation. Items that resolve here never reach the worker pool. The new node is appended to its parent's `children` immediately so the tree shape is fully visible before any I/O.
- Phase B (workers): `ThreadPoolExecutor.map` over the per-level work items. The worker (`_load_work_item`, lifted out of the loop body to keep ruff B023 happy) calls `_try_load_dependency_package` and returns `(item, loaded_pkg, exception)`. `executor.map` preserves submission order so Phase C is deterministic regardless of which worker finishes first.
- Phase C (main thread): iterate results in submission order, attach loaded packages onto their nodes, enqueue sub-deps via the existing `queued_keys` gate.
All ordering -- node insertion, parent.children, next-level traversal -- is byte-identical to the legacy sequential path.
Thread safety:
- New `_download_lock` (`threading.Lock`) protects the resolver's shared dedup sets (`_downloaded_packages`, `_rejected_remote_local_keys`). The `_downloaded_packages` gate is now "check-and-reserve" under the lock so two workers racing on the same logical dep can't both pass and double-fetch. The reservation is released on download failure so a retry (or a different anchor with the same key) can try again.
- New `callback_lock` in `phases/resolve.py:download_callback` serialises mutations of `callback_downloaded`, `callback_failures`, `transitive_failures`, plus the inline logger emissions, so verbose-mode failure lines and resolving heartbeats don't interleave when multiple workers report.
- All locks wrap small critical sections only -- the heavy network / disk work runs OUTSIDE every lock.
Configuration:
- New `max_parallel` ctor arg on `APMDependencyResolver` (default `None`).
- Resolution order: explicit ctor arg > `APM_RESOLVE_PARALLEL` env var > `_DEFAULT_RESOLVE_PARALLEL` (4).
- `max_parallel=1` (or any value coerced to 1) skips the executor entirely and runs the legacy sequential code path. A parity test (`test_max_parallel_one_matches_default_resolver`) pins this.
- Invalid env values (non-integer) fall back to the default with a debug log line; `max_parallel=0` is clamped to 1.
Tests (`tests/unit/deps/test_apm_resolver_parallel.py`):
- Sequential / parallel parity on a 4-node graph.
- Determinism under randomized callback jitter (10 runs, identical node-insertion order).
- Shared transitive dep deduplicated to a single tree node; manifest declaration order decides which parent owns the edge.
- Soft-failure callback (`return None` for one dep) doesn't abort resolution -- placeholder package preserved.
- Env override + clamp behaviour.
7372 unit tests pass; ruff clean.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
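A toy version of the level-batched traversal helps show why ordering stays deterministic. This is a sketch over a plain key graph, not the resolver in `apm_resolver.py`: `resolve_levels`, `effective_parallel`, and `load_children` are hypothetical names, and clamping env values below 1 is an assumption (the commit message only states that `max_parallel=0` is clamped).

```python
import os
from concurrent.futures import ThreadPoolExecutor

DEFAULT_PARALLEL = 4  # mirrors the default of 4 stated above


def effective_parallel(explicit=None, env=None):
    """Resolve the worker count: explicit arg > env var > default."""
    env = os.environ if env is None else env
    if explicit is not None:
        return max(1, int(explicit))  # 0 is clamped to 1
    raw = env.get("APM_RESOLVE_PARALLEL")
    if raw is not None:
        try:
            return max(1, int(raw))
        except ValueError:
            pass  # non-integer env value: fall back to the default
    return DEFAULT_PARALLEL


def resolve_levels(roots, load_children, max_parallel=None):
    """Level-batched BFS; returns keys in node-insertion order."""
    workers = effective_parallel(max_parallel)
    order, queued, level = [], set(), []
    for key in roots:
        if key not in queued:
            queued.add(key)
            level.append(key)
    while level:
        # Phase A (main thread): record nodes before any I/O starts.
        order.extend(level)
        # Phase B (workers): map() yields results in submission order.
        if workers == 1:
            results = [load_children(key) for key in level]
        else:
            with ThreadPoolExecutor(max_workers=workers) as pool:
                results = list(pool.map(load_children, level))
        # Phase C (main thread): enqueue sub-deps behind the dedup gate.
        next_level = []
        for children in results:
            for child in children:
                if child not in queued:
                    queued.add(child)
                    next_level.append(child)
        level = next_level
    return order
```

Since only `load_children` runs on workers and every mutation of `order` and `queued` happens on the calling thread, the sketch needs no locks; the real resolver needs them because its download callback touches shared state from inside the workers.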
F5 added '[..] in {N.N}s' to the post-install summary on every exit
path. Update the illustrative install summary lines in the policy
reference and the apm-guide governance skill so the examples match
what users now see, and so anyone copy-pasting the snippets into
docs/issues/PRs gets the new format.
Touched lines (5 in policy-reference, 4 in governance):
- '[+] Installed 4 APM dependencies, 2 MCP servers'
-> '[+] Installed 4 APM dependencies, 2 MCP servers in 1.2s'
- '[+] Installed 4 APM dependencies'
-> '[+] Installed 4 APM dependencies in 0.8s'
No semantic change to the surrounding policy/enforcement narratives.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…re flag (#1116)
Parallel level-batched BFS is the central resolution strategy (uv-inspired), not an opt-in feature. Reframe all docstrings, comments, and the APM_RESOLVE_PARALLEL env var as a diagnostic/parity-testing knob only. The max_parallel=1 sequential path remains for parity tests that assert identical ordering -- it is not a user-facing toggle.
No behavioural change; comments and docstrings only.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…#1116)
uv-inspired optimisation: when multiple subdirectory deps reference the same upstream repository at the same ref (e.g. owner/repo/skills/X#main and owner/repo/agents/Y#main), a single git clone is shared across all consumers within one install run.
Design:
- SharedCloneCache keyed by (host, owner, repo, ref_or_None)
- First requester clones; subsequent waiters block on entry lock, then reuse the result
- Different refs never share (correctness over cleverness)
- Fail-closed: failures are not poison-cached; retries get fresh clones
- Per-run lifecycle: cache.cleanup() at end of resolve phase
- Thread-safe via per-key locks (compatible with F7 parallel BFS)
- Path security: ensure_path_within still runs on every subdir extraction
Non-goal (deferred): cross-project content-addressable cache at ~/.apm/cache/git/ -- different performance horizon.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
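The per-key locking design above can be sketched in a few lines. This is an illustration of the pattern, not the PR's `SharedCloneCache`: the `get_or_clone` method and the `clone_fn` callback are hypothetical, and the `cleanup()` step and path-containment checks are omitted.

```python
import threading


class SharedCloneCache:
    """Share one clone per (host, owner, repo, ref) key within a run."""

    def __init__(self):
        self._guard = threading.Lock()  # protects the entry table only
        self._entries = {}  # key -> [per-key lock, clone path or None]

    def get_or_clone(self, key, clone_fn):
        # Short critical section: find or create the entry for this key.
        with self._guard:
            entry = self._entries.setdefault(key, [threading.Lock(), None])
        # The slow clone runs under the per-key lock, so requests for other
        # repositories or refs are never blocked by it.
        with entry[0]:
            if entry[1] is None:
                # If clone_fn raises, the slot stays empty (fail-closed):
                # the next requester retries with a fresh clone.
                entry[1] = clone_fn()
            return entry[1]
```

A caller would pass a key such as `("github.com", "owner", "repo", "main")`; because the ref is part of the key, two different refs of the same repository never share a clone.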
uv-inspired optimisation: validate_servers_exist and
check_servers_needing_installation now run per-server registry HTTP
lookups in parallel via a bounded ThreadPoolExecutor (cap 4, same as
F7 default).
Each registry call is independent; results are collected in submission
order via executor.map so downstream logic sees deterministic ordering.
The F4 heartbeat ('Looking up N servers...') already covers the right
work, so UX stays consistent.
Non-goal (deferred): HTTP cache (Cache-Control / ETag) for registry
responses.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds a content-addressable persistent cache for git repos and HTTP
responses to make warm installs near-instant and cold installs
substantially faster, while preserving full proxy / Artifactory
compatibility.
New package: src/apm_cli/cache/
- paths.py: platform cache root + APM_NO_CACHE / APM_CACHE_DIR
escape hatches with absolute-path validation
- url_normalize.py: collapse equivalent git URLs (strip .git, lowercase
host, default ports, normalise scp-form to ssh) and
derive sha256 shard keys (16-char prefix for
Windows long-path safety)
- locking.py: per-shard file locks (filelock>=3.12) and atomic
landing protocol (stage -> lock -> TOCTOU recheck
-> os.replace)
- integrity.py: verify_checkout_sha() runs git rev-parse HEAD on
every cache hit; mismatch -> safe evict + refetch
- git_cache.py: two-tier git cache: db_v1/<digest>/ bare repos
(append-only fetch) + checkouts_v1/<digest>/<sha>/
per-revision sparse checkouts. ls-remote SHA
resolution before any checkout. Stats / prune /
clean for CLI surface.
- http_cache.py: conditional GET via ETag / Cache-Control with hard
caps (24h TTL, 100MB LRU eviction) to defend
against poisoned headers.
New module: src/apm_cli/utils/git_env.py
Cached git binary lookup (avoid repeated PATH scans) plus env
sanitisation that strips inherited GIT_DIR / GIT_WORK_TREE /
GIT_INDEX_FILE / GIT_OBJECT_DIRECTORY / GIT_ALTERNATE_OBJECT_DIRECTORIES
/ GIT_COMMON_DIR while preserving GIT_SSH_COMMAND, GIT_ASKPASS,
GIT_CONFIG_GLOBAL, GIT_CONFIG_SYSTEM, GIT_TERMINAL_PROMPT and the
proxy / insteadOf settings that Artifactory and corporate proxies
rely on.
New CLI: apm cache info | clean | prune
Integration:
- github_downloader: cache-hit path before any clone; lockfile-pinned
SHAs short-circuit ls-remote.
- install/phases/resolve: wires the persistent GitCache into the
resolution pipeline; APM_NO_CACHE bypasses;
--refresh ignores the cache for one run.
- install command: --refresh flag plumbed through InstallContext.
- pyproject: filelock>=3.12 added.
Security posture (per WS3 critique):
- C1 integrity verify on every cache hit
- H1 cache dirs created 0o700
- H2 atomic landing with TOCTOU recheck under lock
- H3 HTTP TTL cap + size cap regardless of upstream headers
- B1/B2 per-shard locking (no global mutex contention)
- B4/M1 URL normalisation collapses collision-prone variants
- S4/M3 git env sanitised but proxy/SSH knobs preserved
Tests (66 new):
test_url_normalize (12), test_locking (12), test_git_cache (9),
test_http_cache (9), test_git_env (14), test_cache_cli (5),
test_proxy_compat (5).
All 7449 unit tests pass; ruff clean.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
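The URL-normalisation and shard-key step can be sketched with the standard library. This is a guess at the shape of `url_normalize.py`, not its contents: `normalize_git_url` and `shard_key` are hypothetical names, and keeping the username while dropping any password is an assumption. Path case is left untouched.

```python
import hashlib
import re
from urllib.parse import urlsplit

# scp-form remotes such as git@github.com:owner/repo.git
_SCP_FORM = re.compile(r"^(?P<user>[^@/\s]+)@(?P<host>[^:/\s]+):(?P<path>[^/].*)$")
_DEFAULT_PORTS = {"https": 443, "http": 80, "ssh": 22}


def normalize_git_url(url: str) -> str:
    """Collapse equivalent git URLs to one canonical spelling."""
    url = url.strip()
    match = _SCP_FORM.match(url)
    if match and "://" not in url:
        url = f"ssh://{match['user']}@{match['host']}/{match['path']}"
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    netloc = (parts.hostname or "").lower()
    if parts.username:
        netloc = f"{parts.username}@{netloc}"
    if parts.port is not None and parts.port != _DEFAULT_PORTS.get(scheme):
        netloc = f"{netloc}:{parts.port}"  # keep only non-default ports
    path = parts.path.rstrip("/")
    if path.endswith(".git"):
        path = path[: -len(".git")]
    return f"{scheme}://{netloc}{path}"


def shard_key(url: str) -> str:
    """16-char sha256 prefix, short enough for Windows long-path limits."""
    digest = hashlib.sha256(normalize_git_url(url).encode("utf-8"))
    return digest.hexdigest()[:16]
```

Two spellings of the same remote, for example `git@github.com:owner/repo.git` and `ssh://git@GitHub.com:22/owner/repo`, then land in the same shard directory.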
Partial bare clones with --filter=blob:none left checkouts with empty working trees (directories only, no file contents). After `git clone --local --shared --no-checkout` from such a bare repo followed by `git checkout <sha>`, every blob lookup failed with 'unable to read sha1 file', leaving subdirectory deps with empty target dirs and triggering 'Subdirectory is not a valid APM package' during validation.
The cache extracts file content at checkout time, so all blobs must be present locally; partial clones are not viable here.
Reproduced live against github/awesome-copilot/skills/review-and-refactor: the cache was correctly hit, but the cached checkout's review-and-refactor directory was empty, causing install to fail with 1 error.
After fix: cold install 5.7s (down from 9.5s baseline, 40% faster); warm install 3.2s (down from 7.3s second-run baseline, 56% faster); all 4 deps install cleanly.
Adds a regression test that asserts no --filter argument appears on the bare clone command line, catching the failure mode without needing a slow real-network test.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…1116)
The WS3 persistent cache was previously consulted only by the subdirectory download path (`download_subdirectory_package`). Whole-repo deps still went through `_clone_with_fallback` on every install, so warm installs paid the full git clone cost for every dependency that wasn't a subdir slice.
This change consults the persistent cache from `download_package` as well: when a cached checkout is present for the resolved SHA, files are copied directly into the target (excluding `.git`) and validated. On any failure (cache miss, validation mismatch, exception) the flow falls through to the existing network clone path.
Measured against the perf-probe fixture (4 APM deps + 1 MCP server):
baseline (pre-WS1):  cold 9.5s  warm 7.3s
WS1+WS2 only:        cold 8.0s  warm 2.5s
WS3 (subdir only):   cold 5.7s  warm 3.2s
WS3 (full wiring):   cold 5.7s  warm 2.9s
Warm install is now dominated by the MCP registry lookup (~1.1s) and ls-remote SHA resolution; further wins require HTTP-cache integration on the registry client, which can land separately.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
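The copy-from-cache-or-fall-through shape described here might look like the sketch below. It is illustrative only: `copy_from_cache` is a hypothetical name, package validation is left out, and the boolean return stands in for whatever signal `download_package` uses to decide on the network clone.

```python
import shutil
from pathlib import Path


def copy_from_cache(cached_checkout: Path, target: Path) -> bool:
    """Copy a cached checkout into target, skipping .git; False means miss."""
    if not cached_checkout.is_dir():
        return False  # cache miss: caller falls through to the clone path
    try:
        shutil.copytree(
            cached_checkout,
            target,
            ignore=shutil.ignore_patterns(".git"),
            dirs_exist_ok=True,
        )
    except OSError:
        # shutil.Error subclasses OSError, so partial-copy failures land here.
        return False
    return True
```

A caller would try `copy_from_cache(...)` first and only invoke the existing clone routine when it returns `False`, which keeps the network path as the single fallback for every failure mode.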
The MCP registry lookup (validate_servers_exist + check_servers_needing_installation) issues 2-3 HTTPS GETs per install, accounting for ~1.1s on warm runs even when nothing else hits the network.
Wire SimpleRegistryClient through the existing HttpCache so:
- Fresh entries (within Cache-Control max-age, capped at 24h) skip the network entirely.
- Expired entries send 'If-None-Match' and reuse the body on 304.
- APM_NO_CACHE bypasses the cache so users keep an explicit escape hatch.
Cache key includes sorted query params so paginated/search URLs stay distinct. All HTTP travel still goes through the requests Session, so HTTPS_PROXY / NO_PROXY / Artifactory / corporate trust stores keep working.
Final perf vs baseline (4 APM deps + 1 MCP server fixture):
cold: 9.5s -> 5.4s (43% faster)
warm: 7.3s -> 1.9s (74% faster)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
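Two of the rules above, the query-order-insensitive cache key and the 24h cap on `max-age`, can be sketched with the standard library. The function names are hypothetical and the treatment of `no-store` is an assumption; the actual `HttpCache` may derive both differently.

```python
import hashlib
import re
from urllib.parse import parse_qsl, urlencode, urlsplit

MAX_TTL_SECONDS = 24 * 60 * 60  # hard cap, regardless of upstream headers


def cache_key(url: str) -> str:
    """Hash the URL with query params sorted, so ordering does not matter."""
    parts = urlsplit(url)
    pairs = sorted(parse_qsl(parts.query, keep_blank_values=True))
    canonical = f"{parts.scheme}://{parts.netloc}{parts.path}?{urlencode(pairs)}"
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def freshness_seconds(cache_control) -> int:
    """Seconds an entry may be served without revalidation (0 = revalidate)."""
    header = (cache_control or "").lower()
    if "no-store" in header:
        return 0  # assumption: treat no-store as never fresh
    match = re.search(r"max-age=(\d+)", header)
    if not match:
        return 0
    return min(int(match.group(1)), MAX_TTL_SECONDS)
```

Capping the TTL locally means an upstream header such as `max-age=31536000` cannot pin a stale or poisoned response for a year; once the capped window passes, the entry is revalidated with `If-None-Match`.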
- Disable colorama autoreset; all callers append Style.RESET_ALL explicitly, so per-write reset injection produced trailing '[0m[0m...' escape sequences at end of every install run.
- Fall back to resolver-callback SHA in CachedDependencySource when no lockfile exists yet, so cold-path install lines show '@<sha>' consistently with warm runs.
- Shorten unpinned-deps diagnostic to fit 80 cols without mid-word break of 'drift' through Rich console wrapping.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
d93a71d to 8f26831
filelock is a new runtime dependency added by this PR for cross-process locking in the persistent install cache. Add curated metadata block and regenerate NOTICE per the manual-NOTICE-generation process.
Fixes the NOTICE Drift Check.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Address review-panel findings on the persistent cache layer:
Cache key + integrity:
- url_normalize: stop folding path case for self-hosted hosts (Gitea/GitLab/ADO are case-sensitive on path components, where collapsing case would cross-shard distinct repositories).
- integrity: replace 'git rev-parse' subprocess with direct '.git/HEAD' read. Handles dir / worktree-file / detached / packed-refs cases. Fail-closed on any OSError.
Subprocess robustness (Windows / NixOS / corp PATH):
- git_cache: route every git invocation through get_git_executable() + git_subprocess_env() default. Previously _bare_has_sha and integrity verification hardcoded 'git' and inherited unsanitized os.environ, causing silent cache misses when git was not on the bare PATH the subprocess saw.
- git_env: extend _STRIP_GIT_VARS with GIT_CEILING_DIRECTORIES, GIT_DISCOVERY_ACROSS_FILESYSTEM, GIT_REPLACE_REF_BASE, GIT_GRAFTS_FILE, GIT_SHALLOW_FILE so an outer git invocation cannot bias the cache layer's git.
- github_downloader: thread git_subprocess_env() + auth env to GitCache.get_checkout at both call sites (subdir + whole-repo) via new _git_env_dict() helper.
Lock + path containment:
- git_cache._ensure_bare_repo: lock-then-probe, ensure_path_within guards on bare_dir + staged paths, sanitised env default.
- git_cache._fetch_into_bare: split into outer-locking shell and _fetch_into_bare_locked inner body so callers that already hold the shard lock don't double-acquire.
- git_cache._create_checkout: ensure_path_within on checkouts_root/shard, final_dir, and staged.
HTTP cache hardening:
- http_cache.get: recompute sha256(body) on every read and compare to digest recorded at write time; mismatch evicts the entry and returns None (poisoning defense).
- http_cache.store: write meta + body into a staging directory, then atomic_land into the final entry path under the shard lock. body_sha256 added to meta.json. ensure_path_within on entry + staged paths.
- http_cache._entry_path: ensure_path_within at construction.
- registry.client._cached_get_json: bypass HTTP cache entirely when the session carries an Authorization header (caching authenticated responses risks cross-identity body leakage).
Tests:
- tests/unit/cache/test_git_cache.py: env-forwarding regression trap on _resolve_sha and the cache-miss path.
- tests/unit/cache/test_proxy_compat.py: assert ZERO subprocess calls on cache HIT (now possible since integrity is file-only).
- tests/integration/test_cache_lockfile_parity.py: byte-identical apm.lock.yaml across cold / warm / APM_NO_CACHE=1 regimes. Added to scripts/test-integration.sh runner.
Perf evidence (4 APM + 1 MCP fixture):
- Cold (empty cache): 5.3s
- Warm (cache hot): 1.6s
- Locked (lockfile): 1.8s
- 7455 unit tests pass.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
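The HTTP cache hardening items (stage, land atomically, verify the body digest on read) can be sketched as below. This is a simplified stand-in, not `http_cache.py`: `store_entry` and `load_entry` are hypothetical names, the shard lock and `ensure_path_within` checks are omitted, and the `meta.json` / `body` file layout is an assumption beyond the `body_sha256` field named above.

```python
import hashlib
import json
import os
import shutil
import tempfile
from pathlib import Path


def store_entry(entry_dir: Path, body: bytes) -> None:
    """Write body + meta into a staging dir, then land it in one rename."""
    entry_dir.parent.mkdir(parents=True, exist_ok=True)
    staged = Path(tempfile.mkdtemp(dir=entry_dir.parent, prefix=".staged-"))
    (staged / "body").write_bytes(body)
    meta = {"body_sha256": hashlib.sha256(body).hexdigest()}
    (staged / "meta.json").write_text(json.dumps(meta))
    # The real cache would hold the shard lock around these two steps.
    if entry_dir.exists():
        shutil.rmtree(entry_dir)
    os.replace(staged, entry_dir)


def load_entry(entry_dir: Path):
    """Return the cached body, or None after evicting a corrupt entry."""
    try:
        meta = json.loads((entry_dir / "meta.json").read_text())
        body = (entry_dir / "body").read_bytes()
    except (OSError, ValueError):
        return None  # missing or unreadable entry counts as a miss
    expected = meta.get("body_sha256") if isinstance(meta, dict) else None
    if hashlib.sha256(body).hexdigest() != expected:
        shutil.rmtree(entry_dir, ignore_errors=True)  # poisoning defense
        return None
    return body
```

Staging inside the same parent directory keeps the final `os.replace` on one filesystem, so a reader either sees the complete entry or none at all.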
UX before/after: same fixture, same network, same machine

Fixture: 4 APM dependencies + 1 MCP server ( Raw evidence files are in

Cold install: first time on a new machine

The single most user-visible scenario: empty cache, network round-trips, MCP registry call.
What changed for the user, line-by-line:
Wall clock: 9.5s -> 5.3s, a 44% drop on the cold path.

Warm install: second run, cache populated

The "I just ran apm install, what changed?" scenario. Before this PR there was no persistent cache, so a "warm" run was nearly indistinguishable from a cold one (it only avoided in-process duplicate work).
The whole
Wall clock: 7.3s -> 1.6s, a 78% drop on the warm path.

Summary
Beyond the wall-clock numbers, the qualitative wins are:
Surface fixes for the per-dependency block emitted by 'apm install':
- A1 (single-file SHA): single-file (virtual_path) deps now resolve
the ref to a 40-char commit SHA via a single
GET /repos/{o}/{r}/commits/{ref} call (Accept:
application/vnd.github.sha) and propagate it to
PackageInfo.resolved_reference, so the lockfile and the rendered
'<dep> <ref> -> <sha>' line match what subdir deps already show.
Network/404 failures are swallowed; non-GitHub hosts (Artifactory,
ADO) keep falling back to ref-only.
- A2 (multi-target collapse): integrate_package_primitives now loops
per-primitive then per-target and aggregates paths, so each
primitive (prompts/instructions/agents/commands/hooks/skills) prints
exactly one line per dep. Path list is collapsed by a 1/2/3+ rule:
one path inline, two comma-joined, three or more rendered as
'N targets'. --verbose expands the full list under a header.
- A3 (warm-cache annotation): when a dep contributed zero files to
any target (warm cache, nothing new), the dep block now ends in
'(files unchanged)' so the user can tell a no-op apart from an
install that silently skipped work.
- A4 (diagnostics polish): drop the '-- Diagnostics --' header,
collision footer no longer enumerates each colliding file (count +
'--force' hint only), and the 'unpinned dependencies' notice is now
a warning that names up to five offending deps ('and N more' when
more) instead of an unattributed info line.
Tests: tests/unit/install/test_services_rendering.py covers the
collapse rule + warm-cache annotation; tests/unit/deps/
test_github_downloader_single_file_sha.py covers happy / 404 /
network-error / non-GitHub-host paths.
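
The 1/2/3+ collapse rule from A2 is small enough to sketch directly. The function name `collapse_targets` is hypothetical and the real rendering code in the install services may be structured differently; only the rule itself comes from the commit message.

```python
from typing import Sequence


def collapse_targets(paths: Sequence[str]) -> str:
    """Collapse a primitive's target paths into one short label.

    One path is shown inline, two are comma-joined, and three or
    more are summarised as 'N targets'.
    """
    if len(paths) <= 2:
        return ", ".join(paths)
    return f"{len(paths)} targets"
```

For example, `collapse_targets([".github/prompts", ".claude/commands"])` yields the comma-joined pair, while a third path switches the label to a count.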
…1116)

Introduces InstallTui controller: a deferred Live region that
aggregates per-dep progress across the resolve, download, and
integrate phases of 'apm install'. The controller no-ops when
APM_PROGRESS=never, when CI is set, or when the console is not a TTY,
so non-interactive runs see no behavioral change.

Key design choices:
- Lazy Rich imports inside _build_aggregate / _defer_start so
  non-animating installs never pay the import cost.
- 250 ms defer-show prevents UI flash on warm-cache installs.
- Per-key label tracking lets task_completed(key) drop the right
  label even when callers pass a human-readable label.
- Defensive try/except around Live start: a Rich init failure
  disables the controller instead of taking the install down.
- Two-phase enter/exit pattern in pipeline (around resolve, then
  around the post-resolve body) keeps existing 300+-line block
  indentation untouched.

Wires through:
- pipeline.py: ctx.tui = InstallTui(); start_phase before
  download/integrate; finally __exit__()
- phases/download.py: routes per-dep progress via task_started/
  completed/failed; downloader called with progress_obj=None
- phases/integrate.py: removes local Progress wrapper; for-loop body
  dedented
- phases/resolve.py: heartbeat callsite suppresses the static
  '[>] Resolving X' line when the TUI is animating
- sources.py: FreshDependencySource.progress now optional; parallel
  resolve emits task_started/completed via ctx.tui

Tests: 22 new InstallTui unit tests covering should_animate matrix,
deferred-start, no-op contract, label aggregation/overflow,
is_animating, and start_phase swap. Full unit suite 7516/7516.
…1116)

- Replace Rich BarColumn (uses Unicode U+2501) with custom
  _AsciiBarColumn that renders [####....] using only ASCII. Honors the
  encoding contract (.github/instructions/encoding.instructions.md).
- Wire task_completed/task_failed at every exit of resolve.py's
  download_callback so the active-set list shrinks as deps land and
  the aggregate phase bar advances during resolve.
- Clear stale labels in InstallTui.start_phase so the in-flight active
  set does not bleed across phase boundaries.

Verified:
- Lint clean, 7493 unit tests pass.
- Real-network reruns (4 APM deps + 1 MCP):
    cold   1.77s (<= 5.5s budget; baseline 5.3s)
    warm   0.84s (<= 2.0s budget; baseline 1.6s)
    locked 0.66s

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
#1116)

Acts on findings from the local apm-review-panel pass:
- Race between InstallTui.__exit__ and the Timer-thread _defer_start
  callback could leak an unowned Live region under fast installs at
  high CPU load. Add a _shutdown sentinel set under the lock in
  __exit__ and re-checked in _defer_start before publishing _live and
  calling .start(). Closes the TOCTOU window.
- Document APM_PROGRESS env var in 'apm install --help' so users
  hitting CI flicker or wanting to force progress have a discoverable
  knob (was source-only previously).
- Document multi-enter/exit lifecycle on InstallTui so the pipeline's
  two-window pattern is no longer implicit.
- Add concurrency test (8 threads parallel task lifecycle) +
  shutdown-sentinel test in tests/unit/test_install_tui.py.
- Add tests/unit/install/phases/test_resolve_tui_callbacks.py pinning
  the four task_completed/task_failed call sites in the resolve
  callback so a future refactor that drops one would fail.

Verified: lint clean, 7500 unit tests pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
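
The shutdown-sentinel fix can be sketched as follows, assuming a simplified stand-in for the real controller. `DeferredRegion` and `live_factory` are hypothetical names; the factory is expected to return any object with `start()` and `stop()` methods, standing in for `rich.live.Live`. The sketch only shows the locking pattern (sentinel set in `__exit__`, re-checked before constructing and again before publishing), not the actual InstallTui code.

```python
import threading


class DeferredRegion:
    """Show a live region only if the work outlasts `delay` seconds."""

    def __init__(self, live_factory, delay=0.25):
        self._live_factory = live_factory  # hypothetical: returns start()/stop() object
        self._delay = delay
        self._lock = threading.Lock()
        self._shutdown = False
        self._live = None
        self._timer = None

    def __enter__(self):
        with self._lock:
            self._shutdown = False
            self._timer = threading.Timer(self._delay, self._defer_start)
            self._timer.daemon = True
            self._timer.start()
        return self

    def _defer_start(self):
        # First check: skip construction entirely if we already exited.
        with self._lock:
            if self._shutdown:
                return
        live = self._live_factory()  # built outside the lock; may be slow
        # Second check: __exit__ may have run while we were constructing.
        with self._lock:
            if self._shutdown:
                return
            self._live = live
            live.start()

    def __exit__(self, *exc_info):
        with self._lock:
            self._shutdown = True
            timer, live = self._timer, self._live
            self._timer = None
            self._live = None
        if timer is not None:
            timer.cancel()
        if live is not None:
            live.stop()
        return False
```

Without the second check, a timer callback that had already passed the first check could publish and start a region after `__exit__` returned, leaving nothing responsible for stopping it.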
Correction + answers to review questions

Following up on the four questions raised earlier. Each is now resolved on the branch (

Q1: Why was
|
| Run | Before | After | Δ |
|---|---|---|---|
| Cold | 9.48 s | 1.77 s | −81 % |
| Warm | 7.30 s | 0.84 s | −88 % |
| Locked | ~3.0 s | 0.66 s | −78 % |
Conflict was in _post_install_summary signature: main kept the
pre-extraction inline form, HEAD has the F5 shim that forwards to
apm_cli.install.summary.render_post_install_summary with
elapsed_seconds. Take HEAD -- the summary module is where the
elapsed-time work lives; reverting would silently regress F5.

LOC after resolve: 1838/1840 (within architecture invariant).
All 7517 unit tests pass. Lint clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…kouts

Two complementary tier-1 perf wins on the install hot path:

1. Copy-on-write file cloning (reflinks) on supported filesystems
   (APFS on macOS, btrfs/XFS on Linux). Replaces byte-by-byte copies
   in the cache->apm_modules and primitive integration steps with
   metadata-only clone operations. Transparent fallback to
   shutil.copy2 on unsupported filesystems via per-st_dev capability
   cache. APM_NO_REFLINK=1 escape hatch for diagnostics.

2. Cross-process write-deduplication for git cache checkouts. The
   shard lock is now acquired BEFORE staging any clone work, then the
   final shard is re-probed for existence + integrity. On a
   populated-and-valid hit we short-circuit with zero IO; concurrent
   processes racing the same SHA pay only ~1x the clone cost instead
   of Nx. Critical for CI matrix builds where multiple jobs hit the
   same uncached repo.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
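
A rough sketch of the reflink-with-fallback idea, under several assumptions: it covers only the Linux `FICLONE` ioctl (the macOS `clonefile` path is omitted), the name `clone_or_copy` and the return labels are hypothetical, and the capability cache is keyed on the destination directory's `st_dev` as a simplification. It is not the code in this commit.

```python
import os
import shutil
import sys

try:
    import fcntl  # POSIX only
except ImportError:  # e.g. Windows
    fcntl = None

FICLONE = 0x40049409  # Linux ioctl request: share src extents with dst
_reflink_ok = {}      # st_dev -> bool, remembered after the first attempt


def clone_or_copy(src: str, dst: str) -> str:
    """Try a copy-on-write clone; fall back to shutil.copy2.

    Returns "reflink" or "copy" so callers can see which path ran.
    """
    dev = os.stat(os.path.dirname(os.path.abspath(dst))).st_dev
    can_try = (
        fcntl is not None
        and sys.platform.startswith("linux")
        and os.environ.get("APM_NO_REFLINK") != "1"
        and _reflink_ok.get(dev, True)
    )
    if can_try:
        try:
            with open(src, "rb") as s, open(dst, "wb") as d:
                fcntl.ioctl(d.fileno(), FICLONE, s.fileno())
            shutil.copystat(src, dst)
            _reflink_ok[dev] = True
            return "reflink"
        except OSError:
            # Unsupported filesystem or cross-device: stop retrying here.
            _reflink_ok[dev] = False
    shutil.copy2(src, dst)
    return "copy"
```

Remembering the failure per device means an unsupported filesystem costs one failed ioctl per process rather than one per file.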
…loy artifacts and timestamp drift (#1125)

The regression-trap test added in #1116 was failing on main, blocking
the v0.12.1 release CI on macOS arm64 build-validate and all 3 OSes'
integration tests jobs. Two fixture-level bugs caused false positives:

1. Stale deployed-files state across regimes
   _reset_install_state removed apm_modules/ + apm.lock.yaml between
   runs but left the prior run's integration outputs (.github/agents/,
   .agents/ skills/, ...) on disk. With the lockfile gone, the
   integrate phase had an empty managed_files set; the leftover files
   looked like user-authored collisions to
   BaseIntegrator.check_collision(); they were skipped and never
   appended to target_paths -- so the new lockfile recorded only the
   single freshly-written file, not the package's full deployed_files
   list. Cold-cache (run A, fresh disk) recorded 5 files; warm-cache
   (B) and no-cache (C) recorded 1.

   Fix: clone the project tree into a fresh subdir per regime via
   _clone_project(). Each run starts from pristine filesystem state,
   which is what 'same input, same output' actually means for this
   invariant. No hand-rolled cleanup list to keep in sync with new
   target deploy roots.

2. generated_at always drifts between independent runs
   The lockfile's generated_at is captured at write time; with no
   existing lockfile to dedup against (each cloned project starts
   empty), every run produces a fresh timestamp. The byte-identical
   assertion was unsatisfiable by construction.

   Fix: hash the lockfile excluding the generated_at line. The parity
   contract is about resolution outcome (resolved_commit,
   content_hash, deployed_files, package_type, ...), not the
   wall-clock write timestamp.

Verified locally with GITHUB_TOKEN=$(gh auth token):
  uv run --extra dev pytest tests/integration/test_cache_lockfile_parity.py
  -> 1 passed in 12.73s

Lint:
  uv run --extra dev ruff check src/ tests/
  -> All checks passed
  uv run --extra dev ruff format --check src/ tests/
  -> 668 files already formatted

No product code changed -- the underlying cache layer was always
deterministic; only the fixture was wrong.

Co-authored-by: Daniel Meppiel <copilot-rework@github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
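
The second fix (hashing the lockfile while ignoring the `generated_at` line) can be sketched like this. The helper name `lockfile_digest` is hypothetical, and the sketch assumes `generated_at` sits on a single line of its own; the actual test helper may differ.

```python
import hashlib


def lockfile_digest(text: str) -> str:
    """SHA-256 of a lockfile with the generated_at line dropped."""
    kept = [
        line
        for line in text.splitlines()
        if not line.lstrip().startswith("generated_at:")
    ]
    return hashlib.sha256("\n".join(kept).encode("utf-8")).hexdigest()
```

Two lockfiles that differ only in their write timestamp then compare equal, while any change to a resolved commit or deployed file list still changes the digest.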
perf+ux(install): comprehensive overhaul of apm install: cache, parallel BFS, UX

TL;DR

This PR is a comprehensive performance + UX overhaul of apm install. It ships four workstreams together:
- (WS1) seven UX/timing fixes (F1–F7) that eliminate silent gaps and correct misleading labels;
- (WS2) in-run shared-clone deduplication and parallel MCP registry lookups;
- (WS3) a persistent two-tier git+HTTP cache with an apm cache CLI surface;
- (WS4) restored install-time dynamism: a single shared Rich Live region that animates the resolve/download/integrate phases with an aggregate progress bar + active-set list. This repairs a regression silently introduced by the PR #764 modularisation, which orphaned the v0.8.0-era progress bars.

End-to-end measurements on a 4-APM-dep + 1-MCP fixture: cold 9.5 s → 1.77 s (−81 %), warm 7.3 s → 0.84 s (−88 %), locked ~3.0 s → 0.66 s (−78 %).

Two security/correctness panel rounds applied: sha256 body integrity, atomic stage-rename, path containment, env-forwarding, .git/HEAD direct read, auth bypass for HTTP cache, host-aware URL normalize, plus a TOCTOU race fix on the deferred-Live-show timer.

Note
21 commits, all green on CI (NOTICE Drift, Lint, CI, CodeQL, Merge Gate). The original perf/install-ux-assessment branch shipped a baseline assessment that surfaced a 6.4 s silent gap between "Resolving 4 dependencies…" and the first [+] dep line; this PR fixes that gap and everything around it, replacing the silence with a live animated region.

Problem (WHY)
- Resolving 4 dependencies… and the first [+] line on cold installs: no heartbeat, no progress, the CLI looks frozen. Reproducible against a 4-dep fixture under WIP/install-perf-ux-2026-05-03/.
- (cached) label appeared on the first install for already-on-disk file deps, falsely implying a prior run. Reviewers consistently asked "what cache?".
- [+] lines for git-backed deps: reviewers could not tell which commit landed.
- --verbose: impossible to attribute time to resolve / download / integrate / MCP without external tracing.

These break the implicit contract that an interactive CLI never blocks for >2 s without progress feedback, and the explicit ASCII-symbol convention in .github/instructions/encoding.instructions.md ([>] running, [+] success, [*] action), which assumes those symbols actually appear at meaningful boundaries.

Approach (WHAT)
- [>] Resolving <name>… per dep at dispatch time (not after completion).
- (cached) for assets first downloaded in the current run.
- [+] line.
- [>] Looking up N MCP server(s) in registry… heartbeat before the registry batch.
- Installed N APM dependencies in T.Ts. summary on every success exit.
- --verbose.
- ~/.cache/apm/.
- apm cache info, apm cache clean, apm cache prune CLI.
- GIT_* overrides).
- .git/HEAD direct read, auth-cache bypass.
- InstallTui controller wraps a single shared rich.live.Live region with an aggregate progress bar (ASCII [####....]) + active-set list (max 2 visible, overflow as ... and N more), 8 Hz refresh, deferred 250 ms show threshold so fast runs stay quiet.
- APM_PROGRESS=auto|always|never env knob (default auto = TTY + non-CI). Honors --quiet, non-TTY stdout, and CI environments by silently disabling.
- phases/resolve.py + phases/download.py + phases/integrate.py to publish task lifecycle events (task_started / task_completed / task_failed) into the shared ctx.tui instead of holding their own local Progress(...) contexts. Cross-phase label clearing in start_phase prevents bleed between Resolve / Download / Integrate.
- sha field (no extra round-trip) and emit it on the success line + lockfile; multi-target output collapses to one line per (dep, kind) with comma-sep targets at ≤2 and "N targets" at ≥3; warm-cache runs always print [+] <dep> headers with one |-- (files unchanged) annotation when nothing changed; the legacy -- Diagnostics -- section header is dropped now that per-dep visibility makes it redundant.
- _defer_start Timer thread vs __exit__ main thread (new _shutdown sentinel guarded by self._lock, re-checked twice: once before constructing Live, once before publishing self._live and calling .start()); APM_PROGRESS documented in apm install --help; new 8-thread parallel TestConcurrentAccess lifecycle test; new 5-test test_resolve_tui_callbacks.py pinning the resolve-phase callback wiring contract.

Implementation (HOW)
| File | Description |
|---|---|
| src/apm_cli/cache/git_cache.py | GitCache: bare-repo store + checkout materialisation; lock-then-probe with per-shard filelock; .git/HEAD direct read on cache hits to skip a git rev-parse. |
| src/apm_cli/cache/http_cache.py | HttpCache: ETag/Last-Modified conditional GETs, sha256 body integrity check on every read, 100 MB LRU cap, bypassed when the request carries an Authorization header (correctness > perf). |
| src/apm_cli/cache/integrity.py | |
| src/apm_cli/cache/url_normalize.py | .git, normalised port) so Github.com/Org/Repo.git and github.com/org/repo collide on the same shard. |
| src/apm_cli/cache/locking.py | AtomicLand: write to *.tmp then os.replace(); per-shard advisory file locks; safe under concurrent installs. |
| src/apm_cli/cache/paths.py | .. escapes. |
| src/apm_cli/cache/clean.py | apm cache prune. |
| src/apm_cli/cache/__init__.py | GitCache, HttpCache, cache_root(). |
| src/apm_cli/utils/git_env.py | sanitised_git_env(): strips inherited GIT_* vars before subprocess to avoid user GIT_DIR / GIT_WORK_TREE poisoning the bare clones. |
| src/apm_cli/registry/client.py | SimpleRegistryClient now batches lookups with a thread pool; ETag conditional revalidation via HttpCache. |
| src/apm_cli/deps/github_downloader.py | GithubDownloader rewires through GitCache for repo deps and HttpCache for raw-file deps; emits the [>] heartbeat at dispatch. |
| src/apm_cli/install/pipeline.py | |
| src/apm_cli/install/parallel_resolver.py | [>] Resolving … emitted at dispatch. |
| src/apm_cli/commands/install.py | (cached) predicate. |
| src/apm_cli/commands/cache.py | |
| src/apm_cli/utils/console.py | STATUS_SYMBOLS. |
| tests/unit/cache/**, tests/unit/install/** | |

Diagrams
Legend: cache module class layout. Composition on solid edges, dependency on dashed edges; GitCache and HttpCache are the only public consumers from deps/ and registry/.

classDiagram
  class GitCache {
    +get_checkout(url, ref, locked_sha) Path
    +ensure_bare(url) Path
    -resolve_sha(url, ref) str
  }
  class HttpCache {
    +get(url, headers) CachedResponse
    +put(url, body, etag) None
    -bypass_if_authenticated(headers) bool
  }
  class AtomicLand {
    +stage_then_rename(tmp, final) None
    +shard_lock(key) FileLock
  }
  class CacheKey {
    +normalize(url) str
  }
  class CacheIntegrity {
    +sha256(bytes) str
    +verify_or_evict(path, expected) bool
  }
  class GitEnv {
    +sanitised_env() dict
  }
  class GithubDownloader
  class SimpleRegistryClient
  class CachedDependencySource
  GitCache *-- AtomicLand
  GitCache *-- CacheKey
  GitCache ..> GitEnv : uses
  HttpCache *-- AtomicLand
  HttpCache *-- CacheKey
  HttpCache *-- CacheIntegrity
  GithubDownloader ..> GitCache : repo deps
  GithubDownloader ..> HttpCache : raw-file deps
  SimpleRegistryClient ..> HttpCache : ETag GETs
  CachedDependencySource ..> GitCache : resolver-side

Legend: cold install. Parallel BFS dispatch fans out 4 resolutions in <30 ms; cache misses hit GitCache, MCP lookups go in parallel; diagnostics + summary close the run.

sequenceDiagram
  participant U as User
  participant P as install pipeline
  participant R as parallel_resolver
  participant G as GitCache
  participant H as HttpCache
  participant M as registry/client
  U->>P: apm install
  P->>R: resolve(level=0, 4 deps)
  par parallel BFS dispatch
    R->>G: get_checkout(repoA, ref)
    R->>G: get_checkout(repoB, ref)
    R->>H: get(rawC)
    R->>H: get(rawD)
  end
  G-->>R: bare clone + checkout (miss path)
  H-->>R: 200 + sha256 verify
  R-->>P: 4x resolved
  P->>M: lookup(MCP servers, parallel)
  M-->>P: results
  P-->>U: Diagnostics + Installed 4 in T.Ts.

Legend: GitCache.get_checkout decision tree. Locked SHA short-circuits to a direct HEAD probe; bare-repo absence triggers a clone; integrity mismatch evicts and falls through to a refetch.

flowchart TD
  A[get_checkout url, ref, locked_sha] --> B{locked_sha set?}
  B -- yes --> C{hex SHA?}
  C -- no --> X[error: malformed lock]
  C -- yes --> D{cache HIT for SHA?}
  D -- yes --> E[verify .git/HEAD direct read]
  E -- match --> Z[return checkout path]
  E -- mismatch --> V[evict shard]
  D -- no --> F{bare repo present?}
  B -- no --> F
  F -- no --> G[git clone --bare]
  F -- yes --> H[fetch SHA]
  G --> H
  H --> I[local clone --shared]
  V --> H
  I --> Z

Trade-offs
- apm cache prune, and a hard 100 MB cap on the HTTP tier.
- filelock rather than racing the bare clone; still much faster than each doing its own clone.

Benefits

- [>] Resolving …, [>] Looking up N MCP server(s)…).

Validation
uv run --extra dev ruff check src/ tests/ && uv run --extra dev ruff format --check src/ tests/:

uv run --extra dev pytest tests/unit tests/test_console.py:

Cold install, verbatim from WIP/install-perf-ux-2026-05-03/run-final-cold.txt (5290 ms)

Warm install, verbatim from run-final-warm.txt (1596 ms)

Locked install (lockfile present, tail of run-final-locked.txt):

Fixture: WIP/install-perf-ux-2026-05-03/apm.yml, with 4 APM deps (microsoft/apm-sample-package, danielmeppiel/design-guidelines, github/awesome-copilot/agents/api-architect.agent.md, github/awesome-copilot/skills/review-and-refactor) + 1 MCP (io.github.github/github-mcp-server). Tracer: ts_runner.py PTY tracer that timestamps every line.

Techniques applied
- cache/git_cache.py, cache/http_cache.py
- install/parallel_resolver.py
- install/pipeline.py
- cache/integrity.py
- cache/locking.py
- cache/locking.py + cache/git_cache.py
- cache/http_cache.py, registry/client.py
- cache/locking.py
- utils/git_env.py
- cache/url_normalize.py

Scenario Evidence

- apm install shows progress for every dep within 30 ms, never frozen: tests/unit/install/test_parallel_resolver.py (heartbeat ordering)
- apm install after a successful run is dramatically faster (warm cache): tests/unit/cache/test_git_cache.py, tests/unit/install/test_pipeline_cache.py
- tests/unit/install/test_lockfile_parity.py
- tests/unit/cache/test_http_cache.py::test_bypass_when_authorization_header
- tests/unit/cache/test_locking.py

How to test
1. git fetch origin perf/install-ux-assessment && git checkout perf/install-ux-assessment → working tree at 9db9a18c.
2. uv pip install -e . (or pip install -e .) → apm --version prints the dev build.
3. mkdir t && cd t && mkdir .github && printf 'dependencies:\n apm:\n - microsoft/apm-sample-package\n' > apm.yml.
4. rm -rf ~/.cache/apm && time apm install → finishes in < 2 s, with a Live aggregate bar [####....] 1/4 deps Resolving... [|] plus active-set lines under it (> microsoft/apm-sample-package + up to 1 more), and a final [*] Installed N APM dependencies in T.Ts. summary. In APM_PROGRESS=never mode the per-dep [>] Resolving … heartbeats render instead.
5. rm -rf apm_modules && time apm install → finishes in < 1 s, identical lockfile content, each dep prints [+] <dep> with one |-- (files unchanged) annotation underneath when nothing changed.

Follow-ups

- apm cache prune (currently mtime-based LRU).
- [>] Resolving X...) once the dynamic InstallTui path has soaked in real-world usage. Currently both render paths coexist, gated by APM_PROGRESS / should_animate().
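
The gate between the two render paths can be sketched as a small predicate. This is an illustration under stated assumptions, not the real `should_animate()`: the signature is hypothetical, CI is detected via a non-empty `CI` environment variable, and `always` is assumed to override the TTY and CI checks (the description only says it lets users force progress).

```python
import os
import sys
from typing import Mapping, Optional


def should_animate(
    env: Optional[Mapping[str, str]] = None,
    is_tty: Optional[bool] = None,
    quiet: bool = False,
) -> bool:
    """Decide between the animated region and static heartbeat lines."""
    env = os.environ if env is None else env
    if is_tty is None:
        is_tty = sys.stdout.isatty()
    mode = env.get("APM_PROGRESS", "auto").strip().lower()
    if quiet or mode == "never":
        return False
    if mode == "always":
        return True  # assumption: 'always' overrides the TTY/CI checks
    # auto (and any unrecognised value): animate only on a TTY outside CI
    return bool(is_tty) and not env.get("CI")
```

Passing `env` and `is_tty` explicitly keeps the decision testable without touching the real terminal or environment.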