Skip to content

perf(react): optimize git tree materialization for diff#109

Merged
async3619 merged 4 commits intodevfrom
async3619/debug-slow-dev
Mar 28, 2026
Merged

perf(react): optimize git tree materialization for diff#109
async3619 merged 4 commits intodevfrom
async3619/debug-slow-dev

Conversation

@async3619
Copy link
Copy Markdown
Owner

Summary

  • Replace per-file git cat-file with a single git archive --format=tar | tar -x pipeline for snapshot materialization (~60x faster for large repos)
  • Add CoW (copy-on-write) clone strategy: materialize the after tree once, then APFS cp -c clone it and overlay only changed files for the before tree
  • Use fire-and-forget rm -rf via detached child processes instead of blocking fs.rmSync for snapshot cleanup

Related Issue

Closes #

Validation

  • pnpm run check
  • Commits follow Conventional Commits
  • Branch name follows codex/issue-<number>-<slug>
  • This PR will be Squash Merged into dev

Notes

  • Tested on a real project (~2min → ~1.7s for --diff HEAD~3...HEAD)
  • CoW clone falls back to full git archive on non-APFS/non-reflink filesystems

Replace per-file git cat-file with git archive|tar pipeline, add CoW
clone strategy for the second snapshot, and use async cleanup via
detached rm processes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@qodo-code-review
Copy link
Copy Markdown

Review Summary by Qodo

Optimize git tree materialization for diff analysis with CoW cloning

✨ Enhancement

Grey Divider

Walkthroughs

Description
• Replace per-file git cat-file with git archive | tar pipeline for ~60x faster snapshot
  materialization
• Add copy-on-write clone strategy to materialize after tree once, then overlay only changed files
• Use async fire-and-forget rm -rf via detached child processes instead of blocking cleanup
• Consolidate graph loading logic into unified loadBothGraphs function with optimized snapshot
  handling
Diagram
flowchart LR
  A["Git Trees<br/>beforeTree, afterTree"] -->|"git archive<br/>pipeline"| B["Materialize<br/>afterSnap"]
  B -->|"CoW clone<br/>cp -c/-a"| C["Clone to<br/>beforeSnap"]
  A -->|"git diff<br/>--name-status"| D["Identify<br/>Changed Files"]
  D -->|"Overlay<br/>modifications"| C
  C -->|"Analyze"| E["React Graphs"]
  B -->|"Analyze"| E
  E -->|"async rm -rf<br/>detached"| F["Cleanup<br/>Snapshots"]
Loading

Grey Divider

File Changes

1. src/analyzers/react/diff.ts Performance optimization +131/-47

Optimize git snapshot materialization with CoW cloning

• Replaced per-file git cat-file calls with single git archive | tar pipeline for snapshot
 materialization
• Added materializeTwoGitTreeSnapshots function implementing copy-on-write clone strategy with
 fallback
• Implemented cloneSnapshotWithOverlay function using platform-specific CoW cloning (cp -cR on
 macOS, cp -a --reflink=auto on Linux) and selective file restoration
• Refactored graph loading into unified loadBothGraphs function that handles both working tree and
 git tree comparisons
• Changed snapshot cleanup from blocking fs.rmSync to async fire-and-forget `spawn('rm', ['-rf',
 ...])` with detached processes
• Updated imports to include execSync and spawn from node:child_process

src/analyzers/react/diff.ts


Grey Divider

Qodo Logo

@qodo-code-review
Copy link
Copy Markdown

qodo-code-review bot commented Mar 28, 2026

Code Review by Qodo

🐞 Bugs (2) 📘 Rule violations (0) 📎 Requirement gaps (0) 📐 Spec deviations (0)

Grey Divider


Action required

1. Spawned rm can crash🐞 Bug ⛯ Reliability
Description
loadBothGraphs() spawns rm -rf without an error handler; if the process cannot be spawned, Node
emits an unhandled error event which can terminate the CLI during cleanup. This also makes
snapshot cleanup depend on an external rm binary being available on PATH.
Code

src/analyzers/react/diff.ts[R185-187]

+    for (const dir of snapshotsToCleanup) {
+      spawn('rm', ['-rf', dir], { stdio: 'ignore', detached: true }).unref()
+    }
Evidence
The cleanup loop uses spawn('rm', ...) and immediately .unref()s it, but never attaches an
error listener; spawn failures surface via the ChildProcess error event which is fatal when
unhandled.

src/analyzers/react/diff.ts[154-188]
Best Practice: Node.js child_process / EventEmitter behavior

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Cleanup uses `spawn('rm', ['-rf', dir])` without handling ChildProcess errors. If `rm` cannot be spawned, the process can crash due to an unhandled `error` event.
## Issue Context
This runs in a `finally` block, so it triggers even when the analysis succeeds; a cleanup crash would be particularly confusing.
## Fix Focus Areas
- Replace the `spawn('rm', ...)` cleanup with a cross-platform Node cleanup that cannot crash the process (e.g., `fs.rm(dir, { recursive: true, force: true }, () => {})` or `void fs.promises.rm(...).catch(() => {})`).
- If you keep `spawn`, attach `child.on('error', () => {})` at minimum.
### Focus areas in code
- src/analyzers/react/diff.ts[154-188]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


2. Unix tools required for diff 🐞 Bug ⛯ Reliability
Description
Snapshot materialization and overlay now depend on external tar and cp commands (and a shell
pipe for git archive | tar -x), which will fail in environments where these tools/shell semantics
are unavailable. This can break foresthouse react --diff entirely even though the rest of the
analyzer only shells out to git.
Code

src/analyzers/react/diff.ts[R1008-1014]

+  execSync(
+    `git -C ${JSON.stringify(repositoryRoot)} archive --format=tar ${tree} | tar -x -C ${JSON.stringify(snapshotRoot)}`,
+    {
+      maxBuffer: GIT_EXEC_MAX_BUFFER,
+      stdio: ['ignore', 'pipe', 'pipe'],
+    },
+  )
Evidence
materializeGitTreeSnapshot invokes a shell pipeline with tar -x, and cloneSnapshotWithOverlay
also shells out to cp and tar; these are new hard runtime dependencies not otherwise used in the
codebase.

src/analyzers/react/diff.ts[1000-1016]
src/analyzers/react/diff.ts[1043-1113]
README.md[8-64]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The new snapshot approach requires external `tar` and `cp` commands and a shell pipeline. In environments where these commands are missing (or where shell piping is not available/behaves differently), `--diff` will fail.
## Issue Context
The rest of the analyzer uses `execFileSync('git', ...)` and Node filesystem APIs; this change introduces new runtime dependencies and a shell string command.
## Fix Focus Areas
- Avoid `execSync(<string with pipe>)`; prefer `execFileSync`/`spawn` with explicit args and piping without invoking a shell.
- Provide a cross-platform fallback when `tar`/`cp` is unavailable (e.g., revert to per-file materialization, or use a JS tar extractor, or gate this path behind a platform check).
- If the project is intentionally POSIX-only, document it clearly and fail fast with an actionable error message.
### Focus areas in code
- src/analyzers/react/diff.ts[1000-1016]
- src/analyzers/react/diff.ts[1043-1113]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Remediation recommended

3. CoW clone temp leak 🐞 Bug ⛯ Reliability
Description
cloneSnapshotWithOverlay() can throw after creating/cloning targetSnap (e.g., during git diff or
tar extraction), but there is no try/finally to delete targetSnap on failure.
materializeTwoGitTreeSnapshots() catches and falls back, so the leaked directory is easy to miss and
can accumulate over time.
Code

src/analyzers/react/diff.ts[R1049-1113]

+  const targetSnap = fs.mkdtempSync(
+    path.join(os.tmpdir(), 'foresthouse-react-diff-'),
+  )
+  fs.rmSync(targetSnap, { recursive: true })
+
+  // CoW clone (APFS on macOS, reflink on Btrfs/XFS)
+  if (process.platform === 'darwin') {
+    execFileSync('cp', ['-cR', sourceSnap, targetSnap], { stdio: 'ignore' })
+  } else {
+    execFileSync('cp', ['-a', '--reflink=auto', sourceSnap, targetSnap], {
+      stdio: 'ignore',
+    })
+  }
-  return snapshotRoot
-}
+  // Find changed files between the two trees
+  const diffOutput = execFileSync(
+    'git',
+    [
+      '-C',
+      repositoryRoot,
+      'diff',
+      '--name-status',
+      '--no-renames',
+      '-z',
+      targetTree,
+      sourceTree,
+    ],
+    { maxBuffer: GIT_EXEC_MAX_BUFFER, encoding: 'utf8' },
+  )
-function ensureSnapshotDirectory(snapshotRoot: string, filePath: string): void {
-  const parentDirectory = path.dirname(filePath)
+  if (diffOutput.length === 0) {
+    return targetSnap
+  }
-  if (parentDirectory === '.') {
-    return
+  // Parse null-separated output: status\0path\0status\0path\0...
+  const parts = diffOutput.split('\0')
+  const filesToDelete: string[] = []
+  const filesToRestore: string[] = []
+
+  for (let i = 0; i + 1 < parts.length; i += 2) {
+    const status = parts[i]
+    const file = parts[i + 1]
+
+    if (status === 'A') {
+      // Added in source (after) — doesn't exist in target (before)
+      filesToDelete.push(file)
+    } else if (status === 'D' || status === 'M') {
+      // Deleted or modified in source — restore target (before) version
+      filesToRestore.push(file)
+    }
}
-  fs.mkdirSync(path.join(snapshotRoot, ...parentDirectory.split('/')), {
-    recursive: true,
-  })
+  for (const file of filesToDelete) {
+    fs.rmSync(path.join(targetSnap, file), { force: true })
+  }
+
+  if (filesToRestore.length > 0) {
+    // Extract only the changed files from the before tree
+    const archive = execFileSync(
+      'git',
+      ['-C', repositoryRoot, 'archive', '--format=tar', targetTree, '--', ...filesToRestore],
+      { maxBuffer: GIT_EXEC_MAX_BUFFER },
+    )
+    execFileSync('tar', ['-x', '-C', targetSnap], { input: archive })
+  }
Evidence
targetSnap is created and then populated via cp; subsequent operations can throw, and no cleanup
is performed in cloneSnapshotWithOverlay before the error propagates to the caller's
catch/fallback.

src/analyzers/react/diff.ts[1019-1041]
src/analyzers/react/diff.ts[1043-1116]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`cloneSnapshotWithOverlay()` does not remove `targetSnap` when it fails mid-way, leaking temp directories.
## Issue Context
Failures can occur after cloning succeeds but before the function returns (e.g., `git diff` errors, `git archive` errors, `tar` extraction errors).
## Fix Focus Areas
- Wrap the body of `cloneSnapshotWithOverlay()` in a `try { ...; return targetSnap } catch (e) { fs.rmSync(targetSnap, { recursive: true, force: true }); throw e }`.
- Ensure the partially-created path is removed even if it currently does not exist (use `force: true`).
### Focus areas in code
- src/analyzers/react/diff.ts[1043-1116]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

ⓘ The new review experience is currently in Beta. Learn more

Grey Divider

Qodo Logo

async3619 and others added 2 commits March 28, 2026 15:46
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@codecov
Copy link
Copy Markdown

codecov bot commented Mar 28, 2026

Codecov Report

❌ Patch coverage is 71.73913% with 13 lines in your changes missing coverage. Please review.
✅ Project coverage is 65.97%. Comparing base (4ae9e65) to head (f1b39a7).
⚠️ Report is 3 commits behind head on dev.

Files with missing lines Patch % Lines
src/analyzers/react/diff.ts 71.73% 7 Missing and 6 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##              dev     #109      +/-   ##
==========================================
- Coverage   66.08%   65.97%   -0.11%     
==========================================
  Files          39       39              
  Lines        2586     2613      +27     
  Branches      857      862       +5     
==========================================
+ Hits         1709     1724      +15     
- Misses        520      526       +6     
- Partials      357      363       +6     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@codspeed-hq
Copy link
Copy Markdown
Contributor

codspeed-hq bot commented Mar 28, 2026

Merging this PR will improve performance by 59.35%

⚠️ Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

Open the report in CodSpeed to investigate

⚡ 2 improved benchmarks
✅ 16 untouched benchmarks

Performance Changes

Benchmark BASE HEAD Efficiency
full component and hook diff 25.8 ms 16.2 ms +59.35%
component-only diff 26.6 ms 16.9 ms +57.43%

Comparing async3619/debug-slow-dev (f1b39a7) with dev (4ae9e65)

Open in CodSpeed

@async3619 async3619 merged commit b38bf71 into dev Mar 28, 2026
5 checks passed
github-actions bot pushed a commit that referenced this pull request Mar 28, 2026
## [1.2.0-dev.1](v1.1.1-dev.1...v1.2.0-dev.1) (2026-03-28)

### Features

* **react:** add usage diff and benchmark coverage ([#107](#107)) ([4ae9e65](4ae9e65))

### Bug Fixes

* **react:** resolve builtin elements from namespace-imported styled-components ([#110](#110)) ([2a76067](2a76067))

### Performance

* **react:** optimize git tree materialization for diff ([#109](#109)) ([b38bf71](b38bf71))
@async3619
Copy link
Copy Markdown
Owner Author

🎉 This PR is included in version 1.2.0-dev.1 🎉

The release is available on:

Your semantic-release bot 📦🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant