
fix: store tree hash in checkpoint metadata for linkage resilience #840

Open
peyton-alt wants to merge 4 commits into main from peyton/ent-834-tree-hash-checkpoint-linkage

peyton-alt commented Apr 3, 2026

Summary

Store the commit's tree hash in checkpoint metadata during condensation. This is the CLI-side of the fix for #834.

Problem (#834)

When users rewrite git history (for example git rebase -i with reword, or git filter-branch), the Entire-Checkpoint trailer can be stripped from the commit message. The checkpoint data still exists on entire/checkpoints/v1, but the rewritten commit no longer points to it, so the user loses their agent label and session link on entire.io.

How tree hash fixes this

A commit's tree hash identifies its code content (the full file tree at that commit). History rewrites change the commit SHA and can strip the message, but as long as the rewrite leaves the files untouched, the tree hash stays the same (same code = same tree). By storing the tree hash during condensation, the web side can match rewritten commits to their original checkpoints via tree hash when no trailer is found.

Example: User commits with trailer → checkpoint stored with tree_hash: abc123 → user runs git rebase -i reword → new commit has different SHA, no trailer, but same tree hash abc123 → web webhook matches tree hash → linkage restored.
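
The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the web-side implementation: the Checkpoint, Commit, and Index types and the Resolve method are hypothetical names invented for this example. It assumes the trailer is preferred whenever it is present and the tree hash is consulted only when the trailer is missing.

```go
package main

import "fmt"

// Checkpoint is a hypothetical, simplified view of stored checkpoint metadata.
type Checkpoint struct {
	ID       string // value carried by the Entire-Checkpoint trailer
	TreeHash string // tree hash recorded at condensation time
}

// Commit is a hypothetical view of a commit seen by the webhook.
type Commit struct {
	SHA       string
	TrailerID string // empty when the trailer was stripped
	TreeHash  string
}

// Index keeps two lookups over the same set of checkpoints.
type Index struct {
	byID   map[string]Checkpoint
	byTree map[string]Checkpoint
}

// NewIndex builds both lookups; checkpoints without a tree hash
// (for example, ones written before this change) are only indexed by ID.
func NewIndex(cps []Checkpoint) *Index {
	idx := &Index{byID: map[string]Checkpoint{}, byTree: map[string]Checkpoint{}}
	for _, cp := range cps {
		idx.byID[cp.ID] = cp
		if cp.TreeHash != "" {
			idx.byTree[cp.TreeHash] = cp
		}
	}
	return idx
}

// Resolve uses the trailer when present and falls back to the
// tree hash only when the commit carries no trailer.
func (i *Index) Resolve(c Commit) (Checkpoint, bool) {
	if c.TrailerID != "" {
		cp, ok := i.byID[c.TrailerID]
		return cp, ok
	}
	cp, ok := i.byTree[c.TreeHash]
	return cp, ok
}

func main() {
	idx := NewIndex([]Checkpoint{{ID: "85df9ac94bc7", TreeHash: "abc123"}})

	// A reworded commit: new SHA, no trailer, same tree.
	rewritten := Commit{SHA: "new-sha", TreeHash: "abc123"}
	cp, ok := idx.Resolve(rewritten)
	fmt.Println(cp.ID, ok)
}
```

One caveat with this kind of lookup: two different commits can share a tree hash (for instance, a revert that restores an earlier state), so a real implementation would likely need a tie-breaking rule that this sketch does not model.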

What's in this PR (CLI-side)

  • Add tree_hash field to CommittedMetadata and WriteCommittedOptions
  • Populate tree hash from the HEAD commit during PostCommit condensation
  • Tree hash is written to metadata.json on entire/checkpoints/v1
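
For readers who want a picture of the metadata change, here is a small round-trip sketch. The struct below is a trimmed-down stand-in, not the real CommittedMetadata: only the tree_hash JSON key comes from this PR, while the checkpoint_id field and the omitempty option are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// committedMetadata is a hypothetical stand-in for CommittedMetadata.
// Only the tree_hash key is taken from the PR; the rest is assumed.
type committedMetadata struct {
	CheckpointID string `json:"checkpoint_id"`
	TreeHash     string `json:"tree_hash,omitempty"`
}

// encode serializes metadata the way it might appear in metadata.json.
func encode(m committedMetadata) (string, error) {
	b, err := json.Marshal(m)
	return string(b), err
}

// decode parses metadata; a missing tree_hash key yields an empty string.
func decode(s string) (committedMetadata, error) {
	var m committedMetadata
	err := json.Unmarshal([]byte(s), &m)
	return m, err
}

func main() {
	out, err := encode(committedMetadata{CheckpointID: "77773a25069e", TreeHash: "abc123"})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)

	// Metadata written before this change has no tree_hash key.
	old, err := decode(`{"checkpoint_id":"77773a25069e"}`)
	if err != nil {
		panic(err)
	}
	fmt.Printf("legacy tree hash: %q\n", old.TreeHash)
}
```

Under these assumptions, older metadata without the field still decodes cleanly, which is the situation the lazy backfill in the companion PR is meant to handle.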

What's in the companion PR (web-side, entireio/entire.io)

  • DB migration: tree_hash column on repo_checkpoints with index
  • Webhook: fallback tree hash lookup for commits without trailers
  • Webhook: lazy backfill tree hash for existing checkpoints

Also in this PR

Agent-initiated git revert and git cherry-pick commits now get a checkpoint trailer when an agent session is ACTIVE. Previously prepare-commit-msg unconditionally skipped adding the trailer during sequence operations. This is a separate improvement, not part of the #834 fix.
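
The new gate can be summarized as a two-input decision. The function below is a simplified model with a hypothetical name (shouldSkipTrailer); it covers only the sequence-operation check described above and ignores every other reason the hook might skip a commit.

```go
package main

import "fmt"

// shouldSkipTrailer models only the sequence-operation gate:
// skip when a rebase/cherry-pick/revert is in progress and no
// agent session is ACTIVE in the worktree. Hypothetical helper.
func shouldSkipTrailer(sequenceOp, activeSession bool) bool {
	return sequenceOp && !activeSession
}

func main() {
	cases := []struct {
		name                      string
		sequenceOp, activeSession bool
	}{
		{"agent revert", true, true},
		{"user revert", true, false},
		{"ordinary commit", false, false},
	}
	for _, c := range cases {
		fmt.Printf("%-16s skip=%v\n", c.name, shouldSkipTrailer(c.sequenceOp, c.activeSession))
	}
}
```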

Testing

  • TestWriteCommitted_IncludesTreeHash — tree hash stored and read back from metadata
  • TestShadowStrategy_PrepareCommitMsg_AgentRevertGetsTrailer — agent revert gets trailer
  • TestShadowStrategy_PrepareCommitMsg_UserRevertSkipped — user revert still skipped
  • Full CI passes (unit + integration + E2E canary)

🤖 Generated with Claude Code


Note

Medium Risk
Adds a new persisted tree_hash field to committed checkpoint metadata and changes git hook behavior during sequence operations, which could affect checkpoint linkage and when trailers are added to commits.

Overview
Improves checkpoint linkage resilience by persisting the condensed commit’s git tree hash (tree_hash) into session-level committed metadata (WriteCommittedOptions/CommittedMetadata) and wiring it through manual-commit condensation so rewritten commits can be re-associated even if the Entire-Checkpoint trailer is lost.

Adjusts manual-commit prepare-commit-msg behavior so rebase/cherry-pick/revert operations are only skipped when there is no ACTIVE agent session in the worktree; agent-initiated reverts now get checkpoint trailers while user-initiated ones remain unchanged. Adds focused tests covering tree_hash round-trip persistence and the new revert trailer behavior.

Written by Cursor Bugbot for commit 9164d8e.

peyton-alt and others added 2 commits April 2, 2026 15:03
When an agent runs git revert or cherry-pick as part of its work, the
commit should be checkpointed. Previously prepare-commit-msg
unconditionally skipped during sequence operations, making the agent's
work invisible to Entire.

Now checks for active sessions: if an agent session is ACTIVE, the
operation is agent-initiated and gets a trailer. If no active session,
it's user-initiated and is skipped as before.

Part of fix for #834.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 85df9ac94bc7
Add tree_hash field to committed checkpoint metadata. Records the git
tree hash of the commit being condensed, enabling fallback checkpoint
lookup by tree hash when the Entire-Checkpoint trailer is stripped by
git history rewrites (rebase, filter-branch, amend).

Part of fix for #834.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 77773a25069e
Copilot AI review requested due to automatic review settings April 3, 2026 00:25
@peyton-alt peyton-alt requested a review from a team as a code owner April 3, 2026 00:25

Copilot AI left a comment


Pull request overview

Stores a commit’s git tree hash in checkpoint metadata during condensation so the web side can re-link checkpoints after history rewrites that drop the Entire-Checkpoint trailer, and tweaks prepare-commit-msg behavior so agent-driven revert/cherry-pick operations still get checkpoint trailers when a session is active.

Changes:

  • Add tree_hash to committed checkpoint metadata (CommittedMetadata, WriteCommittedOptions) and populate it during PostCommit condensation from the HEAD commit’s tree hash.
  • Allow prepare-commit-msg to proceed during git sequence operations (revert/cherry-pick/rebase) when an ACTIVE session exists in the current worktree; keep skipping when no active session.
  • Add unit tests for tree_hash persistence and the new revert trailer behavior split by agent-active vs user/manual.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| cmd/entire/cli/strategy/manual_commit_test.go | Adds tests ensuring agent revert gets a trailer when session is ACTIVE and user revert is skipped when not active. |
| cmd/entire/cli/strategy/manual_commit_session.go | Adds helper to detect whether any session in the current worktree is ACTIVE. |
| cmd/entire/cli/strategy/manual_commit_hooks.go | Adjusts PrepareCommitMsg sequence-operation skip logic; passes commit tree hash into condensation options. |
| cmd/entire/cli/strategy/manual_commit_condensation.go | Threads treeHash through condensation options into committed checkpoint write options. |
| cmd/entire/cli/checkpoint/committed.go | Writes TreeHash into per-session committed metadata.json. |
| cmd/entire/cli/checkpoint/checkpoint.go | Extends checkpoint option/metadata structs to include TreeHash serialized as tree_hash. |
| cmd/entire/cli/checkpoint/checkpoint_test.go | Adds coverage that tree_hash is written and read back from committed metadata. |

Comment on lines +844 to +848
metaDir := filepath.Join(".entire", "metadata", "agent-revert-session")
require.NoError(t, os.MkdirAll(filepath.Join(dir, metaDir), 0o755))
transcript := `{"type":"human","message":{"content":"revert the change"}}` + "\n" +
`{"type":"assistant","message":{"content":"I'll revert that"}}` + "\n"
require.NoError(t, os.WriteFile(filepath.Join(dir, metaDir, "full.jsonl"), []byte(transcript), 0o644))

Copilot AI Apr 3, 2026


In this test, metaDir is built with filepath.Join and then passed as StepContext.MetadataDir. MetadataDir is a repo-relative path used for git tree entries/commit trailers, so it needs forward slashes regardless of OS. Using filepath.Join will produce backslashes on Windows, which can break shadow-branch metadata paths and make the test (and any code paths it exercises) behave differently across platforms. Consider keeping MetadataDir as a slash-separated string (e.g., ".entire/metadata/") and deriving an absolute filesystem path separately (e.g., filepath.FromSlash + filepath.Join).

Suggested change
- metaDir := filepath.Join(".entire", "metadata", "agent-revert-session")
- require.NoError(t, os.MkdirAll(filepath.Join(dir, metaDir), 0o755))
- transcript := `{"type":"human","message":{"content":"revert the change"}}` + "\n" +
- `{"type":"assistant","message":{"content":"I'll revert that"}}` + "\n"
- require.NoError(t, os.WriteFile(filepath.Join(dir, metaDir, "full.jsonl"), []byte(transcript), 0o644))
+ metaDir := ".entire/metadata/agent-revert-session"
+ metaDirFS := filepath.Join(dir, filepath.FromSlash(metaDir))
+ require.NoError(t, os.MkdirAll(metaDirFS, 0o755))
+ transcript := `{"type":"human","message":{"content":"revert the change"}}` + "\n" +
+ `{"type":"assistant","message":{"content":"I'll revert that"}}` + "\n"
+ require.NoError(t, os.WriteFile(filepath.Join(metaDirFS, "full.jsonl"), []byte(transcript), 0o644))


@cursor cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.


)
return nil
}
logging.Debug(logCtx, "prepare-commit-msg: sequence operation with active session, proceeding",

Agent revert trailer added but condensation skipped in PostCommit

Medium Severity

PrepareCommitMsg now adds a checkpoint trailer during agent-initiated revert/cherry-pick (active session + sequence operation), but the PostCommit handler still sets IsRebaseInProgress: true for all sequence operations via isGitSequenceOperation(ctx). Since REVERT_HEAD/CHERRY_PICK_HEAD remain present during the post-commit hook, the state machine receives IsRebaseInProgress: true and emits no ActionCondense, so no checkpoint data is ever written. The commit ends up with a dangling Entire-Checkpoint trailer pointing to a nonexistent checkpoint ID.
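
The mismatch described here can be reproduced with a toy model of the two hooks. This is an illustration of the reviewer's reasoning only, not the project's state machine: the hookInputs type and all three functions are hypothetical, and each models just the sequence-operation gate while ignoring every other condition.

```go
package main

import "fmt"

// hookInputs is a hypothetical snapshot of what both hooks observe.
type hookInputs struct {
	SequenceOp    bool // a revert, cherry-pick, or rebase is in progress
	ActiveSession bool // an agent session is ACTIVE in the worktree
}

// prepareProceeds models the prepare-commit-msg gate after this PR:
// sequence operations pass only when a session is ACTIVE.
func prepareProceeds(in hookInputs) bool {
	return !in.SequenceOp || in.ActiveSession
}

// postCondenses models post-commit as the review describes it:
// any sequence operation suppresses condensation.
func postCondenses(in hookInputs) bool {
	return !in.SequenceOp
}

// danglingTrailer reports the case where a trailer could be written
// but no checkpoint data would be condensed for it.
func danglingTrailer(in hookInputs) bool {
	return prepareProceeds(in) && !postCondenses(in)
}

func main() {
	for _, in := range []hookInputs{
		{SequenceOp: true, ActiveSession: true},
		{SequenceOp: true, ActiveSession: false},
		{SequenceOp: false, ActiveSession: true},
	} {
		fmt.Printf("%+v dangling=%v\n", in, danglingTrailer(in))
	}
}
```

In this model the two gates disagree only for an active session during a sequence operation, which is the case the review flags; one way to avoid it would be for both hooks to share a single predicate.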


peyton-alt and others added 2 commits April 2, 2026 17:36
Keep both the PR 826 fail-closed test and our new revert tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: 1acddcbdf747
- Add debug logging to hasActiveSessionInWorktree error paths
- Remove unrelated files (greetings.md, agent configs) from PR

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
peyton-alt added a commit that referenced this pull request Apr 3, 2026
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
pjbgf added a commit that referenced this pull request Apr 3, 2026
Add checkpoint linkage preservation after history rewrites (#840) and
fail-closed content detection in prepare-commit-msg (#826). Update
release date to 2026-04-03.

Signed-off-by: Paulo Gomes <paulo@entire.io>
Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 0154164ac0f9