Skip to content

Conversation

ali90h
Copy link

@ali90h ali90h commented Aug 18, 2025

What

Add library-level support for --ignore-rev in gix-blame:

  • Options gains a pub(crate) field ignored_revs: Option<HashSet<ObjectId>>
  • New public builder: Options::with_ignored_revisions(..)
  • Transparent attribution to the nearest non-ignored ancestor (no post-filtering)

Why

Enable ignore-revision semantics at the API level while preserving existing blame behavior.

How

  • Effective-commit resolution performed at output time
  • Walk through consecutive ignored ancestors transparently
  • No changes to rename/rewrite detection or parent selection rules

Verification

  • cargo clippy -p gix-blame --all-targets -- -D warnings
  • cargo test -p gix-blame (all new + existing tests green) ✅
  • Public API: only one new public item Options::with_ignored_revisions(..)

Git oracle (reference)

Before vs After git blame --ignore-rev 94feb0c5e3086e1210d0f722ee76b25b25bc921b shows re-attribution from B → A.

Scope

gix-blame only; no MSRV bump; no release changes; minimal diff.

AI Disclosure

  • AI-Assisted: Claude Code
  • Agent Mode: No (no autonomous actions)
  • Multi-line Completions: Yes — some suggestions were pasted, then edited and reviewed by me
    all code human-edited and verified with clippy & the test suite

Part of #2064 (library side).

@Byron
Copy link
Member

Byron commented Aug 19, 2025

Thanks for giving it a shot!

Could you please disclose the use of AI in this PR? Is it agent mode and/or multi-line completions?
I am also trying to figure out how Git trailers can be used for marking these, but didn't find the right resources yet.

Thanks again.

@ali90h
Copy link
Author

ali90h commented Aug 19, 2025

Thanks for the review, Byron.

I used Claude Code for planning and for suggesting some Rust code blocks.
I pasted a few suggestions and then edited/reviewed them myself, and verified the result with clippy and the test suite.
No agent mode or autonomous actions were used.

@ali90h ali90h force-pushed the fix/blame-ignore-rev-core branch from dd3b287 to 9700b2a Compare August 19, 2025 06:14
@ali90h
Copy link
Author

ali90h commented Aug 19, 2025

Note: This PR is Phase-1 (library). Phase-2 (CLI flags --ignore-rev / --ignore-revs-file) will follow in a separate PR (Refs #2064).

@ali90h ali90h marked this pull request as ready for review August 19, 2025 06:32
@cruessler
Copy link
Contributor

Thanks for the PR!

How do you prefer feedback? I can either start early (which would probably mean sometime in the next few days) or wait until you mark the PR as ready.

@ali90h
Copy link
Author

ali90h commented Aug 19, 2025

Thanks for the PR review offer!
The PR is ready for review now — early feedback is welcome.
Scope: library-only for gix-blame; no MSRV change; the only new public API is Options::with_ignored_revisions(..) and ignored_revs stays pub(crate).
What I’d love feedback on:
– Correctness of skipping ignored commits (long chains, merges, root)
– Placement of “effective commit” resolution at output time
– API shape/naming & visibility
– Test coverage (8 new tests)
Phase-2 (CLI flags --ignore-rev / --ignore-revs-file) is tracked separately and will follow.
Thanks!

ali90h added 5 commits August 19, 2025 18:36
AI-Assisted: Claude Code
AI-Agent-Mode: no
AI-MultiLine-Completions: yes (pasted then edited/reviewed)
human-reviewed; clippy+tests green:
@ali90h ali90h force-pushed the fix/blame-ignore-rev-core branch from 9700b2a to bee12c4 Compare August 19, 2025 15:37
@Byron Byron requested a review from Copilot August 20, 2025 14:27
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds library-level support for --ignore-rev functionality to gix-blame. The feature allows users to specify a set of commit revisions to ignore during blame attribution, transparently passing blame to the nearest non-ignored ancestor.

  • Add ignored_revs field to Options struct with public builder method with_ignored_revisions()
  • Implement effective commit resolution by walking back through ignored commits
  • Refactor existing test code to use builder pattern for Options initialization

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
gix-blame/src/types.rs Add ignored_revs field and with_ignored_revisions() builder method to Options
gix-blame/src/file/function.rs Add find_effective_commit() function and integrate ignore revision logic into blame attribution
gix-blame/tests/blame.rs Add comprehensive test coverage and refactor existing tests to use builder pattern
gix-blame/tests/ignore_revisions_perf.rs Add performance test to ensure zero-cost abstraction
tests/it/src/commands/blame_copy_royal.rs Refactor to use builder pattern for Options
src/plumbing/main.rs Refactor to use builder pattern for Options

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

}

let mut current = commit_id;
let mut visited = std::collections::HashSet::new();
Copy link
Preview

Copilot AI Aug 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using a SmallVec or similar optimized collection instead of HashSet for cycle detection. Most commit ancestry chains are relatively short, and a simple vector with linear search might be more efficient for small sets.

Suggested change
let mut visited = std::collections::HashSet::new();
let mut visited: SmallVec<[gix_hash::ObjectId; 8]> = SmallVec::new();

Copilot uses AI. Check for mistakes.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I intentionally used a HashSet here.
The hot operation is membership checks during the DAG walk; HashSet keeps them ~O(1), while SmallVec would be O(n) and can degrade on longer ignore chains / merges.
As also noted in my analysis, the extra overhead is negligible compared to I/O, and the cycle set is unbounded in principle.
I’d keep HashSet for predictable worst-case behavior.

// In a more complete implementation, we might need to handle multiple parents differently
current = parent_ids[0].0;
}

Copy link
Preview

Copilot AI Aug 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment highlights incomplete merge commit handling. The current implementation only follows the first parent, which may not provide correct blame attribution for lines that originated from non-first parents in merge commits. Consider documenting this limitation more prominently or implementing proper multi-parent traversal.

Suggested change
// Use a stack for DFS traversal of parents
let mut stack = vec![commit_id];
while let Some(current) = stack.pop() {
if !ignored_set.contains(&current) {
return Ok(Some(current));
}
// Prevent infinite loops
if !visited.insert(current) {
continue;
}
last_commit = current; // Track the current commit
let commit = find_commit(cache, odb, &current, buf)?;
let parent_ids: ParentIds = collect_parents(commit, odb, cache, &mut local_buf)?;
if parent_ids.is_empty() {
// Root commit - can't go further back
// For ignored root commits, we have no choice but to attribute to the root itself
return Ok(Some(current));
}
// Push all parents onto the stack for further traversal
for parent in parent_ids.iter() {
if !visited.contains(&parent.0) {
stack.push(parent.0);
}
}
if !visited.contains(&parent.0) && !ignored_set.contains(&parent.0) {
return Ok(Some(parent.0));
}
}
// If all parents are ignored, add them to the queue for further traversal.
// Use a stack for DFS or a queue for BFS; here we use a stack for DFS.
// Push all parents onto the stack for further processing.
// Remove the current = parent_ids[0].0; line and instead use a stack.
// We'll need to refactor the loop to use a stack.
// Refactor loop to use a stack for multi-parent traversal.
// (See below for full loop refactor.)

Copilot uses AI. Check for mistakes.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is true that the current code only follows the first parent, but that’s deliberate here. This function’s job is to skip ignored commits and return the first non-ignored ancestor—not to do a full multi-parent walk. The proposed rewrite adds a lot of complexity, includes dead/commented code, introduces an early-return bug that short-circuits the stack walk, and can make results depend on parent order. In the blame context, simple and deterministic beats exhaustive, and Git’s first-parent precedence fits that.
I'd keep the current approach.
If we ever need multi-parent support, we should add a targeted, deterministic traversal with clear semantics and tests.

}
// If we couldn't find an effective commit, don't create a blame entry
Copy link
Preview

Copilot AI Aug 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code path silently drops blame entries when no effective commit is found. Consider logging this scenario or adding a comment explaining when this would occur and whether it's expected behavior.

Copilot uses AI. Check for mistakes.

Copy link
Author

@ali90h ali90h Aug 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok(None) means all reachable ancestors are ignored; we intentionally omit a blame entry to keep attribution deterministic and avoid manufacturing a misleading entry. Added an explicit comment. Done in ali90h/gitoxide@26caa1b64

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants