fix: grep false negatives, output mangling, and truncation annotations #791
Open
BadassBison wants to merge 2 commits into rtk-ai:master from
Conversation
Three fixes for issues causing AI agents to burn tokens on retry loops:

1. grep: add `--no-ignore` to the rg invocation (src/grep_cmd.rs)

   rg respects .gitignore by default while `grep -r` does not. In large monorepos (~83K files), this caused rg to return 0 matches for files in gitignored directories, producing false negatives. AI agents then concluded files/methods didn't exist and drew wrong conclusions.

2. grep: passthrough for small result sets (src/grep_cmd.rs)

   Results with <=50 matches now output raw `file:line:content` format instead of the grouped "X matches in Y files:" format. The grouped format confused AI agents, which couldn't parse it, triggering 2-4 retry attempts per search (each burning 500-2000 tokens). The grouped format is preserved for >50 matches, where the token savings matter.

3. smart_truncate: clean truncation without annotations (src/filter.rs)

   Replaced the "smart" truncation that inserted synthetic `// ... N lines omitted` comment markers throughout file content. AI agents treated these as actual code, got confused, and retried with alternative commands. New behavior: clean first-N-lines truncation with `[X more lines]` at the end only.

Evidence from a real session (1,633 RTK commands):

- grep had the lowest savings rate (9.3%) but the highest retry cost
- ~15 retry patterns observed, each 2-4 extra tool calls
- ~10 false-negative searches led to wrong analysis conclusions
- Estimated 200-500K tokens burned on retries
- Net token impact was negative for grep-heavy workflows
Collaborator
📊 Automated PR Analysis
Summary

Fixes three issues in grep and smart_truncate that caused AI agents to waste tokens on retry loops: adds `--no-ignore` to rg so gitignored files aren't silently skipped, passes through raw grep output for small result sets (<=50 matches) instead of a grouped format that confused AI parsers, and replaces synthetic `// ... N lines omitted` truncation markers with clean first-N-lines truncation plus a single `[X more lines]` suffix.

Review Checklist
Analyzed automatically by wshm · This is an automated analysis, not a human review.
Update all docs referencing grep's output strategy to reflect the new
behavior: raw passthrough for <=50 matches, grouped format for >50.
Files updated: CLAUDE.md, README.md, README_{fr,es,ja,ko,zh}.md,
INSTALL.md, ARCHITECTURE.md, docs/AUDIT_GUIDE.md, docs/FEATURES.md
Summary
Fixes three issues where RTK's output filtering causes AI agents (Claude Code) to burn extra tokens on retry loops, producing net-negative token impact during analysis-heavy workflows.
- `--no-ignore` added to rg — prevents false negatives in repos with a `.gitignore`
- Raw `file:line:content` format AI agents can parse for small result sets
- No more `// ... N lines omitted` annotations that break AI parsing

Problem
Observed in a real session across a large Rails monorepo (~83K files, 1,633 RTK commands):
- `rg` respects `.gitignore` by default, `grep -r` doesn't
- Grouped "217 matches in 1 file:" format AI agents couldn't parse
- Synthetic `// ... 81 lines omitted` markers in file reads — `smart_truncate` inserts synthetic comment markers

Quantified impact: grep had the lowest savings rate (9.3%) but the highest retry cost. Estimated 200-500K tokens burned on retries across ~15 retry patterns, each requiring 2-4 extra tool calls.
Evidence
Screenshots from real Claude Code sessions showing the retry pattern:
Session metrics:
The retry loop in detail
When Claude runs `grep -rn "def test_" apps/` through RTK, the output gets reformatted into the grouped "X matches in Y files:" layout. Claude can't extract the data it needs from this format. It then retries with workarounds:
Each retry burns 500-2000 tokens. Across 15 instances in a single session, this adds up to 200-500K wasted tokens.
False negatives — the most damaging case
RTK's grep returned "0 matches" for patterns that actually existed because `rg` respects `.gitignore` while `grep -r` does not. In a large monorepo, this caused:

Truncation annotations break file parsing
When `head -5 file.rb` gets rewritten to `rtk read file.rb --max-lines 5`, the old `smart_truncate` function inserted synthetic comment markers like `// ... 79 lines omitted` in the middle of the output. AI agents treated these as actual file content, got confused about the file structure, and retried with alternative commands (Read tool, python, etc.), doubling the token cost.

Changes
1. src/grep_cmd.rs — `--no-ignore` flag

Added `--no-ignore` to the `rg` invocation so it doesn't skip files listed in `.gitignore`. This matches `grep -r` behavior and eliminates false negatives in repos where test files, build artifacts, or generated code live in gitignored directories.
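A minimal sketch of what the change amounts to, assuming the invocation is built with `std::process::Command` (the helper name, flag set, and argument order here are illustrative, not the exact code in src/grep_cmd.rs):

```rust
use std::process::Command;

/// Hypothetical helper mirroring the fixed rg invocation.
fn build_rg_command(pattern: &str, path: &str) -> Command {
    let mut cmd = Command::new("rg");
    cmd.arg("--line-number")
        // Match `grep -r` semantics: do not skip files listed in
        // .gitignore / .ignore / .rgignore.
        .arg("--no-ignore")
        .arg(pattern)
        .arg(path);
    cmd
}
```

The narrower `--no-ignore-vcs` would also be accepted by rg; `--no-ignore` is used here because the stated goal is to match `grep -r` exactly.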
2. src/grep_cmd.rs — Passthrough for small results

Results with <=50 matches now output raw `file:line:content` format (standard grep output that AI agents already know how to parse). The grouped "X matches in Y files:" format is preserved only for >50 matches, where token savings are meaningful. For small result sets, the token savings from grouping are negligible (~9.3%) but the retry cost from mangling is high (500-2000 tokens per retry).
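The selection logic can be sketched roughly like this (a simplified model; the real formatting code in src/grep_cmd.rs handles more detail, and the 50-match threshold is the only number taken from the PR):

```rust
use std::collections::BTreeSet;

const PASSTHROUGH_LIMIT: usize = 50;

/// Sketch: raw file:line:content passthrough for small result sets,
/// grouped summary only when grouping meaningfully saves tokens.
fn format_matches(matches: &[(String, usize, String)]) -> String {
    if matches.len() <= PASSTHROUGH_LIMIT {
        matches
            .iter()
            .map(|(file, line, text)| format!("{file}:{line}:{text}"))
            .collect::<Vec<_>>()
            .join("\n")
    } else {
        // Grouped header preserved for large result sets; the per-file
        // listing that follows it is elided in this sketch.
        let files: BTreeSet<&String> = matches.iter().map(|(f, _, _)| f).collect();
        format!("{} matches in {} files:", matches.len(), files.len())
    }
}
```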
3. src/filter.rs — Clean truncation in `smart_truncate`

Replaced the "smart" truncation logic that scattered `// ... N lines omitted` markers throughout file content with clean first-N-lines truncation. A single `[X more lines]` marker appears at the end only. The old annotations were treated as actual code by AI agents, causing parsing confusion and retry loops.

Why this fix
`--no-ignore` on rg: tracing the exact code path

When Claude runs `grep "def test_" apps/tportal/test/selenium/`: if any files under that path are covered by a `.gitignore` entry, `rg` silently skips them. `grep -r` would not. The output is empty, and grep_cmd.rs:60-65 prints "0 matches".

Four alternative explanations were considered and ruled out:
- Regex dialect differences (BRE `\|` vs rg's `|`): the user's patterns (`def test_`, simple strings) have no BRE/PCRE divergence.
- A wrong path: both `rg` and `grep` would fail on a bad path, but the files were confirmed to exist.
- Argument parsing failure: `grep_cmd::run()` executes — meaning Clap succeeded.
- Explicit-file-path behavior: when `rg` is given an explicit file path, it does NOT apply gitignore rules — only during directory traversal. So the false negatives specifically affect directory searches.

Only the gitignore explanation produces the observed behavior:
`rg` returns empty stdout on a directory search where `grep -r` returns matches.
--no-ignoreis broad — it disables.gitignore,.ignore, AND.rgignore. This meansrgwill now traversenode_modules/,vendor/, etc., which is potentially slower. A more surgical option is--no-ignore-vcs(only disables.gitignore/.hgignore). However, the goal is to matchgrep -rbehavior exactly, andgrep -ralso traversesnode_modules. Performance-wise,rgis fast enough that the hit is negligible compared to the cost of a false negative (2-4 retry commands at 500-2000 tokens each).Passthrough for <=50 matches: the quantitative argument
For a 10-match result (typical small search):
For a 200-match result (large search):
The crossover point is where savings exceed retry cost. From the session data, if even 30% of small-result grep calls trigger retries:
At 50 matches (~2,500 tokens raw, ~1,500 grouped, ~1,000 saved), savings-per-call roughly equals single-retry cost. Below 50, retry risk outweighs savings. Above 50, savings dominate.
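As a sanity check, the threshold argument can be written as a tiny expected-value model (the function and the plugged-in numbers below are illustrative assumptions, not measured code):

```rust
/// Expected net token savings of grouped output for one grep call:
/// tokens saved by grouping, minus retry probability times retry cost.
fn expected_net_savings(saved: f64, p_retry: f64, retry_cost: f64) -> f64 {
    saved - p_retry * retry_cost
}
```

With illustrative numbers: saving 100 tokens but risking a 30% chance of a 1,250-token retry nets negative, while saving 1,000 tokens with low retry risk stays positive, which is the crossover the 50-match threshold targets.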
Why not disable grep filtering entirely? Because large results DO benefit from grouping. The user's `rtk find` saved 3.2M tokens at 73%. Filtering works when there's genuinely large output. The fix targets the specific range where it's counterproductive.

Clean truncation: the old behavior was fundamentally broken
The old code inserted `// ... 87 lines omitted` as a code comment mid-file. This is wrong for two independent reasons:

Language mismatch. The marker uses `//` comment syntax, but truncated files could be Ruby (`#` comments), Python (`#`), YAML (`#`), Shell (`#`), etc. An AI reading a Ruby file sees `// ... 87 lines omitted` and interprets it as invalid syntax, not a truncation marker.

Unpredictable placement. The old code scattered annotations throughout the output based on "structural importance" heuristics. An AI agent encounters these markers at unpredictable positions, breaking any line-by-line parsing logic.
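A sketch of the replacement behavior (hypothetical signature; the actual `smart_truncate` in src/filter.rs may carry more parameters):

```rust
/// Keep the first `max_lines` lines and append a single
/// `[X more lines]` marker only when content was actually cut.
fn smart_truncate(content: &str, max_lines: usize) -> String {
    let lines: Vec<&str> = content.lines().collect();
    if lines.len() <= max_lines {
        return content.to_string();
    }
    let omitted = lines.len() - max_lines;
    format!("{}\n[{} more lines]", lines[..max_lines].join("\n"), omitted)
}
```

Because the marker is appended after the kept lines rather than spliced into them, it cannot be mistaken for a comment in any language.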
The new `[X more lines]` format is:

Why not truncate silently with no marker? An AI agent needs to know truncation occurred. Without a marker, it might assume it saw the full file and draw incorrect conclusions. `[X more lines]` communicates truncation without being confused with file content.

What was NOT changed (and why)
- `sed` is already in `IGNORED_PREFIXES` — never rewritten by RTK. The user's sed issues are downstream effects of grep/cat problems.
- `cat` → `rtk read` still uses `FilterLevel::Minimal` — the minimal filter (strips blank lines/comments) had 28.8% savings across 214 calls (~1.5M tokens saved). The user's complaint was about the `// ... omitted` annotations (fixed), not the filtering itself.

Tests
- `test_smart_truncate_no_annotations` — verifies no `// ...` markers in output
- `test_smart_truncate_no_truncation_when_under_limit` — no truncation when content fits
- `test_smart_truncate_exact_limit` — edge case at exact line count
- `test_rg_no_ignore_flag_accepted` — verifies rg accepts the new flag
- `cargo fmt --all && cargo clippy --all-targets && cargo test --all`
- Verify `rtk grep "fn run" src/` with <50 results outputs raw `file:line:content` format
- Verify `rtk read src/main.rs --max-lines 5` shows clean truncation without `// ...` markers
- Verify grep finds matches in gitignored directories