
fix: grep false negatives, output mangling, and truncation annotations#791

Open
BadassBison wants to merge 2 commits into rtk-ai:master from BadassBison:fix/grep-false-negatives-and-truncation-annotations

Conversation


BadassBison commented Mar 23, 2026

Summary

Fixes three issues where RTK's output filtering causes AI agents (Claude Code) to burn extra tokens on retry loops, producing net-negative token impact during analysis-heavy workflows.

  • grep: add --no-ignore to rg — prevents false negatives in repos with .gitignore
  • grep: passthrough for small results (<=50 matches) — preserves standard file:line:content format AI agents can parse
  • smart_truncate: clean truncation — removes synthetic // ... N lines omitted annotations that break AI parsing

Problem

Observed in a real session across a large Rails monorepo (~83K files, 1,633 RTK commands):

| Issue | Root Cause | Impact |
| --- | --- | --- |
| grep returns "0 matches" for existing files | rg respects .gitignore by default, grep -r doesn't | ~10 false negatives led to wrong analysis conclusions |
| grep output in "217 matches in 1F:" format | Always reformatted, even for 4 matches | AI agents can't parse it, retry 2-4 times each |
| `// ... 81 lines omitted` in file reads | smart_truncate inserts synthetic comment markers | AI treats annotations as code, retries with alternative commands |

Quantified impact: grep had the lowest savings rate (9.3%) but the highest retry cost. Estimated 200-500K tokens burned on retries across ~15 retry patterns, each requiring 2-4 extra tool calls.

Evidence

Screenshots from real Claude Code sessions showing the retry pattern:

  • Claude detects mangled output: "RTK is eating the grep results. Let me use find + xargs instead"
  • Claude switches to Python: "RTK is interfering with grep. Let me use a direct python approach"
  • Claude retries sed: "The sed didn't work with RTK. Let me use python"

Session metrics:

Total RTK commands:    1,633
Tokens saved by RTK:   5.5M (44.5%)
Est. tokens on retries: 200-500K
grep savings rate:     9.3% (lowest of all commands)
grep retry instances:  ~15 (highest retry cost)

The retry loop in detail

When Claude runs grep -rn "def test_" apps/ through RTK, the output gets reformatted to:

217 matches in 1F:

[file] /.../some_file.rb (217):
     2: [] 1420, appfolio-developers
    24: [] 1496, appfolio-developers

Claude can't extract the data it needs from this format. It then retries with workarounds:

# Attempt 1 (rewritten by RTK, output mangled):
grep -rn "def test_" apps/tportal/test/selenium/

# Attempt 2 (Claude tries find + xargs to bypass RTK):
find apps/tportal/test -name "*.rb" -exec grep -l "def test_" {} \;

# Attempt 3 (Claude switches to python):
python3 -c "import os, re; ..."

# Finally gets actual results on attempt 3-4

Each retry burns 500-2000 tokens. Across 15 instances in a single session, this adds up to 200-500K wasted tokens.

False negatives — the most damaging case

RTK's grep returned "0 matches" for patterns that actually existed because rg respects .gitignore while grep -r does not. In a large monorepo, this caused:

  • 13 test mutes marked as "file not found" when 10 actually existed in different subdirectories
  • Hours of rework to correct the false conclusions
  • Multiple agent retries across 10 parallel subagents, each hitting the same filtering issue

Truncation annotations break file parsing

When head -5 file.rb gets rewritten to rtk read file.rb --max-lines 5, the old smart_truncate function inserted synthetic comment markers like // ... 79 lines omitted in the middle of the output. AI agents treated these as actual file content, got confused about the file structure, and retried with alternative commands (Read tool, python, etc.), doubling the token cost.

Changes

1. src/grep_cmd.rs — --no-ignore flag

Added --no-ignore to the rg invocation so it doesn't skip files listed in .gitignore. This matches grep -r behavior and eliminates false negatives in repos where test files, build artifacts, or generated code live in gitignored directories.
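A minimal sketch of the changed invocation; the function name and exact argument order here are illustrative, not RTK's actual code:

```rust
use std::process::Command;

/// Hypothetical sketch of the rg invocation built in grep_cmd.rs.
fn build_rg_command(pattern: &str, path: &str) -> Command {
    let mut cmd = Command::new("rg");
    cmd.arg("-n")            // line numbers, as in `grep -n`
        .arg("--no-heading") // one `file:line:content` record per match
        .arg("--no-ignore")  // the fix: also search gitignored files, like `grep -r`
        .arg(pattern)
        .arg(path);
    cmd
}
```

Building the command without spawning it keeps the flag change easy to unit-test.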

2. src/grep_cmd.rs — Passthrough for small results

Results with <=50 matches now output raw file:line:content format (standard grep output that AI agents already know how to parse). The grouped "X matches in Y files:" format is preserved only for >50 matches where token savings are meaningful. For small result sets, the token savings from grouping are negligible (~9.3%) but the retry cost from mangling is high (500-2000 tokens per retry).
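The threshold branch can be sketched as follows (the constant and function names are hypothetical; the grouped-format stand-in is a placeholder for RTK's real grouping logic):

```rust
/// Illustrative threshold; RTK hardcodes 50.
const PASSTHROUGH_THRESHOLD: usize = 50;

/// Sketch: emit raw `file:line:content` lines for small result sets,
/// use a grouped summary only when grouping actually saves tokens.
fn format_grep_output(raw_lines: &[String]) -> String {
    if raw_lines.len() <= PASSTHROUGH_THRESHOLD {
        // Passthrough: exactly what rg printed, no reformatting.
        raw_lines.join("\n")
    } else {
        // Stand-in for RTK's real grouped "X matches in Y files:" format.
        format!("{} matches:\n{}", raw_lines.len(), raw_lines.join("\n"))
    }
}
```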

3. src/filter.rs — Clean truncation in smart_truncate

Replaced the "smart" truncation logic that scattered " // ... N lines omitted" markers throughout file content with clean first-N-lines truncation. A single [X more lines] marker appears at the end only. The old annotations were treated as actual code by AI agents, causing parsing confusion and retry loops.
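The new behavior can be sketched as below, assuming a signature of this shape (the real smart_truncate in filter.rs may take different parameters):

```rust
/// Sketch of clean first-N-lines truncation with a single trailing marker.
fn smart_truncate(content: &str, max_lines: usize) -> String {
    let lines: Vec<&str> = content.lines().collect();
    if lines.len() <= max_lines {
        return content.to_string(); // fits entirely: no marker at all
    }
    let omitted = lines.len() - max_lines;
    // First N lines verbatim, then one metadata marker at the end only.
    format!("{}\n[{} more lines]", lines[..max_lines].join("\n"), omitted)
}
```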

Why this fix

--no-ignore on rg: tracing the exact code path

When Claude runs grep "def test_" apps/tportal/test/selenium/:

grep "def test_" apps/tportal/test/selenium/
  → hook rewrites to: rtk grep "def test_" apps/tportal/test/selenium/
  → Clap parses: pattern="def test_", path="apps/tportal/test/selenium/"
  → grep_cmd::run() executes: rg -n --no-heading "def test_" apps/tportal/test/selenium/

If any files under that path are covered by a .gitignore entry, rg silently skips them. grep -r would not. The output is empty, and grep_cmd.rs:60-65 prints "0 matches".

Four alternative explanations were considered and ruled out:

  1. Regex difference (BRE vs PCRE): Already handled on line 26 (BRE \| is translated to |). The user's patterns (def test_, simple strings) have no BRE/PCRE divergence.
  2. Wrong cwd: RTK doesn't change cwd. Both rg and grep would fail on a bad path, but the files were confirmed to exist.
  3. Clap parse failure falling through to raw grep: This would give correct results (raw grep runs in the fallback path), not false negatives. The false negative only happens when grep_cmd::run() executes — meaning Clap succeeded.
  4. Explicit file path vs directory search: When rg is given an explicit file path, it does NOT apply gitignore rules — only during directory traversal. So the false negatives specifically affect directory searches.

Only the gitignore explanation produces the observed behavior: rg returns empty stdout on a directory search where grep -r returns matches.

Honest tradeoff: --no-ignore is broad — it disables .gitignore, .ignore, AND .rgignore. This means rg will now traverse node_modules/, vendor/, etc., which is potentially slower. A more surgical option is --no-ignore-vcs (only disables .gitignore/.hgignore). However, the goal is to match grep -r behavior exactly, and grep -r also traverses node_modules. Performance-wise, rg is fast enough that the hit is negligible compared to the cost of a false negative (2-4 retry commands at 500-2000 tokens each).

Passthrough for <=50 matches: the quantitative argument

For a 10-match result (typical small search):

  • Raw output: ~500 tokens (10 lines × ~50 tokens)
  • Grouped format: ~450 tokens (headers + formatting + content)
  • Savings: ~50 tokens (10%)
  • Cost of ONE retry when agent can't parse grouped format: 500-2000 tokens
  • Net: -450 to -1950 tokens

For a 200-match result (large search):

  • Raw output: ~10,000 tokens
  • Grouped format: ~3,000 tokens
  • Savings: ~7,000 tokens (70%)
  • Cost of one retry: ~1,500 tokens
  • Net: +5,500 tokens even with a retry

The crossover point is where savings exceed retry cost. From the session data, if even 30% of small-result grep calls trigger retries:

142 grep calls × 30% retry rate = ~43 retries
43 retries × 1,000 tokens avg = 43,000 tokens burned
142 calls × 9.3% savings × ~500 avg tokens = ~6,600 tokens saved
Net: -36,400 tokens

At 50 matches (~2,500 tokens raw, ~1,500 grouped, ~1,000 saved), savings-per-call roughly equals single-retry cost. Below 50, retry risk outweighs savings. Above 50, savings dominate.
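The worked examples above can be re-checked with a one-line model. The savings fractions (10%, 40%, 70%) and retry costs are the estimates quoted above, not measurements:

```rust
/// Net token impact of grouping one result set: tokens saved by the
/// grouped format minus the cost of a parse-failure retry.
fn net_tokens(raw_tokens: f64, savings_fraction: f64, retry_cost: f64) -> f64 {
    raw_tokens * savings_fraction - retry_cost
}
```

Plugging in the three cases: 10 matches (500 raw, 10% savings, one 1,000-token retry) is firmly negative; 50 matches (2,500 raw, ~40% savings) roughly breaks even against one retry; 200 matches (10,000 raw, 70% savings) stays positive even after a 1,500-token retry.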

Why not disable grep filtering entirely? Because large results DO benefit from grouping. The user's rtk find saved 3.2M tokens at 73%. Filtering works when there's genuinely large output. The fix targets the specific range where it's counterproductive.

Clean truncation: the old behavior was fundamentally broken

The old code inserted " // ... 87 lines omitted" as a code comment mid-file. This is wrong for two independent reasons:

Language mismatch. The marker uses // comment syntax, but truncated files could be Ruby (# comments), Python (#), YAML (#), Shell (#), etc. An AI reading a Ruby file sees // ... 87 lines omitted and interprets it as invalid syntax, not a truncation marker.

Unpredictable placement. The old code scattered annotations throughout the output based on "structural importance" heuristics. An AI agent encounters these markers at unpredictable positions, breaking any line-by-line parsing logic.

The new [X more lines] format is:

  • Not valid syntax in any programming language (unambiguously metadata)
  • At the end only (predictable position)
  • Parseable by simple regex if needed

Why not truncate silently with no marker? An AI agent needs to know truncation occurred. Without a marker, it might assume it saw the full file and draw incorrect conclusions. [X more lines] communicates truncation without being confused with file content.
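Because the marker is suffix-only and fixed-form, a consumer can detect it with trivial string handling. A hypothetical helper (not part of RTK) illustrating the "parseable by simple regex" claim without even needing a regex:

```rust
/// Returns the omitted-line count if the output ends with a
/// `[X more lines]` marker, else None. Hypothetical helper, not RTK code.
fn parse_truncation_marker(output: &str) -> Option<usize> {
    let last = output.lines().last()?;
    last.strip_prefix('[')?
        .strip_suffix(" more lines]")?
        .parse()
        .ok()
}
```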

What was NOT changed (and why)

  • sed is already in IGNORED_PREFIXES — never rewritten by RTK. The user's sed issues are downstream effects of grep/cat problems.
  • cat → rtk read still uses FilterLevel::Minimal — the minimal filter (strips blank lines/comments) had 28.8% savings across 214 calls (~1.5M tokens saved). The user's complaint was about the // ... omitted annotations (fixed), not the filtering itself.
  • The 50-match threshold is hardcoded — a config option would be over-engineering for a reasonable default. If tuning is needed later, it's a one-line change.

Tests

  • test_smart_truncate_no_annotations — verifies no // ... markers in output
  • test_smart_truncate_no_truncation_when_under_limit — no truncation when content fits
  • test_smart_truncate_exact_limit — edge case at exact line count
  • test_rg_no_ignore_flag_accepted — verifies rg accepts the new flag

Test plan

  • cargo fmt --all && cargo clippy --all-targets && cargo test --all
  • Manual: rtk grep "fn run" src/ with <50 results outputs raw file:line:content format
  • Manual: rtk read src/main.rs --max-lines 5 shows clean truncation without // ... markers
  • Manual: verify grep finds files in .gitignored directories

Three fixes for issues causing AI agents to burn tokens on retry loops:

1. grep: add --no-ignore to rg invocation (src/grep_cmd.rs)
   rg respects .gitignore by default while grep -r does not. In large
   monorepos (~83K files), this caused rg to return 0 matches for files
   in gitignored directories, producing false negatives. AI agents then
   concluded files/methods didn't exist and drew wrong conclusions.

2. grep: passthrough for small result sets (src/grep_cmd.rs)
   Results with <=50 matches now output raw file:line:content format
   instead of the grouped "X matches in YF:" format. The grouped format
   confused AI agents which couldn't parse it, triggering 2-4 retry
   attempts per search (each burning 500-2000 tokens). The grouped
   format is preserved for >50 matches where token savings matter.

3. smart_truncate: clean truncation without annotations (src/filter.rs)
   Replaced the "smart" truncation that inserted synthetic
   "// ... N lines omitted" comment markers throughout file content.
   AI agents treated these as actual code, got confused, and retried
   with alternative commands. New behavior: clean first-N-lines
   truncation with "[X more lines]" at the end only.

Evidence from a real session (1,633 RTK commands):
- grep had lowest savings rate (9.3%) but highest retry cost
- ~15 retry patterns observed, each 2-4 extra tool calls
- ~10 false negative searches led to wrong analysis conclusions
- Estimated 200-500K tokens burned on retries
- Net token impact was negative for grep-heavy workflows

CLAassistant commented Mar 23, 2026

CLA assistant check
All committers have signed the CLA.

pszymkowiak added bug (Something isn't working), effort-medium (1-2 days, a few files), filter-quality (Filter produces incorrect/truncated signal) labels Mar 23, 2026
pszymkowiak (Collaborator) commented:

[w] wshm · Automated triage by AI

📊 Automated PR Analysis

🐛 Type bug-fix
🟡 Risk medium

Summary

Fixes three issues in grep and smart_truncate that caused AI agents to waste tokens on retry loops: adds --no-ignore to rg so gitignored files aren't silently skipped, passes through raw grep output for small result sets (<=50 matches) instead of a grouped format that confused AI parsers, and replaces synthetic '// ... N lines omitted' truncation markers with clean first-N-lines truncation plus a single '[X more lines]' suffix.

Review Checklist

  • Tests present
  • Breaking change
  • Docs updated

Analyzed automatically by wshm · This is an automated analysis, not a human review.

Update all docs referencing grep's output strategy to reflect the new
behavior: raw passthrough for <=50 matches, grouped format for >50.

Files updated: CLAUDE.md, README.md, README_{fr,es,ja,ko,zh}.md,
INSTALL.md, ARCHITECTURE.md, docs/AUDIT_GUIDE.md, docs/FEATURES.md
