test: F-RY-001 file-set validation coverage#71
test: F-RY-001 file-set validation coverage#71Nelson Spence (Fieldnote-Echo) wants to merge 1 commit intomainfrom
Conversation
…erage (F-RY-001) Add 4 integration tests exercising expected_rule_files through the run_review() retry loop. Proves that file-set validation (retry.py:102-105) triggers retries when LLM findings reference fabricated files, and accepts partial file overlap (set intersection semantics). Tests: - test_retry_on_fabricated_files: wrong file → retry → correct file - test_no_retry_when_files_match: correct file → single call - test_any_file_overlap_sufficient: boundary — partial overlap passes - test_warns_on_exhausted_retries_with_fabricated_files: all wrong → warning Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
project-navi-bot
left a comment
There was a problem hiding this comment.
All required CI checks passed. Auto-approved by navi-bot.
project-navi-bot
left a comment
There was a problem hiding this comment.
All required CI checks passed. Auto-approved by navi-bot.
| assert isinstance(result, GrippyReview) | ||
| assert agent.run.call_count == 2 | ||
|
|
||
|
|
There was a problem hiding this comment.
🟡 MEDIUM: Direct usage of 'VALID_REVIEW_DICT' placeholder may hinder test clarity
Confidence: 90%
The integration tests rely on a pre-existing 'VALID_REVIEW_DICT' variable to generate review findings for test cases. If the contents or shape of 'VALID_REVIEW_DICT' are edited elsewhere (not shown in this diff), it could impact all tests here, potentially masking breakages or yielding misleading positives.
Suggestion: Consider explicitly defining a minimal, static test review dictionary within this test module, or document/control its sourcing to ensure test isolation and future maintainability.
— Test fixtures hidden behind shared global state usually come back to haunt people during future refactors.
|
|
||
| # --- File-set validation retry integration (F-RY-001) --- | ||
|
|
||
|
|
There was a problem hiding this comment.
🔵 LOW: Test helper function lacks explicit doc coverage for parameter/return shapes
Confidence: 85%
The _review_dict_with_file_findings function builds synthetic review results but doesn't specify the structure of the expected input or returned value beyond the code sample. While not fatal for test coverage, enriching the docstring for future maintainers would improve code readability.
Suggestion: Extend the function docstring to clarify expected inputs and outputs for _review_dict_with_file_findings, or add example usages inline.
— A bit more doc here wouldn't hurt. Tests are only as clear as their helpers.
|
|
||
| Exercises the expected_rule_files parameter through the full retry loop, | ||
| proving that file-set validation (retry.py:102-105) triggers retries | ||
| and correction prompts when the LLM fabricates file references. |
There was a problem hiding this comment.
🔵 LOW: Synthetic tests do not include negative test for multiple fabricated file references
Confidence: 80%
While the tests cover fabricated single-file and multi-file overlap/partial match behaviors, there is no explicit test where findings reference multiple unexpected fabricated files. Adding such a case would tighten regression safety for file-set validation edge cases.
Suggestion: Add an additional test case where the findings field includes multiple fabricated files to confirm the retry branch still functions as intended.
— Edge-case negative tests don't win sprints, but they'll save debugging hours down the road.
✅ Grippy Review — PASSScore: 93/100 | Findings: 3 Delta: 3 new Commit: 7032909 |
project-navi-bot
left a comment
There was a problem hiding this comment.
All required CI checks passed. Auto-approved by navi-bot.
project-navi-bot
left a comment
There was a problem hiding this comment.
All required CI checks passed. Auto-approved by navi-bot.
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
project-navi-bot
left a comment
There was a problem hiding this comment.
All required CI checks passed. Auto-approved by navi-bot.
Summary
_validate_rule_coverage()file-set validation branch (retry.py:102-105)run_review()withexpected_rule_filesthrough the full retry loopAudit Context
Test plan
uv run pytest tests/test_grippy_retry.py -v— all passuv run pytest tests/ -v— full suite passes (1180 passed)🤖 Generated with Claude Code