@fmueller
Owner

Motivation

  • The feedback LLM was not reliably constrained by the --focus flag and needed to be guided to only evaluate requested categories while still surfacing critical issues from other categories.
  • Support multiple comma-separated focus categories so users can request a review across more than one category in a single run.

Description

  • Added FeedbackFocus.parse_list to parse comma-separated focus values and accept multiple normalized categories, and added ALL as a constant for completeness.
  • Propagated a list[str] | None focus type through prepare_context/FeedbackContext and the prompt-building flow so prompts receive the parsed list of categories directly.
  • Updated the user/system feedback prompts in prompts/feedback.py to include an In-scope categories line, per-category definitions limited to selected categories, and a Critical override instruction to always include high-severity issues from any category.
  • Adjusted the CLI help and parsing in feedback_cli.py to accept comma-separated values and pass the parsed list to prepare_context, and added a dry-run unit test test_feedback_focus_multiple_categories_limits_prompt to validate prompt rendering.
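
As a rough illustration of the comma-separated parsing described above, here is a minimal sketch. It is not the actual FeedbackFocus.parse_list implementation: the function name parse_focus_list, the ALLOWED tuple, and the category names in it are all hypothetical stand-ins, and the real error handling may differ.

```python
from typing import Optional

# Hypothetical category names, for illustration only; the project derives
# its allowed set from its own category definitions.
ALLOWED = ("seo", "structure", "clarity", "style", "evidence")


def parse_focus_list(raw: Optional[str]) -> Optional[list[str]]:
    """Parse a comma-separated focus value into normalized category names.

    Returns None when no focus is given, which the caller can treat as
    "review all categories".
    """
    if raw is None or not raw.strip():
        return None
    parsed: list[str] = []
    for part in raw.split(","):
        name = part.strip().lower()  # normalize whitespace and case
        if not name:
            continue  # tolerate stray commas such as "seo,,style"
        if name not in ALLOWED:
            raise ValueError(f"Unknown focus category: {name!r}")
        if name not in parsed:  # drop duplicates, keep first-seen order
            parsed.append(name)
    return parsed or None
```

Under these assumptions, a value such as " SEO, clarity ,seo" would yield the two categories seo and clarity in that order, while an unknown name raises an error instead of being silently ignored.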

Testing

  • Ran uv run pytest and all unit tests passed (154 passed).
  • Ran uv run ruff check and linting passed (All checks passed!).
  • Ran uv run mypy, which failed with type errors caused by missing torch stubs in src/scribae/translate/mt.py; this is a pre-existing environment issue, not one introduced by these changes.

Codex Task

fmueller and others added 10 commits January 23, 2026 14:06
- Move CATEGORY_DEFINITIONS to prompts/feedback_categories.py as
  single source of truth
- Remove unused FeedbackFocus.ALL constant and dead from_raw() method
- Add FeedbackFocus.ALLOWED derived from CATEGORY_DEFINITIONS keys
- Update imports in feedback.py and prompts/feedback.py

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When --focus is provided, the JSON schema's findings.category field now
only lists the selected categories plus "other" (for critical overrides)
instead of all categories. This reduces LLM confusion.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
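
The schema narrowing described in the commit above could look roughly like the following sketch. The helper names category_enum and findings_schema, the schema layout, and every category name except "other" are assumptions made for illustration; the project's real schema is not shown in this conversation.

```python
from typing import Any, Optional

# Hypothetical full category list; only "other" is taken from the commit message.
ALL_CATEGORIES = ["seo", "structure", "clarity", "style", "evidence", "other"]


def category_enum(focus: Optional[list[str]]) -> list[str]:
    """Return the category values the model is allowed to emit."""
    if not focus:
        return list(ALL_CATEGORIES)  # no focus: expose every category
    selected = [c for c in ALL_CATEGORIES if c in focus and c != "other"]
    # "other" stays available so critical out-of-scope issues have a home.
    return selected + ["other"]


def findings_schema(focus: Optional[list[str]]) -> dict[str, Any]:
    """Build a JSON-schema fragment for one finding (illustrative layout)."""
    return {
        "type": "object",
        "properties": {
            "category": {"type": "string", "enum": category_enum(focus)},
            "message": {"type": "string"},
        },
        "required": ["category", "message"],
    }
```

Keeping "other" in the enum is what lets the critical-override instruction work without reopening the full category list.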
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add explicit negative instruction: "ONLY use category values listed
  in the schema. DO NOT use any other category values."
- Add _normalize_finding_categories() to remap any out-of-scope
  categories to "other" after LLM returns
- Add tests for the normalization function

This belt-and-suspenders approach ensures LLMs cannot output categories
outside the focus scope, even if they hallucinate from training data.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
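
A minimal sketch of the post-processing step from the commit above, assuming findings are plain dicts with a "category" key. The real _normalize_finding_categories() presumably works on the project's own finding model, so the function name without the underscore, the signature, and the data shape here are hypothetical.

```python
from typing import Any, Optional


def normalize_finding_categories(
    findings: list[dict[str, Any]], focus: Optional[list[str]]
) -> list[dict[str, Any]]:
    """Remap any category outside the focus scope to "other".

    Returns new dicts for remapped findings and leaves the input untouched.
    """
    if not focus:
        return findings  # no focus set: nothing is out of scope
    allowed = set(focus) | {"other"}
    normalized: list[dict[str, Any]] = []
    for finding in findings:
        if finding.get("category") in allowed:
            normalized.append(finding)
        else:
            # Out-of-scope (or missing) category: keep the finding, relabel it.
            normalized.append({**finding, "category": "other"})
    return normalized
```

Relabeling instead of dropping means a critical issue the model filed under the wrong category still reaches the user.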
Cast category string to Any to satisfy mypy's Literal type
requirement for FeedbackFinding.category.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
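
For readers unfamiliar with the pattern in the commit above, this sketch shows why a cast is needed when a plain str is assigned to a Literal-typed field. The Finding dataclass and its category values are hypothetical stand-ins for FeedbackFinding, whose definition is not shown here.

```python
from dataclasses import dataclass
from typing import Any, Literal, cast

# Hypothetical Literal type; the project's category values may differ.
Category = Literal["seo", "clarity", "other"]


@dataclass
class Finding:
    category: Category
    message: str


def make_finding(category: str, message: str) -> Finding:
    # mypy rejects a plain str where a Literal is expected. cast() is a
    # no-op at runtime and only tells the type checker to accept the value.
    return Finding(category=cast(Any, category), message=message)
```

Because cast() performs no runtime validation, this pattern is only safe when the value has already been checked, for example by a normalization step that runs first.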
Add local mypy hook that runs via uv to use project dependencies.
Fix pre-existing torch import type errors in mt.py with ignore comments.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add torch to mypy ignore_missing_imports override and use cast
instead of type: ignore to avoid unused-ignore errors in CI.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Remove torch-specific mypy workarounds since torch is now required
for development. Update documentation to clarify that --all-extras
is mandatory for mypy to pass.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Rename SelectedOutlineRange to SectionsUnderReview for clarity
- Omit line entirely when all sections are selected (redundant)
- Add explicit instruction for brief alignment checking

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@fmueller fmueller merged commit c0f40c7 into main Jan 23, 2026
3 checks passed
@fmueller fmueller deleted the codex/fix-focus-parameter-for-feedback-command branch January 23, 2026 16:08