
Stabilize optimizer scoring and rubric consistency checks #257

Merged
endymion merged 7 commits into develop from bugfix/plx-e4f63d-optimizer-candidates-unfeatured on Apr 30, 2026

@endymion
Contributor

Summary

  • Prevent optimizer/test-created ScoreVersions from being marked featured by default.
  • Add a score/rubric consistency preflight that compares a ScoreVersion's code against its own rubric, stores feedback-evaluation results, and exposes the command as plexus score contradictions.
  • Improve optimizer baseline feedback evaluations so they request the consistency preflight.
  • Improve evaluation/RCA dashboard filtering so category drill-downs can match score result IDs, item IDs, feedback item IDs, metadata IDs, and item identifiers.
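
The multi-identifier matching described in the last bullet can be sketched roughly as follows. This is a minimal illustration in Python, not the dashboard's actual code (which is TypeScript); the field names (`id`, `item_id`, `feedback_item_id`, `metadata`, `identifiers`) and both helper functions are assumptions made for the example, since the real result schema is not shown in this PR.

```python
from typing import Any, Iterable, Mapping

# Hypothetical field names; the real score-result schema may differ.
TOP_LEVEL_ID_FIELDS = ("id", "item_id", "feedback_item_id")
METADATA_ID_FIELDS = ("item_id", "feedback_item_id")


def candidate_ids(result: Mapping[str, Any]) -> set[str]:
    """Collect every identifier under which a result could be referenced."""
    ids: set[str] = set()
    # Score result ID, item ID, and feedback item ID at the top level.
    for field in TOP_LEVEL_ID_FIELDS:
        if result.get(field):
            ids.add(str(result[field]))
    # IDs that only appear inside the metadata blob.
    metadata = result.get("metadata") or {}
    for field in METADATA_ID_FIELDS:
        if metadata.get(field):
            ids.add(str(metadata[field]))
    # External item identifiers, assumed here to be {"name", "value"} pairs.
    for identifier in result.get("identifiers") or []:
        if identifier.get("value"):
            ids.add(str(identifier["value"]))
    return ids


def filter_results(
    results: Iterable[Mapping[str, Any]], wanted_ids: Iterable[Any]
) -> list[Mapping[str, Any]]:
    """Keep results that share at least one identifier with the drill-down."""
    wanted = {str(w) for w in wanted_ids}
    return [r for r in results if not wanted.isdisjoint(candidate_ids(r))]


results = [
    {"id": "sr-1", "item_id": "item-1"},
    {"id": "sr-2", "metadata": {"feedback_item_id": "fb-9"}},
    {"id": "sr-3", "identifiers": [{"name": "form", "value": "F-42"}]},
    {"id": "sr-4", "item_id": "item-4"},
]
matched = filter_results(results, ["item-1", "fb-9", "F-42"])
print([r["id"] for r in matched])  # ['sr-1', 'sr-2', 'sr-3']
```

Normalizing every candidate to a string set keeps the match independent of which kind of ID the category drill-down happens to carry.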

Validation

  • pytest plexus/score_rubric_consistency_test.py plexus/cli/score/scores_test.py plexus/cli/procedure/test_feedback_alignment_optimizer_config.py MCP/tools/evaluation/evaluations_test.py -q -> 71 passed
  • flake8 plexus/score_rubric_consistency.py plexus/cli/score/scores.py plexus/cli/item/items.py plexus/cli/evaluation/evaluations.py MCP/tools/evaluation/evaluations.py -> passed
  • cd dashboard && npm test -- --runTestsByPath components/__tests__/EvaluationTask.category-filter.test.tsx --runInBand -> 5 passed
  • kbs validate -> passed
  • Real smoke: plexus score contradictions --scorecard "SelectQuote HCS Medium-Risk" --score "Medication Review: Dosage" --version d730e6a8-164b-4157-b19b-47beb00d7b2e --format json returned potential_conflict.
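
For scripting the smoke check, the invocation above can be assembled as an argument list so that names containing spaces or colons are passed intact. A minimal sketch: the subcommand and flags are copied from the smoke command above, while the helper name `build_contradictions_command` is hypothetical, and nothing is assumed about the shape of the JSON output.

```python
import shlex


def build_contradictions_command(
    scorecard: str, score: str, version: str, output_format: str = "json"
) -> list[str]:
    """Return argv for the consistency preflight call used in the smoke test."""
    return [
        "plexus", "score", "contradictions",
        "--scorecard", scorecard,
        "--score", score,
        "--version", version,
        "--format", output_format,
    ]


argv = build_contradictions_command(
    "SelectQuote HCS Medium-Risk",
    "Medication Review: Dosage",
    "d730e6a8-164b-4157-b19b-47beb00d7b2e",
)
# shlex.join renders the list as a copy-pasteable shell command.
print(shlex.join(argv))
```

The list can be passed directly to `subprocess.run(argv, ...)` without a shell, which avoids quoting problems entirely.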

Notes

  • npm run typecheck initially failed on a corrupted generated file, .next/dev/types/routes.d.ts. After clearing .next, the rerun went for several minutes without completing and was stopped, so the typecheck was not verified locally. CI should run it in a clean environment.

@endymion endymion requested a review from a team as a code owner April 30, 2026 00:01
@endymion endymion requested review from dereknorrbom and removed request for a team April 30, 2026 00:01
@endymion endymion merged commit 63184db into develop Apr 30, 2026
3 of 4 checks passed
@endymion endymion deleted the bugfix/plx-e4f63d-optimizer-candidates-unfeatured branch April 30, 2026 00:12