dtsong (Owner) commented on Jan 24, 2026

Closes #42

Changes

  • dashboard/src/app/evals/page.tsx - Eval run history with filtering/sorting, summary cards, and charts
  • dashboard/src/components/EvalChart.tsx - Precision/recall trends, cost efficiency, issue category breakdown, confidence calibration
  • dashboard/src/components/ModelComparison.tsx - Quality vs cost scatter, model comparison bars, summary table with recommendations
  • dashboard/src/lib/supabase.ts - Added EvalRun and EvalIssueBreakdown interfaces
  • dashboard/src/app/page.tsx - Added nav link to evals page
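
As a rough illustration of the run-history filtering and sorting listed above, here is a minimal sketch. The PR description does not show the real `EvalRun` interface from dashboard/src/lib/supabase.ts, so every field name below is an assumption, and `RunFilter`, `SortKey`, and `filterAndSortRuns` are hypothetical names used only for this example.

```typescript
// Hypothetical shape: the actual EvalRun interface is not shown in this PR
// description, so these field names are assumptions for illustration only.
interface EvalRun {
  id: string;
  model: string;
  evalSuite: string;
  precision: number;
  recall: number;
  f1: number;
  costUsd: number;
  createdAt: string; // assumed to be an ISO 8601 timestamp in UTC
}

// Columns the table can be sorted by (hypothetical).
type SortKey = "precision" | "recall" | "f1" | "costUsd" | "createdAt";

// Optional filters; an undefined field means "do not filter on this".
interface RunFilter {
  model?: string;
  evalSuite?: string;
}

// Returns a new array: runs matching the filter, ordered by the chosen column.
function filterAndSortRuns(
  runs: readonly EvalRun[],
  filter: RunFilter,
  sortKey: SortKey,
  descending: boolean = true,
): EvalRun[] {
  const filtered = runs.filter(
    (run) =>
      (filter.model === undefined || run.model === filter.model) &&
      (filter.evalSuite === undefined || run.evalSuite === filter.evalSuite),
  );

  // Copy before sorting so the caller's array is never mutated.
  return [...filtered].sort((a, b) => {
    const left = a[sortKey];
    const right = b[sortKey];
    let cmp: number;
    if (typeof left === "number" && typeof right === "number") {
      cmp = left - right;
    } else {
      // Same-format ISO timestamps order correctly as plain strings.
      const l = String(left);
      const r = String(right);
      cmp = l < r ? -1 : l > r ? 1 : 0;
    }
    return descending ? -cmp : cmp;
  });
}
```

In a React page, a function like this would typically run inside `useMemo` keyed on the runs, the active filters, and the sort column, so the table only recomputes when one of them changes.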

Features

  • Eval run history table with sortable columns and filters (model, eval suite)
  • Precision/Recall/F1 trend charts over time
  • Cost efficiency analysis (cost per case vs F1)
  • Confidence calibration chart (predicted confidence vs actual accuracy)
  • Per-category issue performance breakdown
  • Model comparison: bar chart, quality vs cost scatter, summary table with recommendations
  • Mobile-responsive grid layout
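
The metrics behind these charts can be sketched as follows. This is an illustrative version under stated assumptions, not the code in this PR: `ConfusionCounts`, `Prediction`, `precisionRecallF1`, `costPerCase`, and `calibrationBins` are hypothetical names, and the ten equal-width confidence bins are an assumed default rather than a documented choice.

```typescript
// Hypothetical input types; the PR description does not show the real ones.
interface ConfusionCounts {
  truePositives: number;
  falsePositives: number;
  falseNegatives: number;
}

interface Prediction {
  confidence: number; // predicted confidence in [0, 1]
  correct: boolean;   // whether the prediction turned out to be right
}

interface CalibrationBin {
  lower: number;          // inclusive lower edge of the confidence bin
  upper: number;          // upper edge of the confidence bin
  meanConfidence: number; // average predicted confidence in the bin
  accuracy: number;       // fraction of predictions in the bin that were correct
  count: number;
}

// Standard definitions; each ratio falls back to 0 when its denominator is 0.
function precisionRecallF1(c: ConfusionCounts): { precision: number; recall: number; f1: number } {
  const predicted = c.truePositives + c.falsePositives;
  const actual = c.truePositives + c.falseNegatives;
  const precision = predicted === 0 ? 0 : c.truePositives / predicted;
  const recall = actual === 0 ? 0 : c.truePositives / actual;
  const f1 = precision + recall === 0 ? 0 : (2 * precision * recall) / (precision + recall);
  return { precision, recall, f1 };
}

// Cost efficiency input: average spend per evaluated case.
function costPerCase(totalCostUsd: number, caseCount: number): number {
  return caseCount === 0 ? 0 : totalCostUsd / caseCount;
}

// Groups predictions into equal-width confidence bins and compares the mean
// predicted confidence with the observed accuracy. Empty bins are dropped.
function calibrationBins(predictions: readonly Prediction[], binCount: number = 10): CalibrationBin[] {
  const sums = Array.from({ length: binCount }, () => ({ confidence: 0, correct: 0, count: 0 }));
  for (const p of predictions) {
    // Clamp so that a confidence of exactly 1.0 lands in the last bin.
    const index = Math.min(binCount - 1, Math.max(0, Math.floor(p.confidence * binCount)));
    sums[index].confidence += p.confidence;
    sums[index].correct += p.correct ? 1 : 0;
    sums[index].count += 1;
  }
  return sums
    .map((s, i) => ({
      lower: i / binCount,
      upper: (i + 1) / binCount,
      meanConfidence: s.count === 0 ? 0 : s.confidence / s.count,
      accuracy: s.count === 0 ? 0 : s.correct / s.count,
      count: s.count,
    }))
    .filter((bin) => bin.count > 0);
}
```

On a calibration chart, a well-calibrated model has points near the diagonal, where `meanConfidence` equals `accuracy`. Plotting `costPerCase` against `f1` for each run gives the cost-efficiency view.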

Testing

  • TypeScript compiles with tsc --noEmit (zero errors)
  • Follows existing dashboard patterns (Recharts, inline styles, Supabase client)

🤖 Generated with Claude Code

…omparison

Implements #42

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
dtsong merged commit b3dec29 into main on Jan 24, 2026
2 checks passed
dtsong deleted the feat/42-eval-insights-dashboard branch on January 24, 2026 20:26
Linked issue: Eval insights dashboard (#42)