feat(leaderboard): ghost performance leaderboard and A/B testing by Enreign · Pull Request #79 · emberloom/sparks

Enreign · 2026-03-16T23:54:13Z

Description

Adds a performance leaderboard for ghost profiles with SQLite-backed tracking and A/B testing support.

Changes

src/leaderboard.rs — GhostLeaderboard with outcome recording, rankings, A/B routing, auto-promotion recommendations, and head-to-head comparison
src/config.rs — LeaderboardConfig with ab_test_ghost, ab_test_fraction, min_samples_for_recommendation, promotion_threshold
src/main.rs — sparks leaderboard [show|compare <a> <b>|reset]
config.example.toml — [leaderboard] section
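For orientation, the config fields listed above could look roughly like the struct below. This is a minimal sketch, not the code from src/config.rs: the field types and the use of `Option<String>` to disable A/B testing are assumptions, and the defaults are the ones stated in this description (10%, 50 samples, 10%).

```rust
/// Hypothetical shape of the `[leaderboard]` config section.
/// Field names come from the PR description; the types are assumed.
#[derive(Debug, Clone, PartialEq)]
pub struct LeaderboardConfig {
    /// Challenger ghost for A/B testing; `None` is assumed to disable routing.
    pub ab_test_ghost: Option<String>,
    /// Fraction of requests sent to the challenger, in [0.0, 1.0].
    pub ab_test_fraction: f64,
    /// Minimum recorded tasks before a promotion recommendation is made.
    pub min_samples_for_recommendation: u64,
    /// Margin by which the challenger must beat the control.
    pub promotion_threshold: f64,
}

impl Default for LeaderboardConfig {
    fn default() -> Self {
        Self {
            ab_test_ghost: None,
            ab_test_fraction: 0.10,
            min_samples_for_recommendation: 50,
            promotion_threshold: 0.10,
        }
    }
}
```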

Features

Leaderboard

Ranks all ghost profiles by composite score: 60% success rate + 20% user rating + 20% token efficiency.
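The weighting can be sketched as follows. Only the 60/20/20 weights come from this PR; the `GhostStats` struct, the `TOKEN_BUDGET` constant, and the way the rating and token usage are normalized to [0, 1] are assumptions made for illustration and may differ from the actual `rank_score` in src/leaderboard.rs.

```rust
/// Weights stated in the PR description.
const W_SUCCESS: f64 = 0.6;
const W_RATING: f64 = 0.2;
const W_TOKENS: f64 = 0.2;

/// Assumed reference budget used to normalize token usage (hypothetical).
const TOKEN_BUDGET: f64 = 10_000.0;

/// Hypothetical aggregate stats for one ghost profile.
pub struct GhostStats {
    pub total_tasks: u64,
    pub successful_tasks: u64,
    /// Mean of per-task user ratings, each -1, 0, or 1.
    pub avg_user_rating: f64,
    /// Mean tokens consumed per task.
    pub avg_tokens: f64,
}

/// Composite score in [0, 1]; a ghost with no recorded tasks scores 0.
pub fn rank_score(s: &GhostStats) -> f64 {
    if s.total_tasks == 0 {
        return 0.0;
    }
    let success_rate = s.successful_tasks as f64 / s.total_tasks as f64;
    // Map the mean rating from [-1, 1] onto [0, 1] (assumed normalization).
    let rating = (s.avg_user_rating.clamp(-1.0, 1.0) + 1.0) / 2.0;
    // Fewer tokens means higher efficiency; zero tokens yields 1.0 (assumed).
    let efficiency = (1.0 - s.avg_tokens / TOKEN_BUDGET).clamp(0.0, 1.0);
    W_SUCCESS * success_rate + W_RATING * rating + W_TOKENS * efficiency
}
```

Under these assumptions, a ghost with 8 of 10 tasks successful, a mean rating of 0.5, and 5,000 tokens per task scores 0.6 × 0.8 + 0.2 × 0.75 + 0.2 × 0.5 = 0.73.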

A/B Testing

Routes ab_test_fraction (default: 10%) of requests to a challenger ghost (ab_test_ghost), then automatically recommends promotion once the challenger outperforms the control by promotion_threshold (default: 10%) over at least min_samples_for_recommendation (default: 50) tasks.
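The routing and promotion rules might be sketched like this. Both function signatures are hypothetical, not the API added in this PR. Two assumptions are worth calling out: the random draw is passed in as `roll` so the logic stays deterministic and testable, and "outperforms by promotion_threshold" is read as an absolute margin on a 0 to 1 score, which the description does not spell out.

```rust
/// Pick the ghost for one request.
/// `roll` is a uniform sample in [0, 1) supplied by the caller.
/// With no challenger configured, every request goes to the control.
pub fn ab_route<'a>(
    control: &'a str,
    challenger: Option<&'a str>,
    fraction: f64,
    roll: f64,
) -> &'a str {
    match challenger {
        // fraction = 1.0 routes everything to the challenger, 0.0 nothing.
        Some(c) if roll < fraction => c,
        _ => control,
    }
}

/// Recommend promotion once the challenger has enough samples and leads
/// the control by at least `threshold` (absolute margin, assumed).
pub fn should_promote(
    control_score: f64,
    challenger_score: f64,
    challenger_samples: u64,
    min_samples: u64,
    threshold: f64,
) -> bool {
    challenger_samples >= min_samples
        && challenger_score - control_score >= threshold
}
```

Taking the draw as a parameter keeps the sketch free of external crates; a real caller would feed it from whatever random number generator the project already uses.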

Comparison

sparks leaderboard compare coder architect

Type of Change

New feature

Pre-PR Checklist

cargo check -q passes
cargo test -q passes (398 tests, 0 failed)

- GhostLeaderboard: SQLite-backed outcome tracking per ghost profile
- TaskOutcome: records success, latency, token usage, user rating (-1/0/1)
- GhostMetrics: aggregate stats with composite rank_score (60% success, 20% user rating, 20% token efficiency)
- A/B testing: route configurable fraction of requests to a challenger ghost
- Auto-promotion: recommend promoting challenger when it outperforms control by promotion_threshold (default: 10%) over min_samples (default: 50)
- ASCII leaderboard with star ratings and head-to-head comparison
- 'sparks leaderboard [show|compare <a> <b>|reset]' CLI subcommand
- 7 unit tests covering recording, ranking, A/B routing, promotion check

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Add TaskOutcome::new() convenience constructor to reduce struct literal verbosity in tests
- Fix compare() output: add column headers (ghost names + separator) so each column is identifiable
- Use successful_tasks in format_row() ("{success}/{total}") to eliminate dead-field warning
- Add four missing tests: ab_route with fraction=1.0, rank_score with zero tokens, reset(), and format_leaderboard with data
- Add [leaderboard] section to config.example.toml with all fields documented

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Enreign force-pushed the feat/ghost-leaderboard branch 2 times, most recently from 3d7ee0d to d6578ff Compare March 18, 2026 22:11

Enreign and others added 2 commits March 18, 2026 23:14

Enreign force-pushed the feat/ghost-leaderboard branch from d6578ff to 7f76e4f Compare March 18, 2026 22:14

Enreign merged commit 0d066b4 into main Mar 18, 2026
4 checks passed

Enreign deleted the feat/ghost-leaderboard branch March 18, 2026 22:15

Enreign merged 2 commits into main from feat/ghost-leaderboard
