feat: Add multi-language translation leaderboard system#65

Closed
Josh9281 wants to merge 15 commits into main from leaderboard-on-main

@Josh9281
Collaborator

No description provided.

- Add core leaderboard components (Leaderboard.js, SubmitToLeaderboard.js)
- Add leaderboard API endpoints (get_leaderboard)
- Add evaluation functions (BLEU, BERTScore)
- Add test files and documentation in docs/leaderboard/
- Add minimal required dependencies
- No changes to existing DevOps, agents, or infrastructure
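The BLEU evaluation function mentioned above is not shown in this conversation. As a rough illustration of what such a metric computes, here is a minimal sketch of single-reference sentence BLEU in plain Python. It assumes whitespace tokenization, uniform n-gram weights, and no smoothing; the function names `ngrams` and `sentence_bleu` are hypothetical and are not taken from the PR, whose commit notes mention NLTK instead.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(reference, hypothesis, max_n=4):
    """Simplified single-reference BLEU: uniform weights, no smoothing.

    Assumption: both strings are tokenized by whitespace, which is not
    adequate for languages written without spaces.
    """
    ref, hyp = reference.split(), hypothesis.split()
    if not hyp:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        hyp_counts = ngrams(hyp, n)
        ref_counts = ngrams(ref, n)
        # Clip each hypothesis n-gram count by its count in the reference.
        overlap = sum(min(count, ref_counts[gram]) for gram, count in hyp_counts.items())
        total = sum(hyp_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # without smoothing, any zero precision yields BLEU 0
        log_precision_sum += math.log(overlap / total)
    # Brevity penalty: penalize hypotheses shorter than the reference.
    brevity = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / len(hyp))
    return brevity * math.exp(log_precision_sum / max_n)
```

Because there is no smoothing, short hypotheses with no matching 4-gram score 0, so a production evaluator would likely rely on a library implementation with a smoothing option rather than this sketch.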
@nv78

This comment was marked as outdated.

@Josh9281 force-pushed the leaderboard-on-main branch from 095de0d to 00bdef4 on August 31, 2025 22:11
- Remove hardcoded dataset sorting, use dynamic Object.values approach
- Fix backend port mapping to 5001 to avoid macOS AirPlay conflict
- Update frontend API URLs to use environment variables
- Add missing /public/submit_model endpoint to backend
- Create missing test files for all language/metric combinations
- Update documentation to reflect port change from 3001 to 3000
- Fix CORS configuration for proper frontend-backend communication
- Remove leaderboard route from RouteConstants.js and Dashboard.js
- Delete Leaderboard.js component completely
- Move FAQs section to Evaluations.js
- Remove 'View all models' button - now shows all models directly
- Simplify to single /evaluations page with all functionality
- Show only top 5 models by default to keep UI clean
- Add expand/collapse functionality to view all models in-place
- Remove messy full model list display
- Maintain original clean UI design
- Add back /leaderboard route and simple Leaderboard.js component
- Evaluations page shows only top 5 models (clean UI)
- 'View all X models →' button navigates to dedicated leaderboard page
- Leaderboard page shows full list for selected dataset only
- FAQs remain on evaluations page
- Maintains clean UI separation
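The display rule these notes converge on (top 5 models on the evaluations page, full list on the dedicated leaderboard page) can be sketched as follows. The PR's frontend is JavaScript; this is only a Python illustration of the logic, and the `score` field and the function names `visible_models` and `view_all_label` are assumptions, not identifiers from the PR.

```python
def visible_models(models, expanded=False, limit=5):
    """Return the models to display, ranked by descending score.

    Assumption: each model is a dict with a numeric "score" key.
    """
    ranked = sorted(models, key=lambda m: m["score"], reverse=True)
    return ranked if expanded else ranked[:limit]


def view_all_label(models, limit=5):
    """Return the 'View all X models →' label, or None when nothing is hidden."""
    return f"View all {len(models)} models →" if len(models) > limit else None
```

For example, with seven submitted models the evaluations page would show the five highest-scoring ones plus a "View all 7 models →" link, while a dataset with three models would show all of them and no link.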
@Josh9281 requested a review from birongliu on September 1, 2025 20:21
- Add benchmark_datasets table with translation datasets
- Add model_submissions table for storing model results
- Add evaluation_results table for storing scores
- Insert initial dataset entries for Spanish, Arabic, Japanese, Chinese, Korean (BLEU + BERTScore)
- Add proper indexes for performance

Fixes: Table 'agents.benchmark_datasets' doesn't exist error
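The actual schema is not included in this conversation. The sketch below shows one way the three tables and the ten initial dataset rows could be laid out. It uses Python's built-in `sqlite3` with an in-memory database so that it is self-contained; the project's real database is presumably a different engine (the error message above has the `database.table` form typical of MySQL). Every column name, index name, and dataset name here is hypothetical; only the three table names, the five languages, and the two metrics come from the commit notes.

```python
import sqlite3

# Hypothetical columns; only the table names come from the commit notes.
SCHEMA = """
CREATE TABLE benchmark_datasets (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE,
    language TEXT NOT NULL,
    evaluation_metric TEXT NOT NULL
);
CREATE TABLE model_submissions (
    id INTEGER PRIMARY KEY,
    model_name TEXT NOT NULL,
    dataset_id INTEGER NOT NULL REFERENCES benchmark_datasets(id)
);
CREATE TABLE evaluation_results (
    id INTEGER PRIMARY KEY,
    submission_id INTEGER NOT NULL REFERENCES model_submissions(id),
    score REAL NOT NULL
);
CREATE INDEX idx_submissions_dataset ON model_submissions(dataset_id);
CREATE INDEX idx_results_submission ON evaluation_results(submission_id);
"""

LANGUAGES = ["spanish", "arabic", "japanese", "chinese", "korean"]
METRICS = ["bleu", "bertscore"]


def create_database():
    """Create an in-memory database with the three tables and seed datasets."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # One dataset row per language/metric pair: 5 x 2 = 10 rows.
    rows = [(f"{lang}_{metric}", lang, metric) for lang in LANGUAGES for metric in METRICS]
    conn.executemany(
        "INSERT INTO benchmark_datasets (name, language, evaluation_metric) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    return conn
```

Seeding the dataset rows in the same step that creates the tables means a freshly created database already lists every language and metric pair before any model has been submitted.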
- Add evaluation_details column to evaluation_results table
- Fix evaluation_metric values to be lowercase (bleu, bertscore)
- Verify model submission and leaderboard API working correctly
- Database tables now properly created and functional
- Add NLTK punkt download to Dockerfile for BLEU calculation
- Add curl to Dockerfile for healthcheck functionality
- Make OpenAI/Stripe API key initialization more robust with fallbacks
- Add comprehensive .env example file (env-example.txt)
- Update documentation with troubleshooting section
- Add specific guidance for new project setup and database recreation
- Prevent startup failures when API keys are missing
- Update backend API to include empty datasets with is_empty flag
- Update frontend to handle and display empty datasets with 'No submissions yet' message
- Ensure all 10 leaderboards (Spanish, Arabic, Japanese, Chinese, Korean × BLEU/BERTScore) are always visible
- Remove placeholder Google Scripts URL that was never implemented
- Clean up non-functional code that was creating unnecessary network requests
- Keep only the working backend API submission functionality
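The commit notes above describe the backend returning every dataset, including those with no submissions, marked with an `is_empty` flag. A minimal sketch of that behavior, assuming submissions arrive as dicts with hypothetical `language`, `metric`, and `score` keys (the function name `build_leaderboards` and the response shape are also assumptions, apart from the `is_empty` flag itself):

```python
LANGUAGES = ["Spanish", "Arabic", "Japanese", "Chinese", "Korean"]
METRICS = ["bleu", "bertscore"]


def build_leaderboards(submissions):
    """Group submissions by (language, metric), keeping datasets with none."""
    boards = []
    for language in LANGUAGES:
        for metric in METRICS:
            # Match metric names case-insensitively and rank by descending score.
            models = sorted(
                (s for s in submissions
                 if s["language"] == language and s["metric"].lower() == metric),
                key=lambda s: s["score"],
                reverse=True,
            )
            boards.append({
                "language": language,
                "metric": metric,
                "models": models,
                "is_empty": not models,  # frontend can show 'No submissions yet'
            })
    return boards
```

Iterating over the fixed language and metric lists, rather than over the submissions, is what guarantees that all 10 leaderboards appear in the response even when most have no entries.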
@nv78
Collaborator

nv78 commented Oct 6, 2025

This code isn't relevant to this GitHub repo.

@nv78 closed this on Oct 6, 2025

2 participants