LLM Eval Viewer

This project was built entirely with AI assistance (Baidu Comate IDE + Claude Opus 4.6).

LLM Eval Viewer is a lightweight web tool for visualizing LLM evaluation results.
It currently supports the result formats generated by evalscope.

Live Demo:
https://dynamicheart.github.io/llm-eval-viewer/

Features

- Multi-format support: Evalscope Predictions and Reviews evaluation results
- Directory browsing: Select a directory to auto-scan its structure and quickly switch between experiments and datasets (Chrome/Edge only)
- Statistics & distribution: Token histogram, result and finish-reason distributions, and per-dataset accuracy, all interactive with click-to-filter
- Reasoning support: Displays reasoning content marked as [R]; Text and Reasoning can be viewed separately
- Dark mode: Light / Dark / Auto themes, with Auto following the system preference
- cURL export: Generate cURL commands from request details for quick API replay
- i18n: English and Chinese language support
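
To illustrate the idea behind the cURL export feature, here is a minimal sketch of turning request details into a replayable command. The `RequestDetails` shape and the `toCurl` and `shellQuote` helpers are hypothetical names chosen for this example; they are not taken from the viewer's source, and the actual implementation may differ.

```typescript
// Hypothetical shape of a captured request. The field names are assumptions
// for illustration, not the viewer's actual data model.
interface RequestDetails {
  url: string;
  method?: string;
  headers?: Record<string, string>;
  body?: unknown;
}

// Quote a value for a POSIX shell: wrap it in single quotes and
// rewrite each embedded single quote as '\''.
function shellQuote(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

// Build a single-line curl command. The method defaults to POST,
// which is an assumption suited to chat-completion style APIs.
function toCurl(req: RequestDetails): string {
  const parts: string[] = ["curl", "-X", req.method ?? "POST", shellQuote(req.url)];
  for (const [name, value] of Object.entries(req.headers ?? {})) {
    parts.push("-H", shellQuote(`${name}: ${value}`));
  }
  if (req.body !== undefined) {
    // Non-string bodies are serialized as JSON before quoting.
    const payload = typeof req.body === "string" ? req.body : JSON.stringify(req.body);
    parts.push("--data-raw", shellQuote(payload));
  }
  return parts.join(" ");
}
```

Quoting every argument with single quotes keeps JSON payloads intact when the command is pasted into a POSIX shell; other shells such as Windows `cmd` would need different escaping.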

Example Files (evalscope)

You can use the following example files to try the viewer locally or in the live demo:

Predictions
- math_500_level_1_predictions.jsonl
Reviews
- math_500_level_1_reviews.jsonl
Predictions (with reasoning)
- humaneval_predictions_with_reasoning.jsonl
Reviews (with reasoning)
- humaneval_reviews_with_reasoning.jsonl
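
The example files use the `.jsonl` extension, which conventionally means one JSON value per line. As a rough sketch of how such a file could be read, the hypothetical `parseJsonl` helper below parses the text line by line and records malformed lines instead of aborting. It makes no assumption about the fields evalscope writes, and it is not the viewer's own loader.

```typescript
// Result of parsing JSON Lines text: parsed records plus any per-line errors.
interface JsonlResult {
  records: unknown[];
  errors: { line: number; message: string }[];
}

// Parse JSON Lines text: one JSON value per non-empty line.
// Blank lines are skipped; malformed lines are reported with 1-based line numbers.
function parseJsonl(text: string): JsonlResult {
  const result: JsonlResult = { records: [], errors: [] };
  text.split(/\r?\n/).forEach((raw, index) => {
    const line = raw.trim();
    if (line === "") return;
    try {
      result.records.push(JSON.parse(line));
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      result.errors.push({ line: index + 1, message });
    }
  });
  return result;
}
```

Collecting errors per line lets a viewer still show the valid records from a partially written or truncated results file.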

Screenshots

Reviews View (Dark Mode)

Predictions View (Light Mode)

Development

cd llm-eval-viewer
npm install
npm run dev

Build

npm run build

Built with AI

This project was developed entirely through AI-assisted programming using Baidu Comate IDE with Claude Opus 4.6 as the agent model. From architecture design to implementation, all code was generated and iterated via human-AI collaboration.

License

MIT
