🧪 AI Lab Playground

Multi-Model LLM Arena & Research Workbench

TypeScript · React · Vite · TailwindCSS · Node.js · Express · SQLite · OpenRouter


A professional-grade AI research workbench for testing, comparing, and benchmarking LLMs via OpenRouter.

Battle models head-to-head, run speed contests, orchestrate multi-agent debates, visualize token probabilities, and build your own ELO leaderboard — all from a sleek, mobile-first interface.

Features · Quick Start · Architecture · Arena · Authors


📊 Key Metrics

  • Chat, Models, Runs, Experiments, Agents, Memory, Multi-Agent
  • Real-time streaming with TTFB & tok/s tracking
  • Battle, Debate, Speed Race, ELO Leaderboard
  • Round-Robin, Coordinator+Specialists, Critic/Refiner
  • Runs, Experiments, Results, Agents, Memory, Models Cache
  • Playground, Arena, Experiments, Agents, Models, History
  • Optimized production build
  • Mobile-first responsive design

✨ Features

🏟️ Arena — Multi-Model Comparison

  • Battle Mode — Same prompt sent to 2+ models simultaneously, side-by-side streaming, blind voting with ELO rating updates
  • Debate Mode — Two models argue a topic across configurable rounds, each seeing and countering the other's arguments
  • Speed Race — Real-time animated progress bars, TTFB/tok/s/total time comparison, podium finish (🥇🥈🥉)
  • ELO Leaderboard — Persistent rankings with win/loss/draw stats, win rate tracking, full battle history
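
The rating update behind blind voting can be sketched with the standard Elo formula. This is a minimal illustration, not the project's actual code: the function names, the `Outcome` type, and the K-factor of 32 are assumptions.

```typescript
// Standard Elo update. K = 32 is an assumed default, not taken from the project.
type Outcome = "a" | "b" | "draw";

// Probability that A beats B given the current ratings.
function expectedScore(ratingA: number, ratingB: number): number {
  return 1 / (1 + Math.pow(10, (ratingB - ratingA) / 400));
}

// Returns the new [ratingA, ratingB] after one battle.
function updateElo(
  ratingA: number,
  ratingB: number,
  outcome: Outcome,
  k: number = 32
): [number, number] {
  const expectedA = expectedScore(ratingA, ratingB);
  const scoreA = outcome === "a" ? 1 : outcome === "b" ? 0 : 0.5;
  const delta = k * (scoreA - expectedA);
  // The update is zero-sum: whatever A gains, B loses.
  return [ratingA + delta, ratingB - delta];
}
```

With two models at 1500, a win moves the winner to 1516 and the loser to 1484; a draw between equally rated models changes nothing.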

💬 Playground Chat

  • Real-time SSE streaming with live Markdown rendering
  • Full GFM support: code blocks with copy button, tables, lists, blockquotes
  • Model selector with 300+ models, search, favorites
  • Temperature, Top P, Max Tokens, Frequency/Presence Penalty sliders
  • Run ID tracking, TTFB, tokens/sec metrics
  • JSON export per run
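
The TTFB and tokens/sec metrics can be derived from three timestamps recorded around a stream. The sketch below assumes TTFB is measured at the first streamed token and that throughput is computed over the generation window only; the interface and field names are hypothetical.

```typescript
// Hypothetical timing record captured by a streaming client (all times in ms).
interface StreamTimings {
  requestStartMs: number;
  firstTokenMs: number;
  lastTokenMs: number;
  completionTokens: number;
}

interface StreamMetrics {
  ttfbMs: number;
  totalMs: number;
  tokensPerSecond: number;
}

function computeStreamMetrics(t: StreamTimings): StreamMetrics {
  const ttfbMs = t.firstTokenMs - t.requestStartMs;
  const totalMs = t.lastTokenMs - t.requestStartMs;
  // Assumption: throughput excludes the wait before the first token.
  const generationMs = t.lastTokenMs - t.firstTokenMs;
  const tokensPerSecond =
    generationMs > 0 ? (t.completionTokens * 1000) / generationMs : 0;
  return { ttfbMs, totalMs, tokensPerSecond };
}
```

Dividing by total time instead of the generation window is an equally valid convention; it yields lower tok/s for models with a slow first token.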

🔬 Logprob Heatmap

  • Color-coded token probability visualization (green → red)
  • Floating tooltip with probability bar, logprob value, and top-5 alternative tokens
  • Summary statistics: average, min, max probability with visual bars
  • Touch-friendly for mobile
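
A heatmap like this needs two conversions: logprob to probability, and probability to color. The following is a minimal sketch under assumptions (an HSL hue ramp from red at 0 to green at 120, hypothetical type and function names); the actual `LogprobHeatmap.tsx` may differ.

```typescript
interface TokenLogprob {
  token: string;
  logprob: number; // natural log of the token probability, so <= 0
}

interface ProbabilitySummary {
  average: number;
  min: number;
  max: number;
}

function logprobToProbability(logprob: number): number {
  return Math.exp(logprob);
}

// Maps probability 0..1 onto hue 0 (red) .. 120 (green). The ramp is an assumption.
function probabilityToColor(probability: number): string {
  const clamped = Math.min(1, Math.max(0, probability));
  const hue = Math.round(clamped * 120);
  return "hsl(" + hue + ", 70%, 45%)";
}

function summarizeProbabilities(tokens: TokenLogprob[]): ProbabilitySummary {
  if (tokens.length === 0) return { average: 0, min: 0, max: 0 };
  let sum = 0;
  let min = Infinity;
  let max = -Infinity;
  for (const t of tokens) {
    const p = logprobToProbability(t.logprob);
    sum += p;
    if (p < min) min = p;
    if (p > max) max = p;
  }
  return { average: sum / tokens.length, min, max };
}
```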

🧪 Experiment Runner

  • Multi-model × multi-temperature parameter grid
  • Streamed execution with live results
  • Comparative results view with metrics per configuration
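
A multi-model × multi-temperature grid is a Cartesian product of the two lists. A minimal sketch, with hypothetical type and function names:

```typescript
interface RunConfig {
  model: string;
  temperature: number;
}

// Expands every (model, temperature) pair into one run configuration.
function buildParameterGrid(
  models: string[],
  temperatures: number[]
): RunConfig[] {
  const grid: RunConfig[] = [];
  for (const model of models) {
    for (const temperature of temperatures) {
      grid.push({ model, temperature });
    }
  }
  return grid;
}
```

Two models and three temperatures yield six runs, ordered model by model.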

🤖 Multi-Agent Sandbox

  • Create agents with name, role, model, system prompt, memory policy
  • Three orchestration modes with real-time streaming
  • Memory system: facts, summaries, pin/purge
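
As an illustration of one orchestration mode, a round-robin loop gives each agent a turn per round and lets it see the transcript so far. This sketch is an assumption about the general shape, not the project's implementation: the types are hypothetical and the `respond` callback stands in for a (normally asynchronous, streamed) model call.

```typescript
interface Agent {
  name: string;
  role: string;
}

interface Turn {
  round: number;
  agent: string;
  content: string;
}

// Stand-in for a model call: receives the agent and a copy of the transcript so far.
type Responder = (agent: Agent, transcript: Turn[]) => string;

function runRoundRobin(
  agents: Agent[],
  rounds: number,
  respond: Responder
): Turn[] {
  const transcript: Turn[] = [];
  for (let round = 1; round <= rounds; round++) {
    for (const agent of agents) {
      // Pass a copy so a responder cannot mutate the shared history.
      const content = respond(agent, transcript.slice());
      transcript.push({ round, agent: agent.name, content });
    }
  }
  return transcript;
}
```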

📦 Model Browser

📜 Run History


🚀 Quick Start

Prerequisites

Installation

```shell
# Clone the repository
git clone https://github.com/simonpierreboucher02/agent-lab.git
cd agent-lab

# Install dependencies
npm install

# Configure environment
echo "OPENROUTER_API_KEY=sk-or-v1-your-key-here" > .env
echo "PORT=3001" >> .env

# Start development servers
npm run dev
```

| Service | URL |
|---|---|
| Frontend (Vite dev server) | http://localhost:5173 |
| Backend API (Express) | http://localhost:3001 |

Production Build

```shell
npm run build
```

🏗️ Architecture

```
ai-lab-playground/
├── .env                              # OpenRouter API key (server-side only)
├── package.json                      # Monorepo workspaces
├── packages/
│   ├── server/                       # Backend — Node + TypeScript + Express
│   │   └── src/
│   │       ├── index.ts              # Server entry (port 3001)
│   │       ├── db/                   # SQLite with WAL mode
│   │       ├── services/             # OpenRouter HTTPS streaming
│   │       ├── routes/
│   │       │   ├── chat.ts           # POST /v1/chat/stream (SSE)
│   │       │   ├── models.ts         # GET /v1/models (cached proxy)
│   │       │   ├── runs.ts           # GET/DELETE /v1/runs
│   │       │   ├── experiments.ts    # CRUD + SSE run
│   │       │   ├── agents.ts         # CRUD agents
│   │       │   ├── memory.ts         # CRUD + purge
│   │       │   └── multiagent.ts     # POST /v1/multiagent/run (SSE)
│   │       └── types/
│   └── web/                          # Frontend — React + Vite + Tailwind
│       └── src/
│           ├── components/
│           │   ├── arena/            # BattleMode, DebateMode, SpeedMode, ELO
│           │   ├── ChatMessage.tsx    # Markdown-rendered chat bubbles
│           │   ├── MarkdownRenderer.tsx
│           │   ├── LogprobHeatmap.tsx # Token probability visualization
│           │   ├── ModelSelector.tsx
│           │   ├── ParamsPanel.tsx
│           │   └── MetricsBar.tsx
│           ├── pages/                # Playground, Arena, Experiments, Agents, Models, History
│           ├── stores/               # Zustand (appStore + arenaStore)
│           ├── hooks/                # useStream (SSE client)
│           └── lib/                  # API client
```
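
The tree describes `models.ts` as a cached proxy, and the endpoint list gives the cache lifetime as one hour. A single-value TTL cache is one simple way to get that behavior. The class below is a hypothetical sketch (the project may instead persist its models cache in SQLite); the injectable clock exists only to make the expiry testable.

```typescript
// Minimal single-value cache with a time-to-live. Names are hypothetical.
class TtlCache<T> {
  private value: T | undefined = undefined;
  private expiresAt = 0;

  constructor(
    private readonly ttlMs: number,
    private readonly now: () => number = Date.now
  ) {}

  // Returns the cached value, or undefined once the TTL has elapsed.
  get(): T | undefined {
    return this.now() < this.expiresAt ? this.value : undefined;
  }

  set(value: T): void {
    this.value = value;
    this.expiresAt = this.now() + this.ttlMs;
  }
}

// One hour, matching the documented cache lifetime for GET /v1/models.
const ONE_HOUR_MS = 60 * 60 * 1000;
```

A route handler would call `get()` first and only fetch from OpenRouter and call `set()` on a miss.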

Tech Stack

| Layer | Technology |
|---|---|
| Frontend | React 18 |
| Build | Vite 5 |
| Styling | TailwindCSS 3 |
| State | Zustand 4 |
| Data | TanStack Query 5 |
| Markdown | react-markdown + remark-gfm |
| Charts | Recharts |
| Icons | Lucide React |
| Backend | Express 4 |
| Runtime | Node.js |
| Database | SQLite (better-sqlite3) |
| LLM API | OpenRouter |
| Language | TypeScript 5 |

🔌 API Endpoints

| Method | Endpoint | Description |
|---|---|---|
| GET | /v1/models | List OpenRouter models (1h cache) |
| POST | /v1/chat/stream | SSE streaming chat completion |
| GET | /v1/runs | List run history |
| GET | /v1/runs/:id | Get run detail |
| POST | /v1/experiments | Create experiment |
| POST | /v1/experiments/:id/run | Run experiment (SSE) |
| GET/POST/DELETE | /v1/agents | CRUD agents |
| GET/POST/DELETE | /v1/memory | CRUD memory items |
| POST | /v1/multiagent/run | Multi-agent orchestration (SSE) |
| GET | /health | Health check |
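
Several endpoints stream over SSE, so a client has to split the incoming text into events. The sketch below parses the standard `data:` framing (events separated by a blank line, comment lines starting with `:` ignored). It assumes `\n` line endings and says nothing about the JSON payload shape, which is not documented here; the function name is hypothetical.

```typescript
interface SseParseResult {
  events: string[]; // data payload of each complete event
  rest: string;     // trailing partial event to prepend to the next chunk
}

function parseSseBuffer(buffer: string): SseParseResult {
  const frames = buffer.split("\n\n");
  // The last piece is incomplete until its terminating blank line arrives.
  const rest = frames.pop() || "";
  const events: string[] = [];
  for (const frame of frames) {
    const dataLines: string[] = [];
    for (const line of frame.split("\n")) {
      // Skip comments (":...") and non-data fields such as "event:" or "id:".
      if (line.indexOf("data:") !== 0) continue;
      let value = line.slice(5);
      if (value.charAt(0) === " ") value = value.slice(1);
      dataLines.push(value);
    }
    // Multiple data lines in one event are joined with newlines.
    if (dataLines.length > 0) events.push(dataLines.join("\n"));
  }
  return { events, rest };
}
```

A reader loop would append each network chunk to `rest`, call `parseSseBuffer`, and handle every returned event, for example with `JSON.parse` if the backend sends JSON payloads.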

🔒 Security

  • OpenRouter API key is never exposed to the frontend
  • .env is in .gitignore
  • All API calls proxied through the backend
  • CORS configured for development
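
Keeping the key server-side means the backend, not the browser, attaches the `Authorization` header before forwarding a request to OpenRouter. A minimal sketch of that step, assuming the standard OpenRouter chat completions URL; the types and function name are hypothetical, and the key is passed in as a parameter (in the server it would come from the `.env` file).

```typescript
// Shape of what the browser sends to the backend. It never contains the key.
interface ClientChatRequest {
  model: string;
  messages: { role: string; content: string }[];
  temperature?: number;
}

interface UpstreamRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

function buildUpstreamRequest(
  apiKey: string,
  req: ClientChatRequest
): UpstreamRequest {
  if (!apiKey) throw new Error("OPENROUTER_API_KEY is not set");
  return {
    url: "https://openrouter.ai/api/v1/chat/completions",
    headers: {
      Authorization: "Bearer " + apiKey,
      "Content-Type": "application/json",
    },
    // Only whitelisted fields are forwarded; stream: true requests SSE output.
    body: JSON.stringify({
      model: req.model,
      messages: req.messages,
      temperature: req.temperature,
      stream: true,
    }),
  };
}
```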

📱 Responsive Design

| Breakpoint | Layout |
|---|---|
| Mobile (< 768px) | Bottom nav, drawer panels, touch-optimized |
| Tablet (768px+) | Side panels, expanded controls |
| Desktop (1024px+) | Full 3-column layout with persistent panels |
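
The breakpoints above reduce to a simple threshold check. In the app this is presumably handled by Tailwind's responsive utilities rather than by code; the hypothetical helper below only spells out the boundaries.

```typescript
type Layout = "mobile" | "tablet" | "desktop";

// Thresholds mirror the breakpoints listed above (768px and 1024px).
function layoutForWidth(widthPx: number): Layout {
  if (widthPx >= 1024) return "desktop";
  if (widthPx >= 768) return "tablet";
  return "mobile";
}
```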

👥 Authors

| Author | Role | Links |
|---|---|---|
| 🧑‍🔬 Simon-Pierre Boucher | Creator & Lead Developer | Website, Email, GitHub |
| 🤖 Claude Opus 4.6 | Co-Author & AI Engineer | Anthropic |

📄 License

This project is licensed under the MIT License.



www.spboucher.ai · spbou4@protonmail.com

Built with precision and care — 2026
