Autonomous multi-agent framework for discovering, entering, and winning Kaggle competitions
AGENT-K is an autonomous multi-agent system that discovers, researches, prototypes, evolves, and submits solutions to Kaggle competitions. The system leverages:
- Pydantic-AI agents with FunctionToolsets (Kaggle, Search, Memory)
- Pydantic-Graph state machine for orchestration
- OpenEvolve framework for evolutionary code search
- Pydantic Logfire for comprehensive observability
- Next.js frontend for real-time mission monitoring
- SAGE Spec structured docstrings and .sage/ metadata for agent and human guidance
┌────────────────────────────────────────────────────────────────────────┐
│                         LYCURGUS ORCHESTRATOR                          │
│                     (Pydantic-Graph State Machine)                     │
│                                                                        │
│  Discovery -> Research -> Prototype -> Evolution -> Submission         │
│      |            |           |            |             |             │
│  LOBBYIST     SCIENTIST   baseline     EVOLVER      adapter submit     │
│                                                                        │
│  ┌──────────────────────────────────────────────────────────────────┐  │
│  │                        TOOLING & ADAPTERS                        │  │
│  │  • Kaggle Toolset (FunctionToolset)                              │  │
│  │  • Built-in WebSearch/WebFetch                                   │  │
│  │  • MemoryTool + AgentKMemoryTool (Anthropic only)                │  │
│  │  • CodeExecutionTool (provider)                                  │  │
│  │  • Kaggle MCP (evolver submissions)                              │  │
│  │  • Platform Adapters: Kaggle API or OpenEvolve (in-memory)       │  │
│  └──────────────────────────────────────────────────────────────────┘  │
└────────────────────────────────────────────────────────────────────────┘
LYCURGUS is the central orchestrator coordinating the multi-agent competition lifecycle. It implements a state machine using pydantic-graph to manage phase transitions, resource allocation, error recovery, and mission persistence.
LOBBYIST discovers and evaluates Kaggle competitions matching user-specified criteria. It uses web search and the Kaggle API to find competitions based on prize pool, deadline, domain alignment, and team constraints.
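The filtering step described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's implementation: the Candidate record and the rank_candidates helper are hypothetical names, and the sample competitions are made up.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Candidate:
    """Hypothetical competition summary; not the project's model."""
    slug: str
    prize_usd: int
    deadline: date
    domain: str


def rank_candidates(candidates, *, min_prize, min_days_remaining, domains, today):
    """Keep candidates that satisfy every criterion, largest prize first."""
    eligible = [
        c for c in candidates
        if c.prize_usd >= min_prize
        and (c.deadline - today).days >= min_days_remaining
        and c.domain in domains
    ]
    return sorted(eligible, key=lambda c: c.prize_usd, reverse=True)


# Made-up sample data for illustration only.
today = date(2025, 1, 1)
candidates = [
    Candidate("vision-challenge", 25_000, date(2025, 3, 1), "computer_vision"),
    Candidate("small-tabular", 1_000, date(2025, 3, 1), "tabular"),
    Candidate("closing-soon", 50_000, date(2025, 1, 5), "nlp"),
    Candidate("text-ranking", 15_000, date(2025, 2, 1), "nlp"),
]
ranked = rank_candidates(
    candidates,
    min_prize=10_000,
    min_days_remaining=14,
    domains={"computer_vision", "nlp"},
    today=today,
)
print([c.slug for c in ranked])  # ['vision-challenge', 'text-ranking']
```

Passing today as an argument keeps the deadline check deterministic, which makes the ranking easy to test.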
SCIENTIST conducts comprehensive research including literature review, leaderboard analysis, exploratory data analysis, and strategy synthesis. It identifies winning approaches from similar past competitions.
EVOLVER evolves solutions using evolutionary code search to maximize competition score. It manages population-based optimization with mutations, crossover, and fitness evaluation.
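To make the population-based loop concrete, here is a toy sketch of selection, crossover, and mutation over bit strings. It does not use OpenEvolve and does not evolve code; the fitness function, rates, and helper names are all assumptions chosen to keep the example small and deterministic.

```python
import random


def fitness(genome):
    """Toy objective: count of 1-bits (higher is better)."""
    return sum(genome)


def mutate(genome, rng, rate=0.1):
    """Flip each bit independently with probability `rate`."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in genome]


def crossover(a, b, rng):
    """Single-point crossover between two equal-length parents."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def evolve(pop_size=20, genome_len=16, generations=30, seed=0):
    """Return the best genome and the best fitness seen at each generation."""
    rng = random.Random(seed)
    population = [
        [rng.randint(0, 1) for _ in range(genome_len)] for _ in range(pop_size)
    ]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[: pop_size // 2]  # truncation selection
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children  # elitism: parents survive unchanged
    population.sort(key=fitness, reverse=True)
    history.append(fitness(population[0]))
    return population[0], history


best, history = evolve()
print(fitness(best), history[0], history[-1])
```

Because the selected parents are carried over unchanged, the best fitness in this sketch never decreases from one generation to the next.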
| Feature | Description |
|---|---|
| Multi-Agent Orchestration | Pydantic-Graph state machine coordinates specialized agents through competition lifecycle |
| Evolutionary Code Search | OpenEvolve integration for population-based solution optimization |
| Kaggle Integration | FunctionToolset-based platform operations with OpenEvolve fallback for offline runs |
| Real-Time Observability | Pydantic Logfire instrumentation for tracing, metrics, and debugging |
| SAGE Spec Documentation | Structured docstrings and .sage/ artifacts for agent navigation and review workflows |
| Interactive Dashboard | Next.js frontend with mission monitoring, evolution visualization, and tool call inspection |
| Memory Persistence | Cross-session context and checkpoint management for long-running missions |
| Error Recovery | Automatic retry, fallback, and replanning strategies for robust execution |
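The retry and fallback behavior listed under Error Recovery can be illustrated with a small helper. This is only a sketch of the general pattern under assumed names (run_with_recovery, flaky_submit, offline_run); the source does not document how the project implements recovery or replanning.

```python
import time


def run_with_recovery(primary, fallback, *, retries=3, base_delay=0.0, sleep=time.sleep):
    """Try `primary` up to `retries` times with backoff, then use `fallback`."""
    for attempt in range(retries):
        try:
            return primary()
        except Exception:  # a real system would catch narrower error types
            sleep(base_delay * (2 ** attempt))  # exponential backoff
    return fallback()


calls = {"n": 0}


def flaky_submit():
    """Hypothetical operation that fails twice before succeeding."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "submitted"


def always_down():
    """Hypothetical operation that never succeeds."""
    raise ConnectionError("platform unavailable")


def offline_run():
    """Hypothetical fallback path."""
    return "ran offline"


first = run_with_recovery(flaky_submit, offline_run)
second = run_with_recovery(always_down, offline_run)
print(first, "/", second)
```

The sleep function is injectable so tests can run without real delays.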
AGENT-K uses the SAGE spec to encode agent guidance and machine-readable metadata directly in
docstrings and type annotations. The upstream spec lives at
github.com/mikewcasale/sage-spec, and a snapshot is
vendored at backend/docs/sage-spec.md.
Key pieces in this repo:
- Docstrings include SAGE tags like @notice, @dev, @graph, @agent-guidance, and @human-review to capture the Contextual Relationship Graph (CRG).
- Parameter metadata is co-located with signatures using typing.Annotated plus Doc/Range/Constraint/Default from agent_k.core.sage (backend/agent_k/core/sage.py).
- Machine-readable exports live in backend/.sage/ (for example components.json, canonical-homes.json, and errors.json) to support tooling and fast lookups without scanning the full tree.
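As a rough illustration of co-locating parameter metadata with a signature, the sketch below uses typing.Annotated from the standard library. The Doc and Range classes here are hypothetical stand-ins defined locally; the real markers in agent_k.core.sage may have different fields and behavior.

```python
from dataclasses import dataclass
from typing import Annotated, get_args, get_type_hints


@dataclass(frozen=True)
class Doc:
    """Hypothetical stand-in for a description marker."""
    text: str


@dataclass(frozen=True)
class Range:
    """Hypothetical stand-in for a numeric-bounds marker."""
    min: float
    max: float


def plan_evolution(
    rounds: Annotated[int, Doc("Maximum evolution rounds"), Range(1, 500)],
    percentile: Annotated[float, Doc("Target leaderboard percentile"), Range(0.0, 1.0)],
) -> None:
    """Illustrative function; not part of the project."""


def describe_parameters(func):
    """Return {parameter: (base type, metadata tuple)} for Annotated parameters."""
    hints = get_type_hints(func, include_extras=True)
    described = {}
    for name, hint in hints.items():
        metadata = getattr(hint, "__metadata__", None)
        if metadata is not None:
            described[name] = (get_args(hint)[0], metadata)
    return described


params = describe_parameters(plan_evolution)
for name, (base, metadata) in params.items():
    print(name, base.__name__, metadata)
```

Tooling can read the metadata this way without importing or calling the function body, which is the kind of lookup the exported .sage/ artifacts are meant to speed up.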
Developer workflow:
- Use the SAGE Docstrings VS Code extension (https://marketplace.visualstudio.com/items?itemName=gcode-ai.sage-vscode-extension) for tag templates and validation.
- Refresh .sage artifacts before committing: backend/.venv/bin/python backend/tools/sageextract.py --emit (pre-commit runs validation).
AGENT-K executes missions through a 5-phase lifecycle:
┌───────────┐    ┌───────────┐    ┌───────────┐    ┌───────────┐    ┌────────────┐
│ DISCOVERY │───▶│ RESEARCH  │───▶│ PROTOTYPE │───▶│ EVOLUTION │───▶│ SUBMISSION │
└───────────┘    └───────────┘    └───────────┘    └───────────┘    └────────────┘
      │                │                │                │                │
      ▼                ▼                ▼                ▼                ▼
Find and         Analyze          Build            Optimize         Submit
validate         leaderboard,     baseline         solution         final
competition      research,        working          using ECS        solution
                 EDA              solution
- Discovery — Search for competitions matching criteria (prize, deadline, domain), validate accessibility, rank candidates
- Research — Analyze leaderboard distribution, review academic papers and winning solutions, perform EDA, synthesize strategy
- Prototype — Generate baseline solution from research findings, validate execution, establish baseline score
- Evolution — Initialize population, evaluate fitness, apply mutations/crossover, detect convergence, submit checkpoints
- Submission — Generate final predictions, submit to Kaggle, wait for scoring, record final rank
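The five phases form a linear hand-off, which can be sketched as a small state machine. This example uses only the standard library rather than pydantic-graph, and the Phase enum, NEXT_PHASE table, and run_mission helper are hypothetical names for illustration, not the project's node classes.

```python
from enum import Enum


class Phase(Enum):
    DISCOVERY = "discovery"
    RESEARCH = "research"
    PROTOTYPE = "prototype"
    EVOLUTION = "evolution"
    SUBMISSION = "submission"


# Each phase hands off to exactly one successor; SUBMISSION is terminal.
NEXT_PHASE = {
    Phase.DISCOVERY: Phase.RESEARCH,
    Phase.RESEARCH: Phase.PROTOTYPE,
    Phase.PROTOTYPE: Phase.EVOLUTION,
    Phase.EVOLUTION: Phase.SUBMISSION,
}


def run_mission(handlers, start=Phase.DISCOVERY):
    """Run each phase handler in order, threading one state dict through."""
    state = {"completed": []}
    phase = start
    while phase is not None:
        handlers[phase](state)
        state["completed"].append(phase.value)
        phase = NEXT_PHASE.get(phase)  # None after the terminal phase
    return state


def make_handler(phase):
    """Build a placeholder handler that only records a note for its phase."""
    def handler(state):
        state[phase.value] = f"{phase.value} done"
    return handler


handlers = {phase: make_handler(phase) for phase in Phase}
final_state = run_mission(handlers)
print(final_state["completed"])
```

Starting from a later phase (for example start=Phase.EVOLUTION) shows how a persisted mission could resume partway through, assuming the saved state is restored first.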
| Component | Technology | Purpose |
|---|---|---|
| Agent Framework | Pydantic-AI | Agent definitions, tool registration, structured outputs |
| Orchestration | Pydantic-Graph | State machine, phase transitions |
| Evolution | OpenEvolve | Evolutionary code search |
| Kaggle API | KaggleToolset | Platform operations |
| Observability | Pydantic Logfire | Tracing, metrics, logging |
| HTTP Client | HTTPX | Async HTTP requests |
| Component | Technology | Purpose |
|---|---|---|
| Framework | Next.js 16 | React server components, routing |
| UI Library | React 19 | Component rendering |
| Protocol | AG-UI | Agent-to-UI event streaming |
| Styling | Tailwind CSS | Utility-first styling |
| Charts | Recharts | Evolution visualization |
| State | SWR | Data fetching and caching |
- Python 3.11+
- uv (Python package manager)
- Node.js 20+
- pnpm
- Kaggle API credentials
cd backend
# Install dependencies with uv
uv sync
# Activate virtual environment (uv creates .venv automatically)
source .venv/bin/activate # or .venv\Scripts\activate on Windows
# Set environment variables (or create backend/.env)
# At least one model provider API key is required
export ANTHROPIC_API_KEY="your-api-key"
# or: OPENROUTER_API_KEY / OPENAI_API_KEY
# Kaggle credentials are required for live competitions
export KAGGLE_USERNAME="your-kaggle-username"
export KAGGLE_KEY="your-kaggle-key"

cd frontend
# Install dependencies
pnpm install
# Set environment variables
cp .env.example .env.local
# Edit .env.local with your configuration
# Run development server
pnpm dev

# From project root - starts backend (9000) and frontend (3000)
./run.sh

cd backend
python -m agent_k.ui.agui

Run a mission through the chat endpoint (streams Vercel AI data events):
curl -N -X POST http://localhost:9000/agentic_generative_ui/ \
  -H "Content-Type: application/json" \
  -d '{"id":"demo","messages":[{"role":"user","parts":[{"type":"text","text":"Find a Kaggle competition with a $10k+ prize"}]}]}'

import asyncio
from agent_k import LycurgusOrchestrator
from agent_k.core.models import MissionCriteria

async def main():
    async with LycurgusOrchestrator() as orchestrator:
        result = await orchestrator.execute_mission(
            competition_id="titanic",
            criteria=MissionCriteria(
                target_leaderboard_percentile=0.10,
                max_evolution_rounds=50,
            ),
        )
        print(f"Final rank: {result.final_rank}")
        print(f"Final score: {result.final_score}")

asyncio.run(main())

AGENT-K supports multiple model providers via get_model(). Standard Pydantic-AI model
strings are passed through; devstral: and openrouter: specs resolve to
OpenAI-compatible models.
| Model Spec | Description |
|---|---|
| devstral:local | Local LM Studio server (default: http://192.168.105.1:1234/v1) |
| devstral:http://host:port/v1 | Custom Devstral endpoint |
| devstral:mistralai/devstral-small-2-2512 | Local LM Studio with explicit model id |
| anthropic:claude-3-haiku-20240307 | Claude Haiku via Anthropic |
| anthropic:claude-sonnet-4-5 | Claude Sonnet (backend default) |
| openrouter:mistralai/devstral-small-2-2512 | Devstral via OpenRouter |
| openai:gpt-4o | GPT-4o via OpenAI |
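The routing implied by the model spec table can be sketched as a small classifier. This is an illustration only: classify_model_spec is a hypothetical helper, its return shape is an assumption, and it does not reproduce the internals of get_model().

```python
# Default endpoint taken from the table above.
DEFAULT_DEVSTRAL_URL = "http://192.168.105.1:1234/v1"


def classify_model_spec(spec):
    """Describe how a spec string would be routed (illustrative only)."""
    prefix, _, rest = spec.partition(":")
    if prefix == "devstral":
        if rest == "local":
            # Default local endpoint; the model id is left to the server.
            return {"kind": "openai-compatible", "base_url": DEFAULT_DEVSTRAL_URL, "model": None}
        if rest.startswith(("http://", "https://")):
            # Custom endpoint given directly in the spec.
            return {"kind": "openai-compatible", "base_url": rest, "model": None}
        # Anything else is treated as an explicit model id on the default endpoint.
        return {"kind": "openai-compatible", "base_url": DEFAULT_DEVSTRAL_URL, "model": rest}
    if prefix == "openrouter":
        return {"kind": "openai-compatible", "provider": "openrouter", "model": rest}
    # Standard provider strings are passed through unchanged.
    return {"kind": "passthrough", "spec": spec}


for spec in ("devstral:local", "devstral:http://host:port/v1", "openai:gpt-4o"):
    print(spec, "->", classify_model_spec(spec))
```

Splitting on the first colon only keeps URLs such as http://host:port/v1 intact in the remainder of the spec.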
agent-k/
├── backend/
│ ├── .sage/ # SAGE metadata artifacts (CRG exports)
│ ├── agent_k/
│ │ ├── agents/ # Pydantic-AI agent definitions
│ │ │ ├── base.py
│ │ │ ├── evolver.py
│ │ │ ├── lobbyist.py
│ │ │ ├── lycurgus.py
│ │ │ ├── scientist.py
│ │ │ └── prompts.py
│ │ ├── adapters/ # Platform integrations
│ │ │ ├── kaggle.py
│ │ │ └── openevolve.py
│ │ ├── core/ # Domain models and helpers
│ │ │ ├── constants.py
│ │ │ ├── data.py
│ │ │ ├── deps.py
│ │ │ ├── exceptions.py
│ │ │ ├── models.py
│ │ │ ├── protocols.py
│ │ │ ├── settings.py
│ │ │ ├── solution.py
│ │ │ ├── sage.py
│ │ │ └── types.py
│ │ ├── mission/ # State machine
│ │ │ ├── nodes.py
│ │ │ ├── persistence.py
│ │ │ └── state.py
│ │ ├── toolsets/ # FunctionToolset helpers
│ │ │ ├── code.py
│ │ │ ├── kaggle.py
│ │ │ ├── memory.py
│ │ │ ├── search.py
│ │ │ ├── browser.py # Placeholder
│ │ │ └── scholarly.py # Placeholder
│ │ ├── embeddings/ # RAG support
│ │ │ ├── embedder.py
│ │ │ ├── retriever.py
│ │ │ └── store.py
│ │ ├── evals/ # Evaluation framework
│ │ │ ├── datasets.py
│ │ │ ├── evaluators.py
│ │ │ ├── discovery.yaml
│ │ │ ├── evolution.yaml
│ │ │ └── submission.yaml
│ │ ├── infra/ # Infrastructure
│ │ │ ├── config.py
│ │ │ ├── instrumentation.py
│ │ │ ├── logging.py
│ │ │ └── providers.py
│ │ └── ui/ # AG-UI protocol (FastAPI)
│ │ └── agui.py
│ ├── cli.py # FastAPI app entrypoint
│ ├── docs/ # Backend docs (mkdocs + logo.png)
│ └── tests/
│
├── frontend/
│ ├── app/ # Next.js app router
│ │ ├── (auth)/ # Authentication
│ │ └── (chat)/ # Chat interface
│ ├── components/
│ │ ├── agent-k/ # Mission dashboard
│ │ │ ├── mission-dashboard.tsx
│ │ │ ├── phase-card.tsx
│ │ │ ├── evolution-view.tsx
│ │ │ ├── fitness-chart.tsx
│ │ │ └── tool-call-card.tsx
│ │ └── ui/ # Shared UI components
│ ├── hooks/ # React hooks
│ │ └── use-agent-k-state.tsx # Mission state hook
│ └── lib/
│ ├── ai/ # Model configuration
│ │ └── models.ts # Available chat models
│ └── types/
│ └── agent-k.ts # TypeScript types
│
├── run.sh # Start both servers
└── render.yaml # Render deployment config
| Variable | Description | Required |
|---|---|---|
| KAGGLE_USERNAME | Kaggle account username | Yes* |
| KAGGLE_KEY | Kaggle API key | Yes* |
| ANTHROPIC_API_KEY | Anthropic API key for Claude models | Yes** |
| OPENROUTER_API_KEY | OpenRouter API key | Yes** |
| OPENAI_API_KEY | OpenAI API key | Yes** |
| DEVSTRAL_BASE_URL | Local LM Studio endpoint (default: http://192.168.105.1:1234/v1) | No |
| LOGFIRE_TOKEN | Pydantic Logfire token | No |
| AGENT_K_MEMORY_DIR | Memory tool storage path | No |
| Variable | Description | Required |
|---|---|---|
| AUTH_SECRET | Auth.js signing secret | Yes |
| AUTH_TRUST_HOST | Required behind reverse proxies (Render, etc.) | Conditional |
| AUTH_URL | Base URL for Auth.js callbacks | Yes |
| POSTGRES_URL | PostgreSQL connection string | Yes |
| PYTHON_BACKEND_URL | Agent K backend SSE endpoint | Yes (Agent K) |
| ANTHROPIC_API_KEY | Claude models for in-app chat | Optional |
| BLOB_READ_WRITE_TOKEN | Vercel Blob storage for uploads | Optional |
| REDIS_URL | Redis cache | Optional |
*Required for Kaggle platform access. If absent, the orchestrator falls back to OpenEvolve.
**At least one model provider API key is required.
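The two footnotes amount to a simple preflight rule, sketched below. The check_environment helper and its return shape are assumptions for illustration; the project's own settings loader may behave differently.

```python
import os

PROVIDER_KEYS = ("ANTHROPIC_API_KEY", "OPENROUTER_API_KEY", "OPENAI_API_KEY")
KAGGLE_KEYS = ("KAGGLE_USERNAME", "KAGGLE_KEY")


def check_environment(env=None):
    """Summarize which providers and platform a given environment allows."""
    env = os.environ if env is None else env
    providers = [key for key in PROVIDER_KEYS if env.get(key)]
    if not providers:
        raise RuntimeError("Set at least one of: " + ", ".join(PROVIDER_KEYS))
    # Both Kaggle variables must be present for live platform access.
    has_kaggle = all(env.get(key) for key in KAGGLE_KEYS)
    return {
        "providers": providers,
        "platform": "kaggle" if has_kaggle else "openevolve",
    }


# Example with an explicit mapping instead of the real process environment.
summary = check_environment({"OPENAI_API_KEY": "sk-example"})
print(summary)
```

Accepting a mapping makes the rule testable without touching real credentials.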
from agent_k.core.models import MissionCriteria, CompetitionType

criteria = MissionCriteria(
    target_competition_types=frozenset({
        CompetitionType.FEATURED,
        CompetitionType.RESEARCH,
    }),
    min_prize_pool=10000,
    min_days_remaining=14,
    target_domains=frozenset({"computer_vision", "nlp"}),
    max_evolution_rounds=100,
    target_leaderboard_percentile=0.10,
)

AGENT-K uses Pydantic Logfire for comprehensive observability:
from agent_k.infra.instrumentation import configure_instrumentation

configure_instrumentation(
    service_name="agent-k",
    environment="production",
)

- Phase completion times and success rates
- Evolution generation fitness progression
- Tool call latency and error rates
- Kaggle submission scores and rankings
- API token usage and costs
# Backend tests
cd backend
uv run pytest -v
# Run specific test
uv run pytest tests/test_file.py::test_name -v
# Frontend E2E tests
cd frontend
pnpm test:e2e

# Backend linting
cd backend
uv run ruff check .
uv run ruff format .
uv run mypy .
# Frontend linting (uses Ultracite)
cd frontend
pnpm lint
pnpm format

cd backend
uv run ruff format .
uv run ruff check .
backend/.venv/bin/python backend/tools/sageextract.py --emit
backend/.venv/bin/python backend/tools/sageextract.py --validate
uv run pytest -v --tb=short

Deploys to Render via render.yaml:
- Backend: FastAPI on port 9000 (agent-k-api)
- Frontend: Next.js on port 3000 (agent-k-frontend)
- Database: PostgreSQL (agent-k-postgres)
Environment variables are set in Render's agent-k-secrets group.
Contributions are welcome! Please read our contributing guidelines and submit pull requests to the main branch.
This project is licensed under the MIT License - see the LICENSE file for details.
- Pydantic for the excellent AI framework and observability tools
- Kaggle for the competition platform
- OpenEvolve for evolutionary code search inspiration
- SAGE Spec for structured agent guidance metadata
