benchflow-ai · xdotli · Apr 25, 2026 · Apr 21, 2026
diff --git a/.claude/skills/benchflow/SKILL.md b/.claude/skills/benchflow/SKILL.md
@@ -23,7 +23,7 @@ Arguments passed: `$ARGUMENTS`
 
 ### No args or `status` — show current state
 
-1. Check if benchflow is installed: `pip show benchflow`
+1. Check if benchflow is installed: `uv tool list | grep benchflow`
 2. Check if `.env` exists with API keys
 3. Check available agents: `benchflow agents`
 4. Show recent job results if any exist in `jobs/`
@@ -199,7 +199,7 @@ asyncio.run(main())
 ## Setup
 
 ```bash
-pip install benchflow    # or: pip install -e . (from source)
+uv tool install benchflow    # or: uv tool install -e . (from source)
 source .env              # ANTHROPIC_API_KEY, DAYTONA_API_KEY
 ```
 

diff --git a/.claude/skills/benchflow/tasks/benchflow-knowledge/environment/benchflow/SKILL.md b/.claude/skills/benchflow/tasks/benchflow-knowledge/environment/benchflow/SKILL.md
@@ -23,7 +23,7 @@ Arguments passed: `$ARGUMENTS`
 
 ### No args or `status` — show current state
 
-1. Check if benchflow is installed: `pip show benchflow`
+1. Check if benchflow is installed: `uv tool list | grep benchflow`
 2. Check if `.env` exists with API keys
 3. Check available agents: `benchflow agents`
 4. Show recent job results if any exist in `jobs/`
@@ -182,7 +182,7 @@ asyncio.run(main())
 ## Setup
 
 ```bash
-pip install benchflow    # or: pip install -e . (from source)
+uv tool install benchflow    # or: uv tool install -e . (from source)
 source .env              # ANTHROPIC_API_KEY, DAYTONA_API_KEY
 ```
 

diff --git a/.claude/skills/benchflow/tasks/create-simple-task/environment/benchflow/SKILL.md b/.claude/skills/benchflow/tasks/create-simple-task/environment/benchflow/SKILL.md
@@ -23,7 +23,7 @@ Arguments passed: `$ARGUMENTS`
 
 ### No args or `status` — show current state
 
-1. Check if benchflow is installed: `pip show benchflow`
+1. Check if benchflow is installed: `uv tool list | grep benchflow`
 2. Check if `.env` exists with API keys
 3. Check available agents: `benchflow agents`
 4. Show recent job results if any exist in `jobs/`
@@ -182,7 +182,7 @@ asyncio.run(main())
 ## Setup
 
 ```bash
-pip install benchflow    # or: pip install -e . (from source)
+uv tool install benchflow    # or: uv tool install -e . (from source)
 source .env              # ANTHROPIC_API_KEY, DAYTONA_API_KEY
 ```
 

diff --git a/README.md b/README.md
@@ -21,10 +21,10 @@ BenchFlow runs AI agents against benchmark tasks in sandboxed environments. It s
 ## Install
 
 ```bash
-pip install benchflow==0.3.0a3
+uv tool install benchflow
 ```
 
-Requires Python 3.12+. For cloud sandboxes, set `DAYTONA_API_KEY`.
+Requires Python 3.12+ and [uv](https://docs.astral.sh/uv/). For cloud sandboxes, set `DAYTONA_API_KEY`.
 
 ## Quick Start
 

diff --git a/docs/api-reference.md b/docs/api-reference.md
@@ -5,7 +5,7 @@ The Trial/Scene API is the primary way to run agent benchmarks programmatically.
 ## Install
 
 ```bash
-pip install benchflow==0.3.0a3
+uv tool install benchflow
 ```
 
 ## Quick Start

diff --git a/docs/quickstart.md b/docs/quickstart.md
@@ -4,14 +4,14 @@ Get a benchmark result in under 5 minutes.
 
 ## Prerequisites
 
-- Python 3.12+
+- Python 3.12+ and [uv](https://docs.astral.sh/uv/)
 - A Daytona API key (`DAYTONA_API_KEY`) for cloud sandboxes
 - An agent API key (e.g. `GEMINI_API_KEY` for Gemini)
 
 ## Install
 
 ```bash
-pip install benchflow==0.3.0a3
+uv tool install benchflow
 ```
 
 ## Run your first evaluation

diff --git a/docs/skill-eval-guide.md b/docs/skill-eval-guide.md
@@ -5,7 +5,7 @@ Test whether your agent skill actually helps agents perform better.
 ## Install
 
 ```bash
-pip install benchflow==0.3.0a3
+uv tool install benchflow
 ```
 
 ## Overview
@@ -382,7 +382,7 @@ BenchFlow generates everything ephemeral — only results persist.
 **CI integration:**
 ```bash
 # In your skill's CI pipeline
-pip install benchflow==0.3.0a3
+uv tool install benchflow
 bench skills eval . -a claude-agent-acp --no-baseline
 # Exit code 1 if any case scores < 0.5
 ```
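
Review note on the `status` check introduced above: `uv tool list | grep benchflow` matches any line that contains the substring, so an entry-point line or a differently named tool could produce a false positive. Below is a minimal sketch of a stricter check. It assumes `uv tool list` prints one `name vX.Y.Z` line per installed tool, followed by `- entrypoint` lines. The helper names `tool_is_listed` and `benchflow_installed` are hypothetical and are not part of BenchFlow or uv.

```python
import shutil
import subprocess


def tool_is_listed(listing: str, name: str) -> bool:
    """Return True if some line's first whitespace-separated token equals `name`.

    Assumed format (not verified against every uv version):
        benchflow v0.3.0
        - bench
    Entry-point lines start with "-", so they never match.
    """
    for line in listing.splitlines():
        parts = line.split()
        if parts and parts[0] == name:
            return True
    return False


def benchflow_installed() -> bool:
    """Run `uv tool list` and look for an exact `benchflow` tool entry."""
    uv = shutil.which("uv")
    if uv is None:
        # uv itself is missing, so the tool cannot be installed through it.
        return False
    result = subprocess.run(
        [uv, "tool", "list"], capture_output=True, text=True, check=False
    )
    return result.returncode == 0 and tool_is_listed(result.stdout, "benchflow")
```

Calling `benchflow_installed()` returns a boolean that a setup script could use in place of the `grep` exit status. The parsing is kept in `tool_is_listed` so it can be exercised without uv present.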