Run 400B+ open coding models on your codebase — without the hardware bill.
Ollama Cloud first. OpenAI, Anthropic, and Gemini when you need them.
npx nex-code
If this saves you time, a star helps others find it.
npx nex-code
# or install globally:
npm install -g nex-code && cd ~/your-project && nex-code

On first launch, an interactive setup wizard guides you through provider and credential configuration. Re-run it anytime with /setup.
Ollama Cloud first. Built and optimized for Ollama Cloud — the flat-rate platform running devstral, Kimi K2, Qwen3-Coder, and 47+ models. Other providers (OpenAI, Anthropic, Gemini) work via the same interface.
| Feature | nex-code | Closed-source alternatives |
|---|---|---|
| Free tier | Ollama Cloud flat-rate | subscription or limited quota |
| Open models | devstral, Kimi K2, Qwen3 | vendor-locked |
| Local Ollama | yes | no |
| Multi-provider | swap with one env var | no |
| VS Code sidebar | built-in | partial |
| Startup time | ~100ms | 1-4s |
| Runtime deps | 2 | heavy |
| Infra tools | SSH, Docker, K8s built-in | no |
Smart model routing. The built-in /benchmark tests all configured models across 62 tool-calling tasks in 5 categories and auto-routes to the best model per task type.
Phase-based execution. Tasks run through Plan (analyze) -> Implement (code) -> Verify (test) phases, each with the optimal model. Auto-loops back on test failures.
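The Plan -> Implement -> Verify loop can be pictured with a minimal sketch. The `runTask` function, the `runPhase(phase, input, feedback)` callback (a stand-in for a model call), and the retry cap are all hypothetical names chosen for illustration; they are not taken from nex-code's source.

```javascript
// Hypothetical sketch: only the phase names come from the README.
// `runPhase` is an injected callback standing in for a model call.
function runTask(task, runPhase, maxLoops = 3) {
  let feedback = null;
  const plan = runPhase("plan", task, feedback);
  for (let attempt = 1; attempt <= maxLoops; attempt++) {
    const change = runPhase("implement", { task, plan }, feedback);
    const result = runPhase("verify", { task, change }, null);
    if (result.passed) return { ok: true, attempts: attempt, change };
    feedback = result.failures; // loop back with the test failures
  }
  return { ok: false, attempts: maxLoops };
}
```

Passing the failures back into the next implement call is what makes the loop converge rather than simply retry; the cap keeps a task that never passes from running forever.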
45 built-in tools across file ops, git, SSH, Docker, Kubernetes, deploy, browser, GitHub Actions, and visual review. See Tools for the full list.
2 runtime dependencies (`axios`, `dotenv`). Starts in ~100ms. No Python, no heavy runtime.
Rankings from nex-code's own /benchmark — 62 tasks testing tool selection, argument validity, and schema compliance.
| Rank | Model | Score | Avg Latency | Context | Best For |
|---|---|---|---|---|---|
| 🥇 | `qwen3-vl:235b` | 79 | 12.4s | 131K | Overall #1 — frontier tool selection, data + agentic tasks |
| 🥈 | `qwen3-vl:235b-instruct` | 78.2 | 5.3s | 131K | Best latency/score balance — recommended default |
| 🥉 | `nemotron-3-super` | 78.1 | 3.5s | 256K | — |
| — | `rnj-1:8b` | 77.4 | 3.9s | 131K | — |
| — | `mistral-large-3:675b` | 76.5 | 3.9s | 131K | — |
| — | `gpt-oss:20b` | 76.5 | 1.9s | 131K | Fast small model, good overall score |
| — | `qwen3-coder-next` | 75.7 | 2.2s | 256K | — |
| — | `qwen3-next:80b` | 75.1 | 11.1s | 131K | — |
| — | `ministral-3:8b` | 73.8 | 2.0s | 131K | Fastest strong model — 2.2s latency, 70+ score |
| — | `deepseek-v3.1:671b` | 73.6 | 2.9s | 131K | — |
| — | `devstral-2:123b` | 73.2 | 2.0s | 131K | Sysadmin + SSH tasks, reliable coding |
| — | `kimi-k2:1t` | 72.2 | 5.6s | 256K | Large repos (>100K tokens) |
| — | `ministral-3:3b` | 72 | 1.6s | 32K | — |
| — | `devstral-small-2:24b` | 71.7 | 2.6s | 131K | Fast sub-agents, simple lookups |
| — | `qwen3.5:397b` | 70.7 | 4.2s | 256K | — |
| — | `qwen3-coder:480b` | 70.1 | 6.0s | 131K | Heavy coding sessions, large context |
| — | `minimax-m2.1` | 69.9 | 3.0s | 200K | — |
| — | `gemma4:31b` | 69.3 | 2.8s | ? | — |
| — | `glm-4.7` | 69.1 | 5.3s | 131K | — |
| — | `kimi-k2-thinking` | 69 | 3.1s | 256K | — |
| — | `ministral-3:14b` | 68.8 | 2.0s | 131K | — |
| — | `kimi-k2.5` | 68.7 | 3.4s | 256K | Large repos — faster than k2:1t |
| — | `minimax-m2.7` | 68.4 | 5.5s | 200K | — |
| — | `glm-4.6` | 67.8 | 4.7s | 131K | — |
| — | `glm-5` | 67.4 | 5.0s | 131K | — |
| — | `gpt-oss:120b` | 64.8 | 3.4s | 131K | — |
| — | `nemotron-3-nano:30b` | 64.7 | 2.3s | 131K | — |
| — | `minimax-m2.5` | 61.9 | 2.7s | 131K | Multi-agent, large context |
| — | `minimax-m2` | 60.6 | 4.3s | 200K | — |
Rankings are nex-code-specific: they measure tool name accuracy, argument validity, and schema compliance. Toolathon (Minimax SOTA) measures different task types. Run `/benchmark --discover` after model releases.
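Per-task-type routing of the kind `/benchmark` feeds can be sketched as picking the top scorer in each category. This is an illustration only: `buildRoutes`, the category names, the model names, and the scores below are made-up placeholders, not benchmark results or nex-code internals.

```javascript
// Hypothetical sketch: categories, models, and scores are placeholders.
function buildRoutes(results) {
  const routes = {};
  for (const { model, category, score } of results) {
    // Keep the highest-scoring model seen so far for each category.
    if (!routes[category] || score > routes[category].score) {
      routes[category] = { model, score };
    }
  }
  return routes;
}

const routes = buildRoutes([
  { model: "model-a", category: "coding", score: 80 },
  { model: "model-b", category: "coding", score: 72 },
  { model: "model-a", category: "sysadmin", score: 65 },
  { model: "model-b", category: "sysadmin", score: 77 },
]);
console.log(routes.coding.model, routes.sysadmin.model);
```

A single overall ranking would send every task to one model; keeping a winner per category lets a model that is weak overall still handle the task type it is best at.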
Recommended .env:
DEFAULT_PROVIDER=ollama
DEFAULT_MODEL=devstral-2:123b
NEX_HEAVY_MODEL=qwen3-coder:480b
NEX_STANDARD_MODEL=devstral-2:123b
NEX_FAST_MODEL=devstral-small-2:24b

Prerequisites: Node.js 18+ and at least one API key (or local Ollama).
# .env (or set environment variables)
OLLAMA_API_KEY=your-key # Ollama Cloud
OPENAI_API_KEY=your-key # OpenAI
ANTHROPIC_API_KEY=your-key # Anthropic
GEMINI_API_KEY=your-key # Gemini
PERPLEXITY_API_KEY=your-key # optional — enables grounded web search
DEFAULT_PROVIDER=ollama
DEFAULT_MODEL=devstral-2:123b

Install from source:
git clone https://github.com/hybridpicker/nex-code.git
cd nex-code && npm install && npm run build
cp .env.example .env && npm link && npm run install-hooks

> explain the main function in index.js
> add input validation to the createUser handler
> run the tests and fix any failures
> the /users endpoint returns 500 — find the bug and fix it
Skip all confirmations — file changes, dangerous commands, and tool permissions are auto-approved. Auto-runs caffeinate on macOS.
nex-code --yolo

nex-code --task "refactor src/index.js to async/await" --yolo
nex-code --prompt-file /tmp/task.txt --yolo --json
nex-code --daemon # watch mode: fires tasks on file changes, git commits, or cron

| Flag | Description |
|---|---|
| `--task <prompt>` | Run a single prompt and exit |
| `--prompt-file <path>` | Read prompt from file |
| `--yolo` | Skip all confirmations |
| `--server` | JSON-lines IPC server (VS Code extension) |
| `--daemon` | Background watcher (reads .nex/daemon.json) |
| `--flatrate` | 100 turns, 6 parallel agents, 5 retries |
| `--json` | JSON output to stdout |
| `--max-turns <n>` | Override agentic loop limit |
| `--model <spec>` | Use specific model (e.g. anthropic:claude-sonnet-4-6) |
| `--debug` | Show diagnostic messages |
> /path/to/screenshot.png implement this UI in React
> analyze https://example.com/mockup.png and implement it
> what's wrong with the layout in my clipboard # macOS clipboard capture
> screenshot localhost:3000 and review the navbar spacing
Works with Anthropic, OpenAI, Gemini, and Ollama vision models. Formats: PNG, JPG, GIF, WebP, BMP.
/model # interactive picker
/model openai:gpt-4o # switch directly
/providers # list all
/fallback anthropic,openai # auto-switch on failure
| Provider | Models | Env Variable |
|---|---|---|
| ollama | Qwen3, DeepSeek R1, Devstral, Kimi K2, MiniMax, GLM, Llama 4 | OLLAMA_API_KEY |
| openai | GPT-4o, GPT-4.1, o1, o3, o4-mini | OPENAI_API_KEY |
| anthropic | Claude Opus 4.6, Sonnet 4.6, Haiku 4.5 | ANTHROPIC_API_KEY |
| gemini | Gemini 3.1 Pro, 2.5 Pro/Flash | GEMINI_API_KEY |
| local | Any local Ollama model | (none) |
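The auto-switch behavior behind `/fallback` can be pictured as an ordered chain that moves to the next provider when a call fails. A minimal sketch, assuming a hypothetical `completeWithFallback` function and a `providers` map of async send functions; neither name comes from nex-code's source.

```javascript
// Hypothetical sketch of an ordered fallback chain; not nex-code's actual code.
// `providers` maps a provider name to an async function that sends the prompt.
async function completeWithFallback(prompt, order, providers) {
  const errors = [];
  for (const name of order) {
    try {
      const text = await providers[name](prompt);
      return { provider: name, text };
    } catch (err) {
      errors.push(`${name}: ${err.message}`); // remember why this provider failed
    }
  }
  throw new Error(`All providers failed (${errors.join("; ")})`);
}
```

Collecting each failure reason before moving on means the final error explains every attempt, which helps tell a bad API key apart from a rate limit.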
Type / to see inline suggestions. Tab completion for slash commands and file paths.
| Command | Description |
|---|---|
| `/help` | Full help |
| `/model [spec]` | Show/switch model |
| `/providers` | List providers |
| `/clear` | Clear conversation |
| `/save` / `/load` / `/sessions` / `/resume` | Session management |
| `/branches` / `/fork` / `/switch-branch` / `/goto` | Session tree navigation |
| `/remember` / `/forget` / `/memory` | Persistent memory |
| `/brain add\|list\|search\|show\|remove` | Knowledge base |
| `/plan [task]` / `/plan edit` / `/plan approve` | Plan mode |
| `/commit [msg]` / `/diff` / `/branch` | Git intelligence |
| `/undo` / `/redo` / `/history` | Persistent undo/redo |
| `/snapshot [name]` / `/restore` | Git snapshots |
| `/permissions` / `/allow` / `/deny` | Tool permissions |
| `/costs` / `/budget` | Cost tracking and limits |
| `/review [--strict]` | Deep code review |
| `/benchmark` | Model ranking (62 tasks) |
| `/autoresearch` / `/ar-self-improve` | Autonomous optimization loops |
| `/servers` / `/docker` / `/deploy` / `/k8s` | Infrastructure management |
| `/skills` / `/install-skill` / `/mcp` / `/hooks` | Extensibility |
| `/tree [depth]` | Project file tree |
| `/audit` | Tool execution audit |
| `/setup` | Interactive setup wizard |
45 built-in tools organized by category:
Core: bash, read_file, write_file, edit_file, patch_file, list_directory, search_files, glob, grep
Git & Web: git_status, git_diff, git_log, web_fetch, web_search
Agents: ask_user, task_list, spawn_agents, switch_model
Browser (optional, requires Playwright): browser_open, browser_screenshot, browser_click, browser_fill
GitHub Actions & K8s: gh_run_list, gh_run_view, gh_workflow_trigger, k8s_pods, k8s_logs, k8s_exec, k8s_apply, k8s_rollout
SSH & Server: ssh_exec, ssh_upload, ssh_download, service_manage, service_logs, sysadmin, remote_agent
Docker: container_list, container_logs, container_exec, container_manage
Deploy: deploy, deployment_status
Frontend: frontend_recon — scans design tokens, layout, framework stack before any frontend work
Visual: visual_diff, responsive_sweep, visual_annotate, visual_watch, design_tokens, design_compare
Additional tools via MCP servers or Skills.
Multi-goal prompts auto-decompose into parallel sub-agents. Up to 5 agents run simultaneously with file locking.
nex-code --task "fix type errors in src/, add JSDoc to utils/, update CHANGELOG"

Autonomous optimization loops: edit -> experiment -> keep/revert, on a dedicated branch.
/autoresearch reduce test runtime while maintaining correctness
/ar-self-improve # self-improvement using nex-code's benchmark
Auto-activates for implementation tasks. Read-only analysis first, approve before writes. Hard-enforced tool restrictions.
Background process that fires tasks on file changes, git commits, or cron schedule. Configured via .nex/daemon.json. Desktop and Matrix notifications.
Navigate conversation history like git branches — fork, switch, goto, delete branches.
| Layer | What it guards | Bypass? |
|---|---|---|
| Forbidden patterns | `rm -rf /`, fork bombs, reverse shells, `cat .env` | No |
| Protected paths | Destructive ops on `.env`, `.ssh/`, `.aws/`, `.git/` | `NEX_UNPROTECT=1` |
| Sensitive file tools | read/write/edit on `.env`, `.ssh/`, `.npmrc`, `.kube/` | No |
| Critical commands | `rm -rf`, `sudo`, `git push --force`, `git reset --hard` | Explicit confirmation |
Pre-push secret detection, audit logging (JSONL), persistent undo/redo, cost limits, auto plan mode.
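The layering in the table above can be illustrated by checking a command against the strictest rules first. This is a simplified sketch under stated assumptions: `classifyCommand` and the regular expressions are hypothetical examples covering a few entries from the table, not nex-code's real rule set.

```javascript
// Hypothetical sketch: simplified example patterns, not nex-code's real rules.
const FORBIDDEN = [/\brm\s+-rf\s+\/(\s|$)/, /\bcat\s+\.env\b/];
const CRITICAL = [
  /\brm\s+-rf\b/,
  /\bsudo\b/,
  /\bgit\s+push\s+--force\b/,
  /\bgit\s+reset\s+--hard\b/,
];

function classifyCommand(command) {
  // Forbidden patterns are checked first and cannot be bypassed.
  if (FORBIDDEN.some((re) => re.test(command))) return "blocked";
  // Critical commands are allowed only after explicit confirmation.
  if (CRITICAL.some((re) => re.test(command))) return "confirm";
  return "allow";
}

console.log(classifyCommand("rm -rf /"));      // blocked
console.log(classifyCommand("rm -rf ./dist")); // confirm
console.log(classifyCommand("git status"));    // allow
```

Order matters here: `rm -rf /` also matches the critical `rm -rf` rule, so the forbidden check has to run first or the command would only ask for confirmation.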
- 5-layer argument parsing — JSON, trailing fix, extraction, key repair, fence stripping
- Tool call retry with schema hints — malformed args get the expected schema for self-correction
- Auto-fix engine — path resolution, edit fuzzy matching (Levenshtein), bash error hints
- Tool tiers — essential (5) / standard (21) / full (45), auto-selected per model capability
- Stale stream recovery — progressive retry with context compression on stall
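The edit fuzzy matching named in the list above uses Levenshtein distance. A minimal sketch of the idea follows; `closestLine` and its `maxRatio` threshold are hypothetical, and how nex-code actually applies the distance is not documented here.

```javascript
// Classic dynamic-programming Levenshtein distance (two-row variant).
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical helper: return the line closest to `target`, or null when
// even the best candidate differs by more than `maxRatio` of its length.
function closestLine(lines, target, maxRatio = 0.3) {
  let best = null;
  for (const line of lines) {
    const distance = levenshtein(line.trim(), target.trim());
    if (best === null || distance < best.distance) best = { line, distance };
  }
  if (best === null) return null;
  const limit = Math.max(best.line.trim().length, target.trim().length) * maxRatio;
  return best.distance <= limit ? best : null;
}
```

The threshold is the important design choice: without it, an edit aimed at text that no longer exists would silently land on whichever unrelated line happens to be least different.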
Pixel-level before/after comparison, responsive sweeps (320-1920px), annotation overlays, design token extraction, and live-reload diff watching. Pure image tools work standalone; browser-based tools need Playwright.
Drop .md or .js files in .nex/skills/ for project-specific knowledge, commands, and tools. Global skills in ~/.nex-code/skills/. Install from git: /install-skill user/repo.
Custom tools and lifecycle hooks via .nex/plugins/. Events: onToolResult, onModelResponse, onSessionStart, onSessionEnd, onFileChange, beforeToolExec, afterToolExec.
Connect external tool servers via Model Context Protocol. Configure in .nex/mcp.json with env var interpolation.
Run custom scripts on CLI events (pre-tool, post-tool, pre-commit, post-response, session-start, session-end). Configure in .nex/config.json or .nex/hooks/.
Built-in sidebar chat panel (vscode/) with streaming output, collapsible tool cards, and native theme support. Spawns nex-code --server over JSON-lines IPC.
cd vscode && npm install && npm run package
# Cmd+Shift+P -> Extensions: Install from VSIX...

bin/nex-code.js # Entrypoint
cli/
agent.js # Agentic loop + conversation state + guards
providers/ # Ollama, OpenAI, Anthropic, Gemini, Local + wire protocols
tools/index.js # 45 tool definitions + auto-fix engine
context-engine.js # Token management + 5-phase compression
sub-agent.js # Parallel sub-agents with file locking
orchestrator.js # Multi-agent decompose -> execute -> synthesize
session-tree.js # Session branching
visual.js # Visual dev tools (pixelmatch-based)
browser.js # Playwright browser agent
skills/ # Built-in + user skills
See DEVELOPMENT.md for full architecture details.
npm test # 97 suites, 3920 tests
npm run typecheck # TypeScript noEmit check
npm run benchmark:gate # 7-task smoke test (blocks push on regression)
npm run benchmark:reallife # 35 real-world tasks across 7 categories

- Pre-push secret detection (API keys, private keys, hardcoded credentials)
- Audit logging with automatic argument sanitization
- Sensitive path blocking (`.ssh/`, `.aws/`, `.env`, credentials)
- Shell injection protection via `execFileSync` with argument arrays
- SSRF protection on `web_fetch`
- MCP environment isolation
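To show why `execFileSync` with argument arrays protects against shell injection, here is a small self-contained sketch. It uses Node's own binary as a portable stand-in for a real tool; the `userInput` value and the echo script are illustrative and not taken from nex-code.

```javascript
const { execFileSync } = require("node:child_process");

// Untrusted text that would be dangerous if pasted into a shell command line.
const userInput = "hello; echo injected";

// execFileSync runs the binary directly (no shell by default), so each array
// element reaches the child as one literal argument and ";" is never interpreted.
const output = execFileSync(
  process.execPath, // the current Node.js binary, used as a portable stand-in
  ["-e", "console.log(process.argv[1])", userInput],
  { encoding: "utf8" }
);

console.log(output.trim());
```

The child receives the whole string, semicolon included, as a single argument. Building the same call as one concatenated string for `execSync` would instead hand `; echo injected` to the shell as a second command.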
Reporting vulnerabilities: Email security@schoensgibl.com (not a public issue). Allow 72h for initial response.
MIT