nex-code

Run 400B+ open coding models on your codebase — without the hardware bill.
Ollama Cloud first. OpenAI, Anthropic, and Gemini when you need them.

npx nex-code

If this saves you time, a star helps others find it.

Demo

nex-code-demo-0-4-22.mov

Quickstart

npx nex-code
# or install globally:
npm install -g nex-code && cd ~/your-project && nex-code

On first launch, an interactive setup wizard guides you through provider and credential configuration. Re-run anytime with /setup.

Why nex-code?

Ollama Cloud first. Built and optimized for Ollama Cloud — the flat-rate platform running devstral, Kimi K2, Qwen3-Coder, and 47+ models. Other providers (OpenAI, Anthropic, Gemini) work via the same interface.

Feature	nex-code	Closed-source alternatives
Free tier	Ollama Cloud flat-rate	subscription or limited quota
Open models	devstral, Kimi K2, Qwen3	vendor-locked
Local Ollama	yes	no
Multi-provider	swap with one env var	no
VS Code sidebar	built-in	partial
Startup time	~100ms	1-4s
Runtime deps	2	heavy
Infra tools	SSH, Docker, K8s built-in	no

Smart model routing. The built-in /benchmark tests all configured models across 62 tool-calling tasks in 5 categories and auto-routes to the best model per task type.

Phase-based execution. Tasks run through Plan (analyze) -> Implement (code) -> Verify (test) phases, each with the optimal model. Auto-loops back on test failures.
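The Plan -> Implement -> Verify loop described above can be sketched roughly as follows. This is a minimal illustration, not nex-code's implementation: `runPhases`, the `phases` object, and `maxLoops` are hypothetical names, and the per-phase model selection is left to the caller-supplied functions.

```javascript
// Hypothetical sketch: none of these names come from nex-code's source.
// phases = { plan, implement, verify } are caller-supplied functions,
// each of which could call a different model.
function runPhases(task, phases, maxLoops = 3) {
  const plan = phases.plan(task); // read-only analysis
  let feedback = null;
  for (let attempt = 1; attempt <= maxLoops; attempt++) {
    const change = phases.implement(plan, feedback); // write code
    const result = phases.verify(change);            // run tests
    if (result.passed) return { passed: true, attempts: attempt, change };
    feedback = result.failures; // loop back with the failing tests
  }
  return { passed: false, attempts: maxLoops, change: null };
}
```

Capping the number of loops keeps a task that never passes verification from running indefinitely; the cap of 3 is an arbitrary choice for the sketch.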

45 built-in tools across file ops, git, SSH, Docker, Kubernetes, deploy, browser, GitHub Actions, and visual review. See Tools for the full list.

2 runtime dependencies (axios, dotenv). Starts in ~100ms. No Python, no heavy runtime.

Ollama Cloud Model Rankings

Rankings from nex-code's own /benchmark — 62 tasks testing tool selection, argument validity, and schema compliance.

Rank	Model	Score	Avg Latency	Context	Best For
🥇	`qwen3-vl:235b`	79	12.4s	131K	Overall #1 — frontier tool selection, data + agentic tasks
🥈	`qwen3-vl:235b-instruct`	78.2	5.3s	131K	Best latency/score balance — recommended default
🥉	`nemotron-3-super`	78.1	3.5s	256K	—
—	`rnj-1:8b`	77.4	3.9s	131K	—
—	`mistral-large-3:675b`	76.5	3.9s	131K	—
—	`gpt-oss:20b`	76.5	1.9s	131K	Fast small model, good overall score
—	`qwen3-coder-next`	75.7	2.2s	256K	—
—	`qwen3-next:80b`	75.1	11.1s	131K	—
—	`ministral-3:8b`	73.8	2.0s	131K	Fastest strong model (2.0s latency, 70+ score)
—	`deepseek-v3.1:671b`	73.6	2.9s	131K	—
—	`devstral-2:123b`	73.2	2.0s	131K	Sysadmin + SSH tasks, reliable coding
—	`kimi-k2:1t`	72.2	5.6s	256K	Large repos (>100K tokens)
—	`ministral-3:3b`	72	1.6s	32K	—
—	`devstral-small-2:24b`	71.7	2.6s	131K	Fast sub-agents, simple lookups
—	`qwen3.5:397b`	70.7	4.2s	256K	—
—	`qwen3-coder:480b`	70.1	6.0s	131K	Heavy coding sessions, large context
—	`minimax-m2.1`	69.9	3.0s	200K	—
—	`gemma4:31b`	69.3	2.8s	?	—
—	`glm-4.7`	69.1	5.3s	131K	—
—	`kimi-k2-thinking`	69	3.1s	256K	—
—	`ministral-3:14b`	68.8	2.0s	131K	—
—	`kimi-k2.5`	68.7	3.4s	256K	Large repos — faster than k2:1t
—	`minimax-m2.7`	68.4	5.5s	200K	—
—	`glm-4.6`	67.8	4.7s	131K	—
—	`glm-5`	67.4	5.0s	131K	—
—	`gpt-oss:120b`	64.8	3.4s	131K	—
—	`nemotron-3-nano:30b`	64.7	2.3s	131K	—
—	`minimax-m2.5`	61.9	2.7s	131K	Multi-agent, large context
—	`minimax-m2`	60.6	4.3s	200K	—

Rankings are nex-code-specific: tool name accuracy, argument validity, schema compliance. Toolathon (Minimax SOTA) measures different task types — run /benchmark --discover after model releases.
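One way to read the table is as a latency/score trade-off. The sketch below picks the lowest-latency model above a minimum score, using a few rows copied from the table. The helper name `fastestAbove` and the thresholds are illustrative assumptions, not part of nex-code.

```javascript
// A few rows copied from the rankings table above (score, avg latency in seconds).
const rows = [
  { model: "qwen3-vl:235b", score: 79, latency: 12.4 },
  { model: "qwen3-vl:235b-instruct", score: 78.2, latency: 5.3 },
  { model: "nemotron-3-super", score: 78.1, latency: 3.5 },
  { model: "gpt-oss:20b", score: 76.5, latency: 1.9 },
  { model: "devstral-2:123b", score: 73.2, latency: 2.0 },
];

// Hypothetical helper: lowest-latency model whose score is at least minScore.
function fastestAbove(candidates, minScore) {
  const eligible = candidates.filter((r) => r.score >= minScore);
  if (eligible.length === 0) return null;
  return eligible.reduce((best, r) => (r.latency < best.latency ? r : best)).model;
}

console.log(fastestAbove(rows, 78)); // nemotron-3-super
console.log(fastestAbove(rows, 75)); // gpt-oss:20b
```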

Recommended .env:

DEFAULT_PROVIDER=ollama
DEFAULT_MODEL=devstral-2:123b
NEX_HEAVY_MODEL=qwen3-coder:480b
NEX_STANDARD_MODEL=devstral-2:123b
NEX_FAST_MODEL=devstral-small-2:24b
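As a rough illustration of how the three tier variables could be consumed, the sketch below maps a tier name to a model and falls back to `DEFAULT_MODEL`. The tier names `heavy`, `standard`, and `fast` are inferred from the variable names, and `modelForTier` is a hypothetical helper; how nex-code resolves these internally is not documented here.

```javascript
// Hypothetical helper: resolve a model for a tier from environment-style settings.
function modelForTier(tier, env = process.env) {
  const byTier = {
    heavy: env.NEX_HEAVY_MODEL,
    standard: env.NEX_STANDARD_MODEL,
    fast: env.NEX_FAST_MODEL,
  };
  // Unknown tier or unset variable: fall back to the default model.
  return byTier[tier] || env.DEFAULT_MODEL || null;
}

// Values copied from the recommended .env above.
const exampleEnv = {
  DEFAULT_MODEL: "devstral-2:123b",
  NEX_HEAVY_MODEL: "qwen3-coder:480b",
  NEX_STANDARD_MODEL: "devstral-2:123b",
  NEX_FAST_MODEL: "devstral-small-2:24b",
};
console.log(modelForTier("fast", exampleEnv)); // devstral-small-2:24b
```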

Setup

Prerequisites: Node.js 18+ and at least one API key (or local Ollama).

# .env (or set environment variables)
OLLAMA_API_KEY=your-key       # Ollama Cloud
OPENAI_API_KEY=your-key       # OpenAI
ANTHROPIC_API_KEY=your-key    # Anthropic
GEMINI_API_KEY=your-key       # Gemini
PERPLEXITY_API_KEY=your-key   # optional — enables grounded web search

DEFAULT_PROVIDER=ollama
DEFAULT_MODEL=devstral-2:123b

Install from source:

git clone https://github.com/hybridpicker/nex-code.git
cd nex-code && npm install && npm run build
cp .env.example .env && npm link && npm run install-hooks

Usage

> explain the main function in index.js
> add input validation to the createUser handler
> run the tests and fix any failures
> the /users endpoint returns 500 — find the bug and fix it

YOLO Mode

Skip all confirmations — file changes, dangerous commands, and tool permissions are auto-approved. Auto-runs caffeinate on macOS.

nex-code --yolo

Headless / Programmatic Mode

nex-code --task "refactor src/index.js to async/await" --yolo
nex-code --prompt-file /tmp/task.txt --yolo --json
nex-code --daemon          # watch mode: fires tasks on file changes, git commits, or cron

Flag	Description
`--task <prompt>`	Run a single prompt and exit
`--prompt-file <path>`	Read prompt from file
`--yolo`	Skip all confirmations
`--server`	JSON-lines IPC server (VS Code extension)
`--daemon`	Background watcher (reads `.nex/daemon.json`)
`--flatrate`	100 turns, 6 parallel agents, 5 retries
`--json`	JSON output to stdout
`--max-turns <n>`	Override agentic loop limit
`--model <spec>`	Use specific model (e.g. `anthropic:claude-sonnet-4-6`)
`--debug`	Show diagnostic messages
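For scripting, the flags above can be assembled from a Node.js program. The sketch below uses only flags listed in the table; `buildArgs` and `runHeadless` are hypothetical helper names, and because the shape of the `--json` output is not specified here, stdout is returned as raw text rather than parsed.

```javascript
const { spawnSync } = require("node:child_process");

// Hypothetical helper: turn an options object into a nex-code argument array.
function buildArgs({ task, promptFile, yolo = false, json = false, maxTurns, model } = {}) {
  const args = [];
  if (task) args.push("--task", task);
  if (promptFile) args.push("--prompt-file", promptFile);
  if (yolo) args.push("--yolo");
  if (json) args.push("--json");
  if (maxTurns !== undefined) args.push("--max-turns", String(maxTurns));
  if (model) args.push("--model", model);
  return args;
}

// Hypothetical wrapper: assumes `nex-code` is on PATH.
function runHeadless(options) {
  // No shell is involved, so the prompt is passed as a single argv entry.
  const result = spawnSync("nex-code", buildArgs(options), { encoding: "utf8" });
  return { status: result.status, stdout: result.stdout, stderr: result.stderr };
}

console.log(buildArgs({ task: "refactor src/index.js to async/await", yolo: true }));
```

Remember that `--yolo` auto-approves file changes and commands, so a wrapper like this is best run inside a throwaway checkout or container.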

Vision / Screenshot

> /path/to/screenshot.png implement this UI in React
> analyze https://example.com/mockup.png and implement it
> what's wrong with the layout in my clipboard    # macOS clipboard capture
> screenshot localhost:3000 and review the navbar spacing

Works with Anthropic, OpenAI, Gemini, and Ollama vision models. Formats: PNG, JPG, GIF, WebP, BMP.

Providers & Models

/model                         # interactive picker
/model openai:gpt-4o           # switch directly
/providers                     # list all
/fallback anthropic,openai     # auto-switch on failure

Provider	Models	Env Variable
ollama	Qwen3, DeepSeek R1, Devstral, Kimi K2, MiniMax, GLM, Llama 4	`OLLAMA_API_KEY`
openai	GPT-4o, GPT-4.1, o1, o3, o4-mini	`OPENAI_API_KEY`
anthropic	Claude Opus 4.6, Sonnet 4.6, Haiku 4.5	`ANTHROPIC_API_KEY`
gemini	Gemini 3.1 Pro, 2.5 Pro/Flash	`GEMINI_API_KEY`
local	Any local Ollama model	(none)
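The auto-switch behaviour of `/fallback` can be pictured as trying providers in order until one succeeds. The sketch below is an assumption about the general pattern, not nex-code's code: `callWithFallback` and the `{ name, send }` provider shape are hypothetical.

```javascript
// Hypothetical sketch of an ordered provider fallback chain.
// Each provider is assumed to look like { name: string, send: async (request) => response }.
async function callWithFallback(providers, request) {
  const errors = [];
  for (const provider of providers) {
    try {
      const response = await provider.send(request);
      return { provider: provider.name, response };
    } catch (err) {
      // Record the failure and move on to the next provider in the list.
      errors.push(`${provider.name}: ${err.message}`);
    }
  }
  throw new Error("all providers failed: " + errors.join("; "));
}
```

Collecting every provider's error message, rather than only the last one, makes it easier to tell a bad API key from an outage when the whole chain fails.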

Commands

Type / to see inline suggestions. Tab completion for slash commands and file paths.

Command	Description
`/help`	Full help
`/model [spec]`	Show/switch model
`/providers`	List providers
`/clear`	Clear conversation
`/save` / `/load` / `/sessions` / `/resume`	Session management
`/branches` / `/fork` / `/switch-branch` / `/goto`	Session tree navigation
`/remember` / `/forget` / `/memory`	Persistent memory
`/brain add\|list\|search\|show\|remove`	Knowledge base
`/plan [task]` / `/plan edit` / `/plan approve`	Plan mode
`/commit [msg]` / `/diff` / `/branch`	Git intelligence
`/undo` / `/redo` / `/history`	Persistent undo/redo
`/snapshot [name]` / `/restore`	Git snapshots
`/permissions` / `/allow` / `/deny`	Tool permissions
`/costs` / `/budget`	Cost tracking and limits
`/review [--strict]`	Deep code review
`/benchmark`	Model ranking (62 tasks)
`/autoresearch` / `/ar-self-improve`	Autonomous optimization loops
`/servers` / `/docker` / `/deploy` / `/k8s`	Infrastructure management
`/skills` / `/install-skill` / `/mcp` / `/hooks`	Extensibility
`/tree [depth]`	Project file tree
`/audit`	Tool execution audit
`/setup`	Interactive setup wizard

Tools

45 built-in tools organized by category:

Core: bash, read_file, write_file, edit_file, patch_file, list_directory, search_files, glob, grep

Git & Web: git_status, git_diff, git_log, web_fetch, web_search

Agents: ask_user, task_list, spawn_agents, switch_model

Browser (optional, requires Playwright): browser_open, browser_screenshot, browser_click, browser_fill

GitHub Actions & K8s: gh_run_list, gh_run_view, gh_workflow_trigger, k8s_pods, k8s_logs, k8s_exec, k8s_apply, k8s_rollout

SSH & Server: ssh_exec, ssh_upload, ssh_download, service_manage, service_logs, sysadmin, remote_agent

Docker: container_list, container_logs, container_exec, container_manage

Deploy: deploy, deployment_status

Frontend: frontend_recon — scans design tokens, layout, framework stack before any frontend work

Visual: visual_diff, responsive_sweep, visual_annotate, visual_watch, design_tokens, design_compare

Additional tools via MCP servers or Skills.

Key Features

Multi-Agent Orchestrator

Multi-goal prompts auto-decompose into parallel sub-agents. Up to 5 agents run simultaneously with file locking.

nex-code --task "fix type errors in src/, add JSDoc to utils/, update CHANGELOG"
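The combination of a concurrency cap and file locking can be sketched as a small scheduler: a job starts only when a slot is free and none of its files are held by a running job. Everything here (`runAgents`, the `{ files, run }` job shape) is a hypothetical illustration of the idea, not the orchestrator's actual code.

```javascript
// Hypothetical sketch: run jobs with at most `limit` in flight, never two on the same file.
// Each job is assumed to look like { files: string[], run: async () => any }.
function runAgents(jobs, limit = 5) {
  const locked = new Set();  // files held by running jobs
  const pending = [...jobs];
  const results = [];        // collected in completion order
  let active = 0;

  return new Promise((resolve, reject) => {
    const schedule = () => {
      if (pending.length === 0 && active === 0) return resolve(results);
      for (let i = 0; i < pending.length && active < limit; ) {
        const job = pending[i];
        if (job.files.some((f) => locked.has(f))) { i++; continue; } // file busy, try next job
        pending.splice(i, 1);
        job.files.forEach((f) => locked.add(f));
        active++;
        job.run().then((value) => {
          results.push(value);
          job.files.forEach((f) => locked.delete(f)); // release locks, then reschedule
          active--;
          schedule();
        }, reject);
      }
    };
    schedule();
  });
}
```

Because a job takes all of its locks at once before starting, two jobs can never each hold a file the other is waiting for, so the scheduler cannot deadlock.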

Autoresearch

Autonomous optimization loops: edit -> experiment -> keep/revert, on a dedicated branch.

/autoresearch reduce test runtime while maintaining correctness
/ar-self-improve          # self-improvement using nex-code's benchmark
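The edit -> experiment -> keep/revert cycle is essentially a greedy search. The sketch below shows that loop in isolation, assuming a lower score is better (for example, test runtime). `optimize`, `proposeEdit`, and `measure` are hypothetical names; the real feature also works on a dedicated git branch, which this sketch leaves out.

```javascript
// Hypothetical sketch of a keep/revert optimization loop (lower score is better).
function optimize(baseline, proposeEdit, measure, rounds) {
  let best = baseline;
  let bestScore = measure(best);
  const log = [];
  for (let round = 0; round < rounds; round++) {
    const candidate = proposeEdit(best, round); // edit
    const score = measure(candidate);           // experiment
    const kept = score < bestScore;             // keep only strict improvements
    log.push({ round, score, kept });
    if (kept) {
      best = candidate;
      bestScore = score;
    } // otherwise the candidate is discarded, which is the "revert" step
  }
  return { best, bestScore, log };
}
```

In practice `measure` would also need to fail any candidate that breaks the test suite, so that a faster but incorrect change is never kept.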

Plan Mode

Auto-activates for implementation tasks. Read-only analysis first, approve before writes. Hard-enforced tool restrictions.

Daemon / Watch Mode

Background process that fires tasks on file changes, git commits, or cron schedule. Configured via .nex/daemon.json. Desktop and Matrix notifications.

Session Trees

Navigate conversation history like git branches — fork, switch, goto, delete branches.
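A conversation tree of this kind can be modelled much like git: messages are nodes with a parent pointer, and branches are named pointers to a head node. The class below is a hypothetical sketch of that data structure, not the contents of session-tree.js.

```javascript
// Hypothetical sketch: messages as parent-linked nodes, branches as named head pointers.
class SessionTree {
  constructor() {
    this.nodes = new Map([[0, { id: 0, parent: null, message: null }]]); // node 0 is the root
    this.branches = new Map([["main", 0]]);
    this.current = "main";
    this.nextId = 1;
  }
  append(message) {
    const id = this.nextId++;
    this.nodes.set(id, { id, parent: this.branches.get(this.current), message });
    this.branches.set(this.current, id); // advance the current branch head
    return id;
  }
  fork(name, fromId = this.branches.get(this.current)) {
    if (this.branches.has(name)) throw new Error(`branch exists: ${name}`);
    if (!this.nodes.has(fromId)) throw new Error(`unknown node: ${fromId}`);
    this.branches.set(name, fromId);
    this.current = name;
  }
  switchBranch(name) {
    if (!this.branches.has(name)) throw new Error(`unknown branch: ${name}`);
    this.current = name;
  }
  history() {
    // Walk parent links from the branch head back to the root.
    const out = [];
    for (let id = this.branches.get(this.current); id !== 0; id = this.nodes.get(id).parent) {
      out.unshift(this.nodes.get(id).message);
    }
    return out;
  }
}
```

Forking only adds a branch pointer, so earlier messages are shared between branches rather than copied.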

Safety

Layer	What it guards	Bypass?
Forbidden patterns	`rm -rf /`, fork bombs, reverse shells, `cat .env`	No
Protected paths	Destructive ops on `.env`, `.ssh/`, `.aws/`, `.git/`	`NEX_UNPROTECT=1`
Sensitive file tools	read/write/edit on `.env`, `.ssh/`, `.npmrc`, `.kube/`	No
Critical commands	`rm -rf`, `sudo`, `git push --force`, `git reset --hard`	Explicit confirmation
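The layering in the table (hard block first, then confirmation, then allow) can be sketched as an ordered check. The regular expressions below are simplified stand-ins written for this example and cover only a few of the listed cases; they are not nex-code's actual patterns.

```javascript
// Illustrative patterns only; the real forbidden/critical lists are not shown in this README.
const FORBIDDEN = [/\brm\s+-rf\s+\/(\s|$)/, /\bcat\s+\.env\b/];
const CRITICAL = [/\brm\s+-rf\b/, /\bsudo\b/, /\bgit\s+push\s+--force\b/, /\bgit\s+reset\s+--hard\b/];

// Hypothetical helper: forbidden patterns win over critical ones.
function classify(command) {
  if (FORBIDDEN.some((re) => re.test(command))) return "block";
  if (CRITICAL.some((re) => re.test(command))) return "confirm";
  return "allow";
}

console.log(classify("rm -rf /"));         // block
console.log(classify("rm -rf /tmp/build")); // confirm
console.log(classify("ls -la"));           // allow
```

Checking the forbidden list before the critical list matters: `rm -rf /` matches both, and the non-bypassable layer has to win.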

Pre-push secret detection, audit logging (JSONL), persistent undo/redo, cost limits, auto plan mode.

Open-Source Model Robustness

5-layer argument parsing — JSON, trailing fix, extraction, key repair, fence stripping
Tool call retry with schema hints — malformed args get the expected schema for self-correction
Auto-fix engine — path resolution, edit fuzzy matching (Levenshtein), bash error hints
Tool tiers — essential (5) / standard (21) / full (45), auto-selected per model capability
Stale stream recovery — progressive retry with context compression on stall
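The layered argument parsing in the first item can be approximated as a chain of increasingly aggressive repairs, retrying `JSON.parse` after each one. The sketch follows the five layers in the order listed, but the regular expressions are naive assumptions written for this example (they can mangle string values that happen to contain similar text) and `parseToolArgs` is a hypothetical name.

```javascript
// Hypothetical repair steps, applied cumulatively in the order the README lists them.
const repairSteps = [
  (s) => s,                                                        // 1. plain JSON
  (s) => s.replace(/,\s*([}\]])/g, "$1"),                          // 2. drop trailing commas
  (s) => {                                                         // 3. extract the outermost {...}
    const start = s.indexOf("{");
    const end = s.lastIndexOf("}");
    return start >= 0 && end > start ? s.slice(start, end + 1) : s;
  },
  (s) => s.replace(/([{,]\s*)([A-Za-z_]\w*)(\s*:)/g, '$1"$2"$3'),  // 4. quote bare keys
  (s) => s.replace(/```(?:json)?/gi, "").trim(),                   // 5. strip code fences
];

function parseToolArgs(raw) {
  let text = raw.trim();
  for (const step of repairSteps) {
    text = step(text);
    try {
      return JSON.parse(text);
    } catch {
      // Not valid yet: fall through to the next repair.
    }
  }
  return null; // caller can retry the tool call with a schema hint
}

console.log(parseToolArgs('Sure! {path: "a.js",} done')); // { path: 'a.js' }
```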

Visual Development Tools

Pixel-level before/after comparison, responsive sweeps (320-1920px), annotation overlays, design token extraction, and live-reload diff watching. Pure image tools work standalone; browser-based tools need Playwright.

Extensibility

Skills

Drop .md or .js files in .nex/skills/ for project-specific knowledge, commands, and tools. Global skills in ~/.nex-code/skills/. Install from git: /install-skill user/repo.

Plugins

Custom tools and lifecycle hooks via .nex/plugins/. Events: onToolResult, onModelResponse, onSessionStart, onSessionEnd, onFileChange, beforeToolExec, afterToolExec.

MCP

Connect external tool servers via Model Context Protocol. Configure in .nex/mcp.json with env var interpolation.
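Env var interpolation usually means replacing placeholders in the config with values from the environment at load time. The sketch below assumes a `${NAME}` placeholder syntax and an arbitrary nested config object; both are assumptions, since the exact syntax and schema of .nex/mcp.json are not shown here, and `interpolateEnv` is a hypothetical helper.

```javascript
// Hypothetical sketch: recursively replace ${NAME} placeholders in a parsed config.
function interpolateEnv(value, env = process.env) {
  if (typeof value === "string") {
    return value.replace(/\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g, (match, name) => {
      if (env[name] === undefined) throw new Error(`missing environment variable: ${name}`);
      return env[name];
    });
  }
  if (Array.isArray(value)) return value.map((item) => interpolateEnv(item, env));
  if (value && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value).map(([key, item]) => [key, interpolateEnv(item, env)])
    );
  }
  return value; // numbers, booleans, null pass through unchanged
}

// Made-up config shape, for illustration only.
const exampleConfig = { server: { args: ["--token", "${EXAMPLE_TOKEN}"], port: 8080 } };
console.log(interpolateEnv(exampleConfig, { EXAMPLE_TOKEN: "abc123" }));
```

Failing loudly on a missing variable avoids starting a server with an empty credential.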

Hooks

Run custom scripts on CLI events (pre-tool, post-tool, pre-commit, post-response, session-start, session-end). Configure in .nex/config.json or .nex/hooks/.

VS Code Extension

Built-in sidebar chat panel (vscode/) with streaming output, collapsible tool cards, and native theme support. Spawns nex-code --server over JSON-lines IPC.

cd vscode && npm install && npm run package
# Cmd+Shift+P -> Extensions: Install from VSIX...

Architecture

bin/nex-code.js          # Entrypoint
cli/
  agent.js               # Agentic loop + conversation state + guards
  providers/             # Ollama, OpenAI, Anthropic, Gemini, Local + wire protocols
  tools/index.js         # 45 tool definitions + auto-fix engine
  context-engine.js      # Token management + 5-phase compression
  sub-agent.js           # Parallel sub-agents with file locking
  orchestrator.js        # Multi-agent decompose -> execute -> synthesize
  session-tree.js        # Session branching
  visual.js              # Visual dev tools (pixelmatch-based)
  browser.js             # Playwright browser agent
  skills/                # Built-in + user skills

See DEVELOPMENT.md for full architecture details.

Testing

npm test              # 97 suites, 3920 tests
npm run typecheck     # TypeScript noEmit check
npm run benchmark:gate        # 7-task smoke test (blocks push on regression)
npm run benchmark:reallife    # 35 real-world tasks across 7 categories

Security

Pre-push secret detection (API keys, private keys, hardcoded credentials)
Audit logging with automatic argument sanitization
Sensitive path blocking (.ssh/, .aws/, .env, credentials)
Shell injection protection via execFileSync with argument arrays
SSRF protection on web_fetch
MCP environment isolation
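To illustrate why `execFileSync` with an argument array resists shell injection, the sketch below passes a string containing shell metacharacters to a child process. Because no shell is started, the string arrives as one literal argument. The helper `echoViaChild` is made up for this demonstration and uses the current Node binary as the child program.

```javascript
const { execFileSync } = require("node:child_process");

// Demonstration helper (hypothetical): the child prints its last argument verbatim.
function echoViaChild(untrusted) {
  return execFileSync(
    process.execPath, // the running Node binary
    ["-e", "process.stdout.write(process.argv[process.argv.length - 1])", untrusted],
    { encoding: "utf8" }
  );
}

// The metacharacters are not interpreted; no second command runs.
console.log(echoViaChild("hello; echo injected")); // hello; echo injected
```

The same input interpolated into a string for `execSync` would be handed to a shell, which would run `echo injected` as a separate command.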

Reporting vulnerabilities: Email security@schoensgibl.com (not a public issue). Allow 72h for initial response.

License

MIT
