hybridpicker/nex-code

nex-code

Run 400B+ open coding models on your codebase — without the hardware bill.
Ollama Cloud first. OpenAI, Anthropic, and Gemini when you need them.

npx nex-code

If this saves you time, a star helps others find it.

npm version npm downloads GitHub Stars CI License: MIT Ollama Cloud: supported Node >= 18 Dependencies: 2 Tests: 3920 VS Code extension


Demo

nex-code-demo-0-4-22.mov

Quickstart

npx nex-code
# or install globally:
npm install -g nex-code && cd ~/your-project && nex-code

On first launch, an interactive setup wizard guides you through provider and credential configuration. Re-run anytime with /setup.


Why nex-code?

Ollama Cloud first. Built and optimized for Ollama Cloud — the flat-rate platform running devstral, Kimi K2, Qwen3-Coder, and 47+ models. Other providers (OpenAI, Anthropic, Gemini) work via the same interface.

| Feature | nex-code | Closed-source alternatives |
| --- | --- | --- |
| Free tier | Ollama Cloud flat-rate | subscription or limited quota |
| Open models | devstral, Kimi K2, Qwen3 | vendor-locked |
| Local Ollama | yes | no |
| Multi-provider | swap with one env var | no |
| VS Code sidebar | built-in | partial |
| Startup time | ~100ms | 1-4s |
| Runtime deps | 2 | heavy |
| Infra tools | SSH, Docker, K8s built-in | no |

Smart model routing. The built-in /benchmark tests all configured models across 62 tool-calling tasks in 5 categories and auto-routes to the best model per task type.
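
As a rough illustration of per-task routing, the sketch below picks the highest-scoring model for each category from a benchmark result map. The category names, model names, and scores are made up for the example, and `buildRoutingTable` is a hypothetical helper, not nex-code's actual router.

```javascript
// Hypothetical benchmark results: model -> category -> score (0-100).
// Names and numbers are illustrative only, not nex-code's real data.
const results = {
  "model-a": { coding: 81, sysadmin: 70 },
  "model-b": { coding: 74, sysadmin: 88 },
};

// For every category, keep the model with the highest score.
function buildRoutingTable(scoresByModel) {
  const table = {};
  for (const [model, scores] of Object.entries(scoresByModel)) {
    for (const [category, score] of Object.entries(scores)) {
      if (!table[category] || score > table[category].score) {
        table[category] = { model, score };
      }
    }
  }
  return table;
}

const routing = buildRoutingTable(results);
console.log(routing.coding.model);   // model-a
console.log(routing.sysadmin.model); // model-b
```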

Phase-based execution. Tasks run through Plan (analyze) -> Implement (code) -> Verify (test) phases, each with the optimal model. Auto-loops back on test failures.
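
The Plan -> Implement -> Verify loop can be pictured roughly as follows. This is a minimal sketch under stated assumptions: `runTask` and the injected `runPhase` callback are hypothetical names, and the retry limit of 3 is an arbitrary choice, not a documented nex-code setting.

```javascript
// Hypothetical phase runner. `runPhase(phase, context)` stands in for a model
// call and is injected so the sketch can run without any provider.
function runTask(task, runPhase, maxLoops = 3) {
  const plan = runPhase("plan", { task });
  let feedback = null;
  for (let attempt = 1; attempt <= maxLoops; attempt++) {
    const change = runPhase("implement", { task, plan, feedback });
    const verdict = runPhase("verify", { task, change });
    if (verdict.passed) return { ok: true, attempts: attempt };
    feedback = verdict.failures; // loop back with the failing tests as context
  }
  return { ok: false, attempts: maxLoops };
}

// Stub phases: verification fails once, then passes.
let verifyCalls = 0;
const stubPhase = (phase) => {
  if (phase === "plan") return ["step 1"];
  if (phase === "implement") return "diff";
  verifyCalls += 1;
  return verifyCalls < 2 ? { passed: false, failures: ["test_x"] } : { passed: true };
};

const outcome = runTask("fix bug", stubPhase);
console.log(outcome); // { ok: true, attempts: 2 }
```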

45 built-in tools across file ops, git, SSH, Docker, Kubernetes, deploy, browser, GitHub Actions, and visual review. See Tools for the full list.

2 runtime dependencies (axios, dotenv). Starts in ~100ms. No Python, no heavy runtime.


Ollama Cloud Model Rankings

Rankings from nex-code's own /benchmark — 62 tasks testing tool selection, argument validity, and schema compliance.

| Rank | Model | Score | Avg Latency | Context | Best For |
| --- | --- | --- | --- | --- | --- |
| 🥇 | qwen3-vl:235b | 79 | 12.4s | 131K | Overall #1: frontier tool selection, data + agentic tasks |
| 🥈 | qwen3-vl:235b-instruct | 78.2 | 5.3s | 131K | Best latency/score balance, recommended default |
| 🥉 | nemotron-3-super | 78.1 | 3.5s | 256K | |
| | rnj-1:8b | 77.4 | 3.9s | 131K | |
| | mistral-large-3:675b | 76.5 | 3.9s | 131K | |
| | gpt-oss:20b | 76.5 | 1.9s | 131K | Fast small model, good overall score |
| | qwen3-coder-next | 75.7 | 2.2s | 256K | |
| | qwen3-next:80b | 75.1 | 11.1s | 131K | |
| | ministral-3:8b | 73.8 | 2.0s | 131K | Fastest strong model: 2.2s latency, 70+ score |
| | deepseek-v3.1:671b | 73.6 | 2.9s | 131K | |
| | devstral-2:123b | 73.2 | 2.0s | 131K | Sysadmin + SSH tasks, reliable coding |
| | kimi-k2:1t | 72.2 | 5.6s | 256K | Large repos (>100K tokens) |
| | ministral-3:3b | 72 | 1.6s | 32K | |
| | devstral-small-2:24b | 71.7 | 2.6s | 131K | Fast sub-agents, simple lookups |
| | qwen3.5:397b | 70.7 | 4.2s | 256K | |
| | qwen3-coder:480b | 70.1 | 6.0s | 131K | Heavy coding sessions, large context |
| | minimax-m2.1 | 69.9 | 3.0s | 200K | |
| | gemma4:31b | 69.3 | 2.8s | ? | |
| | glm-4.7 | 69.1 | 5.3s | 131K | |
| | kimi-k2-thinking | 69 | 3.1s | 256K | |
| | ministral-3:14b | 68.8 | 2.0s | 131K | |
| | kimi-k2.5 | 68.7 | 3.4s | 256K | Large repos, faster than k2:1t |
| | minimax-m2.7 | 68.4 | 5.5s | 200K | |
| | glm-4.6 | 67.8 | 4.7s | 131K | |
| | glm-5 | 67.4 | 5.0s | 131K | |
| | gpt-oss:120b | 64.8 | 3.4s | 131K | |
| | nemotron-3-nano:30b | 64.7 | 2.3s | 131K | |
| | minimax-m2.5 | 61.9 | 2.7s | 131K | Multi-agent, large context |
| | minimax-m2 | 60.6 | 4.3s | 200K | |

Rankings are nex-code-specific: they measure tool name accuracy, argument validity, and schema compliance. Toolathon (MiniMax SOTA) measures different task types. Run /benchmark --discover after model releases.

Recommended .env:

DEFAULT_PROVIDER=ollama
DEFAULT_MODEL=devstral-2:123b
NEX_HEAVY_MODEL=qwen3-coder:480b
NEX_STANDARD_MODEL=devstral-2:123b
NEX_FAST_MODEL=devstral-small-2:24b

Setup

Prerequisites: Node.js 18+ and at least one API key (or local Ollama).

# .env (or set environment variables)
OLLAMA_API_KEY=your-key       # Ollama Cloud
OPENAI_API_KEY=your-key       # OpenAI
ANTHROPIC_API_KEY=your-key    # Anthropic
GEMINI_API_KEY=your-key       # Gemini
PERPLEXITY_API_KEY=your-key   # optional — enables grounded web search

DEFAULT_PROVIDER=ollama
DEFAULT_MODEL=devstral-2:123b

Install from source:

git clone https://github.com/hybridpicker/nex-code.git
cd nex-code && npm install && npm run build
cp .env.example .env && npm link && npm run install-hooks

Usage

> explain the main function in index.js
> add input validation to the createUser handler
> run the tests and fix any failures
> the /users endpoint returns 500 — find the bug and fix it

YOLO Mode

Skip all confirmations — file changes, dangerous commands, and tool permissions are auto-approved. Auto-runs caffeinate on macOS.

nex-code --yolo

Headless / Programmatic Mode

nex-code --task "refactor src/index.js to async/await" --yolo
nex-code --prompt-file /tmp/task.txt --yolo --json
nex-code --daemon          # watch mode: fires tasks on file changes, git commits, or cron

| Flag | Description |
| --- | --- |
| --task <prompt> | Run a single prompt and exit |
| --prompt-file <path> | Read prompt from file |
| --yolo | Skip all confirmations |
| --server | JSON-lines IPC server (VS Code extension) |
| --daemon | Background watcher (reads .nex/daemon.json) |
| --flatrate | 100 turns, 6 parallel agents, 5 retries |
| --json | JSON output to stdout |
| --max-turns <n> | Override agentic loop limit |
| --model <spec> | Use specific model (e.g. anthropic:claude-sonnet-4-6) |
| --debug | Show diagnostic messages |
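
For scripted use, the flags above can be assembled from Node. The sketch below is one possible wrapper, not part of nex-code: `buildArgs` and `runHeadless` are hypothetical helpers, only flags listed in the table are used, and the structure of the --json output is not described here, so stdout is returned unparsed.

```javascript
const { spawnSync } = require("node:child_process");

// Build an argv array from options, using only flags listed in the table above.
function buildArgs({ task, promptFile, yolo, json, maxTurns, model }) {
  const args = [];
  if (task) args.push("--task", task);
  if (promptFile) args.push("--prompt-file", promptFile);
  if (yolo) args.push("--yolo");
  if (json) args.push("--json");
  if (maxTurns) args.push("--max-turns", String(maxTurns));
  if (model) args.push("--model", model);
  return args;
}

// Run nex-code once and hand back the exit status and raw stdout.
// Assumes `nex-code` is on PATH; parsing of stdout is left to the caller.
function runHeadless(options) {
  const result = spawnSync("nex-code", buildArgs(options), { encoding: "utf8" });
  if (result.error) throw result.error;
  return { status: result.status, stdout: result.stdout };
}

console.log(buildArgs({ task: "refactor src/index.js to async/await", yolo: true }));
```

Passing arguments as an array avoids shell quoting issues when the prompt contains spaces or quotes.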

Vision / Screenshot

> /path/to/screenshot.png implement this UI in React
> analyze https://example.com/mockup.png and implement it
> what's wrong with the layout in my clipboard    # macOS clipboard capture
> screenshot localhost:3000 and review the navbar spacing

Works with Anthropic, OpenAI, Gemini, and Ollama vision models. Formats: PNG, JPG, GIF, WebP, BMP.


Providers & Models

/model                         # interactive picker
/model openai:gpt-4o           # switch directly
/providers                     # list all
/fallback anthropic,openai     # auto-switch on failure

| Provider | Models | Env Variable |
| --- | --- | --- |
| ollama | Qwen3, DeepSeek R1, Devstral, Kimi K2, MiniMax, GLM, Llama 4 | OLLAMA_API_KEY |
| openai | GPT-4o, GPT-4.1, o1, o3, o4-mini | OPENAI_API_KEY |
| anthropic | Claude Opus 4.6, Sonnet 4.6, Haiku 4.5 | ANTHROPIC_API_KEY |
| gemini | Gemini 3.1 Pro, 2.5 Pro/Flash | GEMINI_API_KEY |
| local | Any local Ollama model | (none) |
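
The idea behind /fallback (auto-switch on failure) can be sketched as a simple ordered retry. This is an assumption about the general pattern, not nex-code's implementation; `withFallback` and the `callProvider` callback are hypothetical.

```javascript
// Try each provider in order and return the first successful reply.
// `callProvider(name, prompt)` is a hypothetical async function supplied by the caller.
async function withFallback(providers, prompt, callProvider) {
  const errors = [];
  for (const name of providers) {
    try {
      return { provider: name, reply: await callProvider(name, prompt) };
    } catch (err) {
      errors.push(`${name}: ${err.message}`); // remember why this provider failed
    }
  }
  throw new Error(`All providers failed (${errors.join("; ")})`);
}

// Stub: the first provider is down, the second answers.
const flaky = async (name) => {
  if (name === "ollama") throw new Error("unreachable");
  return "ok";
};

withFallback(["ollama", "anthropic"], "hello", flaky)
  .then((result) => console.log(result.provider)); // anthropic
```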

Commands

Type / to see inline suggestions. Tab completion for slash commands and file paths.

| Command | Description |
| --- | --- |
| /help | Full help |
| /model [spec] | Show/switch model |
| /providers | List providers |
| /clear | Clear conversation |
| /save / /load / /sessions / /resume | Session management |
| /branches / /fork / /switch-branch / /goto | Session tree navigation |
| /remember / /forget / /memory | Persistent memory |
| /brain add\|list\|search\|show\|remove | Knowledge base |
| /plan [task] / /plan edit / /plan approve | Plan mode |
| /commit [msg] / /diff / /branch | Git intelligence |
| /undo / /redo / /history | Persistent undo/redo |
| /snapshot [name] / /restore | Git snapshots |
| /permissions / /allow / /deny | Tool permissions |
| /costs / /budget | Cost tracking and limits |
| /review [--strict] | Deep code review |
| /benchmark | Model ranking (62 tasks) |
| /autoresearch / /ar-self-improve | Autonomous optimization loops |
| /servers / /docker / /deploy / /k8s | Infrastructure management |
| /skills / /install-skill / /mcp / /hooks | Extensibility |
| /tree [depth] | Project file tree |
| /audit | Tool execution audit |
| /setup | Interactive setup wizard |

Tools

45 built-in tools organized by category:

Core: bash, read_file, write_file, edit_file, patch_file, list_directory, search_files, glob, grep

Git & Web: git_status, git_diff, git_log, web_fetch, web_search

Agents: ask_user, task_list, spawn_agents, switch_model

Browser (optional, requires Playwright): browser_open, browser_screenshot, browser_click, browser_fill

GitHub Actions & K8s: gh_run_list, gh_run_view, gh_workflow_trigger, k8s_pods, k8s_logs, k8s_exec, k8s_apply, k8s_rollout

SSH & Server: ssh_exec, ssh_upload, ssh_download, service_manage, service_logs, sysadmin, remote_agent

Docker: container_list, container_logs, container_exec, container_manage

Deploy: deploy, deployment_status

Frontend: frontend_recon — scans design tokens, layout, framework stack before any frontend work

Visual: visual_diff, responsive_sweep, visual_annotate, visual_watch, design_tokens, design_compare

Additional tools via MCP servers or Skills.


Key Features

Multi-Agent Orchestrator

Multi-goal prompts auto-decompose into parallel sub-agents. Up to 5 agents run simultaneously with file locking.

nex-code --task "fix type errors in src/, add JSDoc to utils/, update CHANGELOG"
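
To make "parallel sub-agents with file locking" concrete, here is a small scheduler sketch: at most `limit` jobs run at once, and two jobs never touch the same file at the same time. `runAgents` and the job shape `{ file, run }` are hypothetical and only illustrate the technique.

```javascript
// Run async jobs with at most `limit` in flight, never two on the same file.
// Each job is { file, run }, where `run` is a hypothetical async sub-agent.
function runAgents(jobs, limit = 5) {
  const locked = new Set(); // files currently being edited
  const pending = [...jobs];
  const results = [];
  let active = 0;

  return new Promise((resolve, reject) => {
    const schedule = () => {
      if (pending.length === 0 && active === 0) return resolve(results);
      for (let i = 0; i < pending.length && active < limit; ) {
        const job = pending[i];
        if (locked.has(job.file)) { i++; continue; } // file busy, try the next job
        pending.splice(i, 1); // the next job shifts into index i
        locked.add(job.file);
        active++;
        job.run()
          .then((value) => results.push({ file: job.file, value }))
          .catch(reject)
          .finally(() => { locked.delete(job.file); active--; schedule(); });
      }
    };
    schedule();
  });
}

// Fake agents that just wait; two of them target the same file.
const fakeAgent = (file, ms) => ({
  file,
  run: () => new Promise((done) => setTimeout(() => done(`${file} ok`), ms)),
});

runAgents([fakeAgent("src/a.js", 20), fakeAgent("src/a.js", 10), fakeAgent("CHANGELOG.md", 5)], 2)
  .then((results) => console.log(results.map((r) => r.value)));
```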

Autoresearch

Autonomous optimization loops: edit -> experiment -> keep/revert, on a dedicated branch.

/autoresearch reduce test runtime while maintaining correctness
/ar-self-improve          # self-improvement using nex-code's benchmark
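
The edit -> experiment -> keep/revert cycle is a form of greedy hill climbing. The sketch below shows that pattern with injected callbacks; `optimize`, `propose`, `apply`, `revert`, and `measure` are hypothetical names (in practice they might map to a model edit, a commit, a checkout, and a timed test run). Lower scores count as better here, as with test runtime.

```javascript
// Keep a candidate change only if the measured metric improves; otherwise revert it.
function optimize({ propose, apply, revert, measure, iterations }) {
  let best = measure(); // baseline before any change
  const log = [];
  for (let i = 0; i < iterations; i++) {
    const change = propose(i);
    apply(change);
    const score = measure();
    if (score < best) {
      best = score;
      log.push({ change, kept: true });
    } else {
      revert(change);
      log.push({ change, kept: false });
    }
  }
  return { best, log };
}

// Toy experiment: the "codebase" is a single number standing in for runtime.
let runtime = 10;
const deltas = [-2, 3, -1];
const summary = optimize({
  propose: (i) => deltas[i],
  apply: (d) => { runtime += d; },
  revert: (d) => { runtime -= d; },
  measure: () => runtime,
  iterations: deltas.length,
});
console.log(summary.best);                  // 7
console.log(summary.log.map((e) => e.kept)); // [ true, false, true ]
```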

Plan Mode

Auto-activates for implementation tasks. Read-only analysis first, approve before writes. Hard-enforced tool restrictions.

Daemon / Watch Mode

Background process that fires tasks on file changes, git commits, or cron schedule. Configured via .nex/daemon.json. Desktop and Matrix notifications.

Session Trees

Navigate conversation history like git branches — fork, switch, goto, delete branches.

Safety

| Layer | What it guards | Bypass? |
| --- | --- | --- |
| Forbidden patterns | rm -rf /, fork bombs, reverse shells, cat .env | No |
| Protected paths | Destructive ops on .env, .ssh/, .aws/, .git/ | NEX_UNPROTECT=1 |
| Sensitive file tools | read/write/edit on .env, .ssh/, .npmrc, .kube/ | No |
| Critical commands | rm -rf, sudo, git push --force, git reset --hard | Explicit confirmation |
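
A forbidden-pattern layer like the first row can be approximated with a list of regular expressions checked before a command runs. The patterns below are illustrative guesses covering three of the listed cases; they are not nex-code's actual rules, and `checkCommand` is a hypothetical helper.

```javascript
// Illustrative deny-list; real rules would need to be far more thorough.
const FORBIDDEN = [
  { name: "recursive delete of /", pattern: /\brm\s+-rf\s+\/(\s|$)/ },
  { name: "fork bomb", pattern: /:\(\)\s*\{\s*:\s*\|\s*:\s*&\s*\}\s*;\s*:/ },
  { name: "reading .env", pattern: /\bcat\s+(\S*\/)?\.env\b/ },
];

// Return the name of the first rule a command violates, or null if none match.
function checkCommand(command) {
  const hit = FORBIDDEN.find((rule) => rule.pattern.test(command));
  return hit ? hit.name : null;
}

console.log(checkCommand("rm -rf /"));       // recursive delete of /
console.log(checkCommand("rm -rf ./build")); // null
console.log(checkCommand("cat config/.env")); // reading .env
```

Pattern matching alone is easy to evade (aliases, encodings, indirection), which is presumably why the table pairs it with path and tool-level protections.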

Pre-push secret detection, audit logging (JSONL), persistent undo/redo, cost limits, auto plan mode.

Open-Source Model Robustness

  • 5-layer argument parsing: JSON, trailing fix, extraction, key repair, fence stripping
  • Tool call retry with schema hints: malformed args get the expected schema for self-correction
  • Auto-fix engine: path resolution, edit fuzzy matching (Levenshtein), bash error hints
  • Tool tiers: essential (5) / standard (21) / full (45), auto-selected per model capability
  • Stale stream recovery: progressive retry with context compression on stall
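
The layered argument parsing in the first bullet might look something like the sketch below: each layer applies one repair and the parser retries JSON.parse after every step. The layer names follow the list above, but the regular expressions and the `parseToolArgs` function are this example's own guesses, not the project's code.

```javascript
// Each layer applies one more repair to the text; order follows the list above.
const layers = [
  (s) => s, // 1. plain JSON
  (s) => s.replace(/,\s*([}\]])/g, "$1"), // 2. trailing-comma fix
  (s) => { // 3. extract the outermost {...} from surrounding prose
    const start = s.indexOf("{");
    const end = s.lastIndexOf("}");
    return start !== -1 && end > start ? s.slice(start, end + 1) : s;
  },
  (s) => s.replace(/([{,]\s*)([A-Za-z_]\w*)(\s*:)/g, '$1"$2"$3'), // 4. quote bare keys
  (s) => s.replace(/^\s*```[a-z]*\s*|\s*```\s*$/gi, ""), // 5. strip code fences
];

// Return the parsed arguments, or null if no layer produces valid JSON.
function parseToolArgs(raw) {
  let text = raw;
  for (const layer of layers) {
    text = layer(text);
    try {
      return JSON.parse(text);
    } catch {
      // fall through to the next, more aggressive layer
    }
  }
  return null;
}

console.log(parseToolArgs('{path: "a.js",}'));            // { path: 'a.js' }
console.log(parseToolArgs('Sure! {"path": "a.js"} done')); // { path: 'a.js' }
console.log(parseToolArgs("not json at all"));            // null
```

Because valid JSON returns at the first layer, the riskier rewrites (such as key quoting) only ever touch input that already failed to parse.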

Visual Development Tools

Pixel-level before/after comparison, responsive sweeps (320-1920px), annotation overlays, design token extraction, and live-reload diff watching. Pure image tools work standalone; browser-based tools need Playwright.


Extensibility

Skills

Drop .md or .js files in .nex/skills/ for project-specific knowledge, commands, and tools. Global skills in ~/.nex-code/skills/. Install from git: /install-skill user/repo.

Plugins

Custom tools and lifecycle hooks via .nex/plugins/. Events: onToolResult, onModelResponse, onSessionStart, onSessionEnd, onFileChange, beforeToolExec, afterToolExec.

MCP

Connect external tool servers via Model Context Protocol. Configure in .nex/mcp.json with env var interpolation.
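
Env var interpolation in a config file usually means replacing placeholders with values from the environment at load time. The sketch below assumes a `${NAME}` placeholder syntax and an invented config shape; neither is confirmed by this README, so check the project docs for the real format of .nex/mcp.json.

```javascript
// Recursively replace ${NAME} placeholders in strings with values from `env`.
// Unknown variables are left untouched so the problem stays visible.
function interpolateEnv(value, env = process.env) {
  if (typeof value === "string") {
    return value.replace(/\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g, (match, name) =>
      Object.prototype.hasOwnProperty.call(env, name) ? env[name] : match
    );
  }
  if (Array.isArray(value)) return value.map((item) => interpolateEnv(item, env));
  if (value && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value).map(([key, item]) => [key, interpolateEnv(item, env)])
    );
  }
  return value;
}

// Hypothetical config fragment; field names are made up for the example.
const rawConfig = { env: { TOKEN: "${API_TOKEN}" }, args: ["--port", "${PORT}"] };
const resolved = interpolateEnv(rawConfig, { API_TOKEN: "abc123" });
console.log(resolved.env.TOKEN); // abc123
console.log(resolved.args[1]);   // ${PORT}
```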

Hooks

Run custom scripts on CLI events (pre-tool, post-tool, pre-commit, post-response, session-start, session-end). Configure in .nex/config.json or .nex/hooks/.


VS Code Extension

Built-in sidebar chat panel (vscode/) with streaming output, collapsible tool cards, and native theme support. Spawns nex-code --server over JSON-lines IPC.

cd vscode && npm install && npm run package
# Cmd+Shift+P -> Extensions: Install from VSIX...
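
JSON-lines IPC means one JSON object per newline-terminated line. A client reading from the server process has to buffer partial chunks until a full line arrives, roughly as in this sketch. `createJsonLinesParser` is a hypothetical helper and the message fields (`type`, `text`) are invented for the example; the real protocol may differ.

```javascript
// Incrementally split a text stream into JSON-lines messages.
function createJsonLinesParser(onMessage) {
  let buffer = "";
  return function feed(chunk) {
    buffer += chunk;
    let newline;
    while ((newline = buffer.indexOf("\n")) !== -1) {
      const line = buffer.slice(0, newline).trim();
      buffer = buffer.slice(newline + 1); // keep any partial line for the next chunk
      if (line) onMessage(JSON.parse(line));
    }
  };
}

// Chunks may split a message anywhere, as stdout pipes often do.
const received = [];
const feed = createJsonLinesParser((message) => received.push(message));
feed('{"type":"tok');
feed('en","text":"hi"}\n{"type":"done"}\n');
console.log(received.length);  // 2
console.log(received[0].text); // hi
```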

Architecture

bin/nex-code.js          # Entrypoint
cli/
  agent.js               # Agentic loop + conversation state + guards
  providers/             # Ollama, OpenAI, Anthropic, Gemini, Local + wire protocols
  tools/index.js         # 45 tool definitions + auto-fix engine
  context-engine.js      # Token management + 5-phase compression
  sub-agent.js           # Parallel sub-agents with file locking
  orchestrator.js        # Multi-agent decompose -> execute -> synthesize
  session-tree.js        # Session branching
  visual.js              # Visual dev tools (pixelmatch-based)
  browser.js             # Playwright browser agent
  skills/                # Built-in + user skills

See DEVELOPMENT.md for full architecture details.


Testing

npm test              # 97 suites, 3920 tests
npm run typecheck     # TypeScript noEmit check
npm run benchmark:gate        # 7-task smoke test (blocks push on regression)
npm run benchmark:reallife    # 35 real-world tasks across 7 categories

Security

  • Pre-push secret detection (API keys, private keys, hardcoded credentials)
  • Audit logging with automatic argument sanitization
  • Sensitive path blocking (.ssh/, .aws/, .env, credentials)
  • Shell injection protection via execFileSync with argument arrays
  • SSRF protection on web_fetch
  • MCP environment isolation
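
The shell injection point is worth a quick demonstration. Node's `execFileSync` runs a program directly, without a shell, when given an argument array, so metacharacters in untrusted input are never interpreted. The example spawns the current Node binary purely as a stand-in for any external command.

```javascript
const { execFileSync } = require("node:child_process");

// Untrusted value, e.g. a branch name suggested by a model.
const userInput = "main; echo INJECTED";

// No shell is involved, so the whole string reaches the child process as a
// single argument and ";" is treated as ordinary text.
const output = execFileSync(
  process.execPath,
  ["-e", "console.log(process.argv[1])", userInput],
  { encoding: "utf8" }
);

console.log(output.trim()); // main; echo INJECTED
```

Building the same command as one string and passing it to a shell would instead run `echo INJECTED` as a second command.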

Reporting vulnerabilities: Email security@schoensgibl.com (not a public issue). Allow 72h for initial response.


License

MIT
