Code Execution - Overview

What is Code Execution?

The code_execution tool enables LLM agents to orchestrate multiple upstream MCP tools in a single request using JavaScript or TypeScript. Instead of making multiple round-trips to the model, an agent can execute complex multi-step workflows with conditional logic, loops, and data transformations, all within a single execution context.

TypeScript support: Set language: "typescript" to write code with type annotations, interfaces, enums, and generics. Types are automatically stripped before execution with near-zero overhead (<5ms).

When to Use Code Execution

✅ Use code_execution when:

You need to call 2+ tools and combine their results
You need conditional logic based on tool responses
You need to transform or aggregate data from multiple sources
You need to iterate over data and call tools for each item
You need to handle errors gracefully with fallbacks

❌ Don't use code_execution when:

You're calling a single tool (use call_tool directly)
The workflow is simple and linear (use sequential tool calls)
You need long-running operations (the default timeout is 2 minutes; the configurable maximum is 10 minutes)
You need access to filesystem, network, or Node.js modules

Key Benefits

1. Reduced Latency

Execute multiple tool calls in a single request, eliminating the network round-trips between agent and model.

Before (3 round-trips):

Agent → Model: "Get user data"
Model → Agent: call_tool(github, get_user, {username: "octocat"})
Agent → Model: "Here's the user data"
Model → Agent: call_tool(github, list_repos, {user: "octocat"})
Agent → Model: "Here are the repos"
Model → Agent: call_tool(github, get_repo, {repo: "Hello-World"})

After (1 round-trip):

Agent → Model: "Get user and their repos"
Model → Agent: code_execution({code: "...", input: {...}})
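
As a rough sketch, the code payload for this workflow might look like the body of getUserAndFirstRepo below. The call_tool stub and its canned responses are hypothetical stand-ins so the snippet runs outside MCPProxy; inside the sandbox, call_tool and input are provided for you.

```javascript
// Hypothetical stand-in for the sandbox's call_tool(); the canned data is invented.
function call_tool(server, tool, args) {
  const fake = {
    get_user: { login: args.username },
    list_repos: [{ name: 'Hello-World' }],
    get_repo: { name: args.repo },
  };
  return tool in fake
    ? { ok: true, result: fake[tool] }
    : { ok: false, error: { message: 'unknown tool: ' + tool } };
}

// The three tool calls from the "before" trace, combined into one script.
function getUserAndFirstRepo(input) {
  const user = call_tool('github', 'get_user', { username: input.username });
  if (!user.ok) return { error: user.error.message };

  const repos = call_tool('github', 'list_repos', { user: input.username });
  if (!repos.ok) return { error: repos.error.message };
  if (repos.result.length === 0) return { user: user.result, repo: null };

  const repo = call_tool('github', 'get_repo', { repo: repos.result[0].name });
  return { user: user.result, repo: repo.ok ? repo.result : null };
}

const combined = getUserAndFirstRepo({ username: 'octocat' });
```

Each intermediate result stays inside the script, so the model only sees the final combined object.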

2. Complex Logic

Implement conditional branching, loops, and error handling that would require multiple model invocations.

// Conditional logic
const user = call_tool('github', 'get_user', {username: input.username});
if (!user.ok) {
  return {error: 'User not found'};
}

// Loop with accumulation
const results = [];
for (const repoName of input.repos) {
  const repo = call_tool('github', 'get_repo', {name: repoName});
  if (repo.ok) {
    results.push(repo.result);
  }
}
return {repos: results, count: results.length};

3. Data Transformation

Transform, filter, and aggregate data from multiple tool calls before returning results.

const repos = call_tool('github', 'list_repos', {user: input.username});
if (!repos.ok) return repos;

// Filter and transform
const activeRepos = repos.result
  .filter(r => !r.archived && r.pushed_at > input.since)
  .map(r => ({name: r.name, stars: r.stargazers_count, language: r.language}));

return {repos: activeRepos, total: activeRepos.length};

How It Works

Architecture

┌─────────────────────────────────────────────┐
│  LLM Agent                                   │
│  - Receives code_execution tool description │
│  - Writes JavaScript to orchestrate tools   │
└────────────┬────────────────────────────────┘
             │
             │ code_execution request
             │ {code, input, options}
             ▼
┌─────────────────────────────────────────────┐
│  MCPProxy Server                             │
│  - Validates request and options            │
│  - Checks if feature is enabled             │
└────────────┬────────────────────────────────┘
             │
             ▼
┌─────────────────────────────────────────────┐
│  JavaScript Runtime Pool                     │
│  - Acquires VM from pool (blocks if full)   │
│  - Creates isolated sandbox                 │
│  - Binds input global and call_tool()       │
└────────────┬────────────────────────────────┘
             │
             ▼
┌─────────────────────────────────────────────┐
│  JavaScript Execution                        │
│  - Runs code with timeout watchdog          │
│  - Enforces max_tool_calls limit            │
│  - Restricts to allowed_servers             │
│  - Returns JSON-serializable result         │
└────────────┬────────────────────────────────┘
             │
             ▼
┌─────────────────────────────────────────────┐
│  Upstream Tool Calls                         │
│  - Forwards call_tool() to upstream servers │
│  - Respects quarantine and security rules   │
│  - Returns {ok, result} or {ok, error}      │
└─────────────────────────────────────────────┘

Execution Flow

1. Request Parsing: Extract code, input, and options from the request
2. Validation: Verify timeout (1-600000ms) and max_tool_calls (>= 0)
3. Pool Acquisition: Acquire a JavaScript VM from the pool (blocks if all VMs are in use)
4. Sandbox Setup: Create isolated environment with input global and call_tool() function
5. Execution: Run JavaScript with timeout enforcement and tool call tracking
6. Result Extraction: Validate result is JSON-serializable and return structured response
7. Pool Release: Return VM to pool for reuse
8. Response: Return {ok: true, value: <result>} or {ok: false, error: {...}}
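
To make the validation step concrete, here is a minimal sketch of a check that applies the documented ranges and answers in the same {ok, value} / {ok, error} envelope. The function name validateOptions and the error messages are assumptions for illustration; this is not MCPProxy's actual implementation.

```javascript
// Illustrative only: mirrors the documented limits, not MCPProxy's real code.
const MAX_TIMEOUT_MS = 600000; // documented upper bound (10 minutes)

function validateOptions(options = {}) {
  // Defaults follow the documented configuration: 2 minutes, unlimited tool calls.
  const { timeout_ms = 120000, max_tool_calls = 0 } = options;

  if (!Number.isInteger(timeout_ms) || timeout_ms < 1 || timeout_ms > MAX_TIMEOUT_MS) {
    return { ok: false, error: { message: 'timeout_ms must be between 1 and 600000' } };
  }
  if (!Number.isInteger(max_tool_calls) || max_tool_calls < 0) {
    return { ok: false, error: { message: 'max_tool_calls must be >= 0' } };
  }
  return { ok: true, value: { timeout_ms, max_tool_calls } };
}

const accepted = validateOptions({ timeout_ms: 60000, max_tool_calls: 20 });
const rejected = validateOptions({ timeout_ms: 700000 }); // above the 10-minute cap
```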

Security Model

Sandbox Restrictions

The JavaScript execution environment is heavily sandboxed to prevent security issues:

❌ Not Available:

require() - No module loading
setTimeout() / setInterval() - No timers
Filesystem access - No fs module
Network access - No http or fetch
Environment variables - No process.env
Node.js built-ins - JavaScript standard library only

✅ Available:

input - Global variable with request input data
call_tool(serverName, toolName, args) - Function to call upstream MCP tools
Modern JavaScript (ES2020+) standard library including Array, Object, String, Math, Date, JSON, Map, Set, Symbol, Promise, Proxy, Reflect
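
The standard library alone covers most aggregation work. The sketch below uses Set and Map on invented sample data; inside the sandbox the data would come from input or a call_tool() result.

```javascript
// Sample data is invented for illustration.
const repos = [
  { name: 'api', language: 'Go', topics: ['mcp', 'proxy'] },
  { name: 'web', language: 'TypeScript', topics: ['ui', 'mcp'] },
  { name: 'cli', language: 'Go', topics: ['proxy'] },
];

// Set: collect the unique topics across all repos.
const topics = [...new Set(repos.flatMap(r => r.topics))].sort();

// Map: group repo names by language.
const byLanguage = new Map();
for (const r of repos) {
  byLanguage.set(r.language, [...(byLanguage.get(r.language) ?? []), r.name]);
}

// JSON.stringify turns a Map or Set into {}, so convert to plain
// objects and arrays before returning the result.
const summary = { topics, byLanguage: Object.fromEntries(byLanguage) };
```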

Configuration & Limits

{
  "enable_code_execution": false,           // Must be explicitly enabled (default: false)
  "code_execution_timeout_ms": 120000,      // Default: 2 minutes, max: 10 minutes
  "code_execution_max_tool_calls": 0,       // Default: 0, which means unlimited
  "code_execution_pool_size": 10            // Default: 10 concurrent VMs
}

Per-Request Overrides:

{
  "code": "...",
  "input": {...},
  "options": {
    "timeout_ms": 60000,              // Override timeout for this request
    "max_tool_calls": 20,             // Limit tool calls for this request
    "allowed_servers": ["github"]     // Restrict to specific servers
  }
}
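
One way to picture how max_tool_calls and allowed_servers constrain a script is a wrapper around call_tool. This is a conceptual model only: MCPProxy enforces these limits itself, the helper name makeGuardedCallTool is hypothetical, and whether a disallowed server aborts the script or returns {ok: false} is not specified here (this sketch throws in both cases).

```javascript
// Conceptual model; not how MCPProxy is implemented.
function makeGuardedCallTool(rawCallTool, { max_tool_calls = 0, allowed_servers } = {}) {
  let calls = 0;
  return function call_tool(server, tool, args) {
    if (allowed_servers && !allowed_servers.includes(server)) {
      throw new Error('server not allowed: ' + server);
    }
    // 0 means unlimited, matching the documented default.
    if (max_tool_calls > 0 && calls >= max_tool_calls) {
      throw new Error('max tool calls exceeded');
    }
    calls += 1;
    return rawCallTool(server, tool, args);
  };
}

// Usage with a fake upstream that always succeeds (the 'ping' tool is invented).
const guarded = makeGuardedCallTool(
  () => ({ ok: true, result: 'pong' }),
  { max_tool_calls: 2, allowed_servers: ['github'] }
);
guarded('github', 'ping', {});
guarded('github', 'ping', {});

let thirdCall;
try {
  guarded('github', 'ping', {});
} catch (err) {
  thirdCall = err.message; // the third call exceeds the limit of 2
}
```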

Quarantine Integration

Code execution respects existing MCPProxy security features:

Quarantined servers cannot be called via call_tool()
Server enable/disable settings are enforced
Authentication requirements are preserved

Getting Started

1. Enable the Feature

Edit your configuration file (~/.mcpproxy/mcp_config.json):

{
  "enable_code_execution": true,
  "mcpServers": [
    {
      "name": "github",
      "url": "https://api.github.com/mcp",
      "protocol": "http",
      "enabled": true
    }
  ]
}

2. Restart MCPProxy

pkill mcpproxy
mcpproxy serve

3. Test with CLI

# Simple JavaScript test
mcpproxy code exec --code="({ result: input.value * 2 })" --input='{"value": 21}'

# TypeScript test
mcpproxy code exec --language typescript --code="const x: number = 42; ({ result: x })"

# Call upstream tool
mcpproxy code exec --code="call_tool('github', 'get_user', {username: input.user})" --input='{"user":"octocat"}'

4. Use from LLM Agent

The code_execution tool will appear in the tools list when an LLM agent connects to MCPProxy:

{
  "name": "code_execution",
  "description": "Execute JavaScript code that orchestrates multiple upstream MCP tools...",
  "inputSchema": {
    "type": "object",
    "properties": {
      "code": {"type": "string", "description": "JavaScript source code..."},
      "input": {"type": "object", "description": "Input data accessible as global input variable..."},
      "options": {"type": "object", "description": "Execution options..."}
    },
    "required": ["code"]
  }
}

Common Patterns

Pattern 1: Sequential Tool Calls

// Fetch user, then fetch their repos
const userRes = call_tool('github', 'get_user', {username: input.username});
if (!userRes.ok) {
  return {error: userRes.error.message};
}

const reposRes = call_tool('github', 'list_repos', {user: input.username});
if (!reposRes.ok) {
  return {error: reposRes.error.message};
}

return {
  user: userRes.result,
  repos: reposRes.result,
  repo_count: reposRes.result.length
};

Pattern 2: Conditional Logic

// Try primary server, fallback to secondary
let result = call_tool('primary-db', 'query', {sql: input.query});

if (!result.ok) {
  // Primary failed, try backup
  result = call_tool('backup-db', 'query', {sql: input.query});
}

return result.ok ? result.result : {error: 'Both databases unavailable'};

Pattern 3: Loop with Aggregation

// Fetch details for multiple items
const results = [];
const errors = [];

for (const id of input.ids) {
  const res = call_tool('api-server', 'get_item', {id});

  if (res.ok) {
    results.push(res.result);
  } else {
    errors.push({id, error: res.error});
  }
}

return {
  success: results,
  failed: errors,
  success_count: results.length,
  error_count: errors.length
};

Pattern 4: Data Transformation

// Get repos and compute statistics
const reposRes = call_tool('github', 'list_repos', {user: input.username});
if (!reposRes.ok) return reposRes;

const repos = reposRes.result;
const totalStars = repos.reduce((sum, r) => sum + (r.stargazers_count ?? 0), 0);
const languages = {};

for (const repo of repos) {
  const lang = repo.language ?? 'Unknown';
  languages[lang] = (languages[lang] ?? 0) + 1;
}

return {
  total_repos: repos.length,
  total_stars: totalStars,
  avg_stars: repos.length ? Math.round(totalStars / repos.length) : 0,  // avoid NaN when the list is empty
  languages
};

Error Handling

JavaScript Errors

// Syntax error - caught before execution
code_execution({code: "invalid javascript {"})
// Returns: {ok: false, error: {code: "SYNTAX_ERROR", message: "...", stack: "..."}}

// Runtime error - caught during execution
code_execution({code: "throw new Error('Something went wrong')"})
// Returns: {ok: false, error: {code: "RUNTIME_ERROR", message: "Something went wrong", stack: "..."}}

Tool Call Errors

// Tool returns error - handled in JavaScript
var res = call_tool('github', 'get_user', {username: 'nonexistent-user-12345'});
if (!res.ok) {
  return {error: 'User not found: ' + res.error.message};
}
return res.result;

Timeout Errors

// Execution exceeds timeout (default: 2 minutes; overridden to 1 second here)
code_execution({
  code: "while(true) {}",  // Infinite loop
  options: {timeout_ms: 1000}
})
// Returns: {ok: false, error: {code: "TIMEOUT", message: "JavaScript execution timed out"}}
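
A script that may run close to the limit can keep its own time budget and return partial results instead of being cut off. The sketch below assumes a hypothetical call_tool stub and an injectable clock so it can be run and tested outside the sandbox; Date.now() is part of the standard library the sandbox exposes.

```javascript
// Hypothetical stub so the sketch runs outside MCPProxy.
function call_tool(server, tool, args) {
  return { ok: true, result: { id: args.id } };
}

// Stop early once a self-imposed budget is spent and report what is left.
function fetchWithinBudget(ids, budgetMs, now = Date.now) {
  const deadline = now() + budgetMs;
  const items = [];
  let i = 0;
  for (; i < ids.length && now() < deadline; i++) {
    const res = call_tool('api-server', 'get_item', { id: ids[i] });
    if (res.ok) items.push(res.result);
  }
  return { items, remaining: ids.slice(i), complete: i === ids.length };
}

const partial = fetchWithinBudget([1, 2, 3], 60000);
```

The caller can pass the remaining ids back in a follow-up request rather than losing all work to a TIMEOUT error.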

Max Tool Calls Exceeded

// Exceeds max_tool_calls limit
code_execution({
  code: "for (var i = 0; i < 100; i++) { call_tool('api', 'ping', {}); }",
  options: {max_tool_calls: 10}
})
// Returns: {ok: false, error: {code: "MAX_TOOL_CALLS_EXCEEDED", message: "..."}}
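
To stay under the limit rather than fail, a script can cap its own work list. A minimal sketch, assuming the caller passes the cap explicitly (for example through input); the stub and the processCapped helper are hypothetical.

```javascript
// Hypothetical stub so the sketch runs outside MCPProxy.
function call_tool(server, tool, args) {
  return { ok: true, result: args.id * 2 };
}

// Process at most maxCalls items and report the ones that were skipped.
function processCapped(ids, maxCalls) {
  const batch = ids.slice(0, maxCalls);
  const results = batch.map(id => call_tool('api-server', 'get_item', { id }));
  return {
    processed: results.filter(r => r.ok).map(r => r.result),
    skipped: ids.slice(maxCalls),
  };
}

const capped = processCapped([1, 2, 3, 4, 5], 3);
```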

TypeScript Support

You can write code execution scripts in TypeScript by setting the language parameter to "typescript". TypeScript types are automatically stripped before execution using esbuild, with near-zero transpilation overhead.

Supported TypeScript Features

Type annotations: const x: number = 42
Interfaces: interface User { name: string; age: number; }
Type aliases: type StringOrNumber = string | number
Generics: function identity<T>(arg: T): T { return arg; }
Enums: enum Direction { Up = "UP", Down = "DOWN" }
Namespaces: namespace MyLib { export const value = 42; }
Type assertions: const x = value as string

TypeScript via MCP Tool

{
  "name": "code_execution",
  "arguments": {
    "code": "interface User { name: string; }\nconst user: User = { name: input.username };\n({ greeting: 'Hello ' + user.name })",
    "language": "typescript",
    "input": {"username": "Alice"}
  }
}

TypeScript via REST API

curl -X POST http://127.0.0.1:8080/api/v1/code/exec \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your-key" \
  -d '{
    "code": "const x: number = 42; ({ result: x })",
    "language": "typescript"
  }'

TypeScript via CLI

mcpproxy code exec --language typescript \
  --code="const x: number = 42; ({ result: x })"

Important Notes

TypeScript support uses type-stripping only (no type checking or semantic validation)
Valid JavaScript is also valid TypeScript, so you can always use language: "typescript" even for plain JS
The transpiled output runs in the same ES2020+ goja sandbox with all existing capabilities
Transpilation errors return the TRANSPILE_ERROR error code with line/column information
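
To illustrate type-stripping, the JavaScript below is roughly what remains of the MCP tool example above once its types are removed. The exact esbuild output may differ in formatting, and the input object here is a stand-in for the sandbox global.

```javascript
// TypeScript source:
//   interface User { name: string; }
//   const user: User = { name: input.username };
//   ({ greeting: 'Hello ' + user.name })

const input = { username: 'Alice' }; // stand-in for the sandbox's input global

// The interface disappears entirely and the annotation is dropped.
const user = { name: input.username };
const output = { greeting: 'Hello ' + user.name };

// Because nothing is type-checked, `const n: number = 'forty-two'`
// is stripped to the line below and runs without complaint.
const n = 'forty-two';
```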

Best Practices

1. Keep Code Simple

Use modern JavaScript syntax (arrow functions, const/let, template literals, destructuring are all supported)
Avoid deeply nested logic
Prefer explicit error handling over implicit failures

2. Handle Errors Gracefully

// Bad: Assumes success
var user = call_tool('github', 'get_user', {username: input.username});
return user.result.name;  // Crashes if user.ok is false

// Good: Checks response
var user = call_tool('github', 'get_user', {username: input.username});
if (!user.ok) {
  return {error: user.error.message};
}
return {name: user.result.name};

3. Set Appropriate Timeouts

// Quick operations: 30 seconds
{options: {timeout_ms: 30000}}

// Multiple tool calls: 2 minutes (default)
{options: {timeout_ms: 120000}}

// Heavy processing: 5 minutes
{options: {timeout_ms: 300000}}

4. Limit Tool Calls

// Protect against runaway loops
{
  code: "for (var i = 0; i < input.items.length; i++) { ... }",
  options: {max_tool_calls: 100}
}

5. Use Allowed Servers

// Restrict to specific servers for sensitive operations
{
  code: "call_tool('production-db', 'delete', {id: input.id})",
  options: {allowed_servers: ["production-db"]}
}

Next Steps

Examples: See examples.md for 10+ working code samples
API Reference: See api-reference.md for complete schema documentation
Troubleshooting: See troubleshooting.md for common issues and solutions
CLI Usage: Run mcpproxy code exec --help for command-line testing
