process-mcp

A Model Context Protocol server that supervises long-running dev processes — start, stop, restart, tail logs (JSON-aware), wait for readiness, run one-shots — over HTTP, from its own process. State survives Claude Code restarts.

Claude Code ──HTTP──▶  process-mcp  ──spawn──▶  your dev servers
                       │
                       ├── in-memory registry (running + recently exited)
                       ├── per-process log ring buffers
                       ├── JSON log parsing & filtering
                       └── TTL sweeper, session limits, tree-kill


Why

During development you routinely juggle dev servers, workers, watchers. The pain points that process-mcp attacks:

  • "Is it still running? On what port? What did it just log?" proc_list, proc_logs, proc_port_who answer all three in one structured response.
  • Claude's Bash tool is awkward for long-lived processes. It blocks or backgrounds with no way to inspect logs later. process-mcp owns the child, captures stdout/stderr into a ring buffer, and lets you query after the fact.
  • Restarting Claude Code kills your in-session shells. process-mcp lives in its own process, so the services stay up and the registry is preserved until you restart the supervisor itself.
  • Grepping JSON logs via shell is tedious. process-mcp parses each line that looks like JSON, normalizes the level, and exposes level/min_level/where filters that work over structured fields.
  • npm run foo leaves orphaned children behind. proc_stop signals the whole process tree (kill -pgid on POSIX, taskkill /T on Windows) so the node it spawned dies with it.

Quick start

1. Install

The recommended transport is HTTP; the server runs as a daemon.

# Published package (preferred)
npm install -g @graphmemory/process-mcp
process-mcp

# Or without install
npx -y @graphmemory/process-mcp

# Or Docker (see "Docker usage" below for auth requirements)
# Export the key first so the same value can be given to the client.
export PROC_MCP_API_KEY=$(openssl rand -hex 32)
docker run --rm -p 7778:7778 -e PROC_MCP_API_KEY \
  ghcr.io/graph-memory/process-mcp:latest

Boot log:

process-mcp listening on http://127.0.0.1:7778/mcp
  health      → http://127.0.0.1:7778/health
  auth        → DISABLED (loopback only)
  cors_origin → null
  max         → 100 sessions, 200 processes

2. Register in Claude Code

Add to ~/.claude.json (user-global) or .mcp.json (project-local):

{
  "mcpServers": {
    "procs": {
      "type": "http",
      "url": "http://127.0.0.1:7778/mcp"
    }
  }
}

With auth (required when bound to a non-loopback interface):

{
  "mcpServers": {
    "procs": {
      "type": "http",
      "url": "http://127.0.0.1:7778/mcp",
      "headers": { "Authorization": "Bearer ${PROC_MCP_API_KEY}" }
    }
  }
}

3. First conversation

Ask Claude in plain language:

start my api with npm run api:dev, wait until /health/live returns 200, then tail the last 50 error logs

It will sequence proc_start → proc_wait_ready → proc_logs for you. Iterate without leaving the conversation: proc_restart, change code, repeat.

4. Sample flow (what the tools actually look like)

proc_start { name: "api", cmd: "npm run dev", port: 3000 }
proc_wait_ready { name: "api", url: "http://127.0.0.1:3000/health", timeout_ms: 30000 }
proc_logs { name: "api", min_level: "warn", tail: 50 }
proc_logs { name: "api", where: "req.method=POST,req.url~=^/api", since: "5m" }
proc_restart { name: "api" }
proc_stop { name: "api" }
proc_remove { name: "api" }

proc_exec { cmd: "npm test", timeout_ms: 120000 }
proc_exec { cmd: ["jq", "."], input: "{\"a\":1}" }

proc_port_who { port: 3000 }

Features

  • 11 tools covering the full supervisor lifecycle; see Tools reference.
  • Survives Claude Code restarts. process-mcp runs in its own OS process. Restart Claude Code at will — your api/workers keep running, the registry stays.
  • Crash-recovery reaper. Each spawn is recorded to a state file in os.tmpdir(). If the supervisor is kill -9'd and restarted, the new instance detects children left behind by the old one and tree-kills them.
  • Supervisor-env isolation. All PROC_MCP_* variables (API key, host, caps) are filtered out before spawning, so secrets don't leak into child processes.
  • prefer_local option. Opt-in per spawn; prepends <cwd>/node_modules/.bin to PATH so vite, tsc, eslint, jest resolve without npx.
  • In-memory log ring per process. Fixed capacity (default 10 000 lines/process), O(1) push, tail filter does a single backward scan.
  • JSON log enrichment. Each line is probed for JSON. If it parses, the level field is normalized (pino's numeric 30/40/50 → info/warn/error; bunyan/winston/structlog strings; warning → warn; critical → fatal) and msg is extracted for display.
  • Structured log filtering. level, min_level, grep (text regex), where (AND of field=value, field~=regex, field exists clauses over parsed JSON), since (ISO or 30s/5m/1h), stream (stdout/stderr/both), tail, and four output formats (auto/pretty/raw/json).
  • Pretty-render. Default output is two-line pino-style:
    14:02:33.120  info   [api/out]  request completed
                  req.method=GET  req.url=/health  status=200
    
  • Cross-platform tree-kill. POSIX process groups + Windows taskkill /T. Stopping npm run x kills the node it spawned.
  • proc_wait_ready. Block on URL probe and/or regex match in logs. Fails fast if the process dies mid-wait (returns tail of stderr).
  • proc_exec for one-shots. Bounded output, stdin input, tree-kill timeout.
  • proc_signal for SIGHUP/SIGUSR1/etc. Deliver a signal without treating it as a stop — registry status stays running. Works on the whole tree by default.
  • proc_stdin for interactive processes. Start with stdin: "pipe" and feed arbitrary data (REPL commands, prompt answers, pipe input) with optional EOF close.
  • Port ownership. proc_port_who maps a TCP port to a PID (via lsof or netstat/tasklist) and correlates with our registry.
  • Idle TTL per process. Opt-in idle_ttl_ms — kills a process after N ms without log output. Useful for scratch processes you might forget.
  • Grave TTL. Exited entries (and their logs) are preserved for 1 h by default, so you can post-mortem after a crash.
  • Resource caps. max_sessions, max_processes, total log memory, per-line size, per-tool string length limits. Hard stops, with the counter in error messages.
  • ANSI stripping. Escape sequences (CSI, OSC, SS, charset) are removed at ingest so grep works with the visible text.
  • Per-name serialization. Concurrent proc_start/proc_stop/proc_restart/proc_remove on the same name are serialized via a promise-chain lock, so races cannot orphan children.
  • CSRF-hardened HTTP. Content-Type: application/json required, Origin header whitelist, API key compared with timingSafeEqual.
  • Refuse-to-start safety. Bound to a non-loopback interface without an API key? Process exits with code 2 and an explanation. Override with --allow-insecure if you know what you're doing.
  • Health endpoint. GET /health returns uptime, session count, process counts, config summary. No auth, safe to probe.
  • Multi-arch Docker image. linux/amd64 + linux/arm64, non-root user, tini as PID 1 for proper zombie reaping.
  • CI on Ubuntu + macOS + Windows × Node 24 × every push.

Tools reference

All tools accept structured arguments (zod-validated, OpenAPI-compatible schemas). Responses are single-block text content — the format is designed to be readable by an LLM and by a human looking at curl output.

proc_start

Spawn a long-running process and register it.

Schema:

  • name: string (1–64) — unique identifier, used by every other proc_* tool.
  • cmd: string | string[] — command to run. String (≤ 8192) is parsed by execa's shell-style splitter without invoking a shell. Array [binary, ...args] is passed unparsed (use this for arguments with spaces). For pipes/redirects pass ["/bin/sh", "-c", "foo | bar"] (POSIX) or ["cmd", "/c", "foo | bar"] (Windows).
  • cwd?: string (≤ 4096) — working directory; defaults to supervisor's cwd.
  • env?: Record<string, string> — merged on top of the supervisor's env.
  • port?: number (1–65535) — informational; shown in proc_list and proc_port_who. Not enforced.
  • idle_ttl_ms?: number — kill the process if it produces no log output for this many ms. Off by default.
  • if_running?: "error" | "reuse" | "restart" — behavior when the name is already running. Default error.

Returns: summary with pid, cwd, cmd, started_at, and — if the process died within the 500 ms settle window — the last 20 log lines and isError: true.

proc_stop

Stop by name.

Schema: name, signal: "SIGTERM"|"SIGINT"|"SIGKILL"|"SIGHUP" (default SIGTERM), timeout_ms (default 5000).

Sends the signal to the whole process tree. If alive after the timeout, escalates to SIGKILL. Idempotent — stopping an exited process is a no-op.

proc_restart

Stop + start with the same cmd/cwd/env/port/idle_ttl.

Schema: name, timeout_ms (default 5000).

proc_list

Table of all entries.

Schema: status: "running"|"exited"|"all" (default all).

Example output:

NAME  STATUS   PID    PORT  UPTIME  IDLE             EXIT
api   running  12345  3000  12m     -                -
bg    running  12346  -     2m      30s / 2m         -
lint  exited   -      -     -       grave 54m        0

proc_remove

Evict an entry, releasing its name and log buffer.

Schema: name, force: boolean (default false).

By default only works on exited entries. force: true stops then removes.

proc_logs

Query a process's ring buffer.

Schema:

  • name: string — required.
  • tail: number (1–10000, default 200) — max lines returned.
  • since: string — ISO timestamp or relative (30s/5m/1h/2d).
  • grep: string (≤ 512) — regex over the raw line text.
  • stream: "out"|"err"|"both" — default both.
  • level: level | level[] — JSON-aware; skips non-JSON lines when set.
  • min_level: level — shortcut for "this level and above". Not meant to be combined with level; if both are set, level takes precedence.
  • where: string (≤ 1024) — structured predicate over parsed JSON. Comma-separated AND. Forms per clause:
    • a.b.c=value — exact string equality
    • a.b.c~=regex — regex match
    • a.b.c — field exists (any value except undefined)
  • format: "auto"|"pretty"|"raw"|"json" — output style.

Output formats:

format          JSON line                                                                  Non-JSON line
auto (default)  pretty, two-line                                                           plain with timestamp + stream
pretty          same as auto                                                               same as auto
raw             [ts out/err] text                                                          same
json            one envelope per line with ts, stream, text, level, msg, json, truncated   same

Header includes counters: "N lines shown out of K buffered; X older dropped, T total written". The "dropped" counter makes ring-buffer overflow visible.
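
As an illustration of how a since value could be turned into a cutoff, here is a minimal sketch. parseSince and its error message are hypothetical names for this example, not the project's actual code; the only behavior taken from the schema above is "ISO timestamp or relative 30s/5m/1h/2d".

```typescript
// Hypothetical helper: convert a `since` argument into an epoch-ms cutoff.
type Unit = "s" | "m" | "h" | "d";

const UNIT_MS: Record<Unit, number> = { s: 1000, m: 60000, h: 3600000, d: 86400000 };

function parseSince(since: string, now: number = Date.now()): number {
  const rel = /^(\d+)([smhd])$/.exec(since.trim());
  if (rel !== null) {
    // Relative form: "5m" means "five minutes before now".
    const amount = Number(rel[1]);
    const unitMs = UNIT_MS[rel[2] as Unit];
    return now - amount * unitMs;
  }
  // Otherwise expect something Date.parse understands, such as an ISO timestamp.
  const abs = Date.parse(since);
  if (Number.isNaN(abs)) {
    throw new Error(`invalid since: "${since}" (expected ISO timestamp or 30s/5m/1h/2d)`);
  }
  return abs;
}
```

Lines whose timestamp is older than the returned cutoff would be excluded from the result.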

proc_wait_ready

Block until a process is ready.

Schema: name, url? (poll), expect_status: number | number[] (default 2xx/3xx), pattern?: string (≤ 512) (regex in logs), timeout_ms: number (default 30000), interval_ms: number (default 200, for URL probe only).

At least one of url or pattern must be set. Both run concurrently, first to succeed wins. Fails fast with the tail of stderr if the process dies mid-wait.

proc_port_who

Who owns this TCP port.

Schema: port: number (1–65535).

Uses lsof -iTCP:<port> -sTCP:LISTEN on POSIX, netstat -ano -p TCP + tasklist /NH /FO CSV on Windows. Correlates PID with our registry:

Port 3000:
  in use by pid=12345 → proc_mcp name: api (node)

proc_exec

One-shot command, return output to completion.

Schema: cmd, cwd, env, timeout_ms (default 30000, max 600000), input?: string (≤ 1 MB) (written to stdin, then closed), max_output_bytes (per-stream cap, default 1 MB, keeps the last N bytes on overflow).

Not added to the registry. Timeout uses our tree-kill so grandchildren die.
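
The "keeps the last N bytes on overflow" behavior of max_output_bytes can be sketched as a bounded tail buffer. The source layout mentions a BoundedBuffer in exec.ts, but the class below (BoundedTail, with its field names) is an assumption written for illustration only.

```typescript
// Hypothetical sketch: keep only the most recent `maxBytes` bytes of a stream.
class BoundedTail {
  private chunks: Uint8Array[] = [];
  private size = 0;          // bytes currently retained
  private readonly maxBytes: number;
  total = 0;                 // bytes ever written

  constructor(maxBytes: number) {
    this.maxBytes = maxBytes;
  }

  push(chunk: Uint8Array): void {
    this.total += chunk.length;
    if (chunk.length >= this.maxBytes) {
      // A single oversized chunk replaces everything: keep only its tail.
      this.chunks = [chunk.subarray(chunk.length - this.maxBytes)];
      this.size = this.maxBytes;
      return;
    }
    this.chunks.push(chunk);
    this.size += chunk.length;
    // Drop or trim the oldest chunks until we are back under the cap.
    while (this.size > this.maxBytes) {
      const head = this.chunks[0];
      if (head === undefined) break;
      const excess = this.size - this.maxBytes;
      if (head.length <= excess) {
        this.chunks.shift();
        this.size -= head.length;
      } else {
        this.chunks[0] = head.subarray(excess);
        this.size -= excess;
      }
    }
  }

  get truncated(): boolean {
    return this.total > this.size;
  }

  // Note: trimming by bytes can split a multi-byte UTF-8 character at the front.
  text(): string {
    const out = new Uint8Array(this.size);
    let offset = 0;
    for (const c of this.chunks) {
      out.set(c, offset);
      offset += c.length;
    }
    return new TextDecoder().decode(out);
  }
}
```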

Example response:

Exit: 0   duration: 248ms   cmd: npm test

── stdout (1234 bytes) ──
PASS  src/foo.test.ts
...

── stderr (empty) ──

On timeout:

Exit: TIMEOUT   duration: 30000ms   cmd: long-running-thing

── stdout (8k bytes, TRUNCATED — showing last 200 lines) ──
...

Note: process tree killed by SIGKILL after 30000ms timeout.

proc_signal

Send an arbitrary signal to a running process without treating it as a stop. Registry status stays running, no escalation to SIGKILL.

Schema: name, signal: "SIGHUP"|"SIGINT"|"SIGQUIT"|"SIGUSR1"|"SIGUSR2"|"SIGTERM"|"SIGWINCH"|"SIGTSTP"|"SIGCONT", tree: boolean (default true).

Typical uses:

  • SIGHUP — config reload in many daemons (nginx, unicorn).
  • SIGUSR1 / SIGUSR2 — app-specific hooks (nginx log reopen, node debug toggle).
  • SIGWINCH — inform the process that the terminal resized.

For stopping, use proc_stop — this tool doesn't wait for exit and doesn't change the registry status.

proc_stdin

Write data to a running process's stdin. The process must have been started with stdin: "pipe".

Schema:

  • name: string — required.
  • data: string (≤ 1 MB) — bytes to write (UTF-8). Empty string + end=true just closes stdin without data.
  • end: boolean (default false) — close stdin (EOF) after writing. The process usually detects EOF and finishes; further writes error out.
  • append_newline: boolean (default false) — convenience for line-oriented protocols ("ls\n" instead of "ls" + separate newline character).

Errors:

  • "not running" — the process has already exited.
  • "stdin=ignore" — started without stdin: "pipe"; restart to enable.
  • "already closed" — you already sent end=true.

Example — drive a Node REPL:

proc_start { name: "repl", cmd: ["node"], stdin: "pipe" }
proc_stdin { name: "repl", data: "2+2", append_newline: true }
proc_logs  { name: "repl", tail: 1 }     # see "4" in output
proc_stdin { name: "repl", data: "", end: true }   # goodbye

Configuration

All flags are optional — loopback-only defaults work out of the box.

Flag               Env                           Default    Notes
-p, --port         PROC_MCP_PORT                 7778       valid 1–65535
-H, --host         PROC_MCP_HOST                 127.0.0.1  see security model
--api-key          PROC_MCP_API_KEY              (off)      required when host ≠ loopback
--allow-insecure   PROC_MCP_ALLOW_INSECURE       false      override refuse-to-start
--cors-origin      PROC_MCP_CORS_ORIGIN          null       comma-separated origins or *
--max-sessions     PROC_MCP_MAX_SESSIONS         100        hard cap on concurrent MCP sessions
--session-ttl      PROC_MCP_SESSION_TTL_SEC      1800       idle session reaper (30 min)
--max-processes    PROC_MCP_MAX_PROCESSES        200        cap on registry entries
--log-lines        PROC_MCP_LOG_LINES            10000      ring capacity per process
--log-line-max     PROC_MCP_LOG_LINE_MAX         10000      bytes per line before truncation
--log-total-mb     PROC_MCP_LOG_TOTAL_MB         200        global soft cap; biggest ring shrinks
--grave-ttl        PROC_MCP_GRAVE_TTL_SEC        3600       keep exited entries for 1 h
--sweep-interval   PROC_MCP_SWEEP_INTERVAL_SEC   30         TTL sweeper tick
--spawn-settle-ms  PROC_MCP_SPAWN_SETTLE_MS      500        proc_start early-exit window

Any env var can also be passed on the command line (long-form name becomes --kebab-case). process-mcp --help prints the full list.


Architecture

Process topology

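process-mcp is one Node process. The registry and log buffers live in memory only: there is no database, no spill-to-disk, no cross-version schema. The one on-disk artifact is the small state file described under "Crash-recovery reaper". If you stop the supervisor you lose the registry and logs, and the children it supervised are killed too (tree-kill on graceful shutdown).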

HTTP layer

Transport: @modelcontextprotocol/sdk's StreamableHTTPServerTransport on top of node:http. One TCP listener, one endpoint /mcp, plus /health.

Each MCP client gets a session on initialize (random UUID in mcp-session-id). Sessions have their own McpServer instance and transport. Idle sessions are reaped after session_ttl via a 60 s interval timer.

Hard caps: max_sessions (503 on overflow), MAX_BODY_BYTES = 1 MB per request, per-tool zod .max(…) on every user string.

Request pipeline for /mcp:

→ URL check (startsWith /mcp)
→ Origin check (allowlist; unset Origin = native client, always allowed)
→ Content-Type check (POST requires application/json)
→ Auth check (Bearer, timingSafeEqual)
→ session lookup / create (cap applied on create)
→ MCP SDK transport.handleRequest
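
The first four pipeline stages can be sketched as one pure function. This is an illustration under assumptions, not the project's code: the gate name, the config shape, and the 404/403/415/401 status codes are all hypothetical (the text above only documents 503 for session overflow).

```typescript
// Hypothetical sketch of the /mcp request gate.
interface GateConfig {
  apiKey?: string;
  allowedOrigins: string[] | "*" | null; // null = only clients that send no Origin
}

interface IncomingReq {
  method: string;
  url: string;
  headers: Record<string, string | undefined>; // lower-cased header names
}

// Returns null when the request may proceed, or the status code to reject with.
function gate(req: IncomingReq, cfg: GateConfig): number | null {
  if (!req.url.startsWith("/mcp")) return 404;

  // Native clients (curl, Claude Code) send no Origin header and pass this stage.
  const origin = req.headers["origin"];
  if (origin !== undefined) {
    const allowed =
      cfg.allowedOrigins === "*" || (cfg.allowedOrigins?.includes(origin) ?? false);
    if (!allowed) return 403;
  }

  if (req.method === "POST") {
    const raw = req.headers["content-type"] ?? "";
    const mediaType = (raw.split(";")[0] ?? "").trim().toLowerCase();
    if (mediaType !== "application/json") return 415;
  }

  if (cfg.apiKey !== undefined) {
    // Plain comparison keeps the sketch short; the server described here
    // uses a constant-time comparison (timingSafeEqual) instead.
    if (req.headers["authorization"] !== `Bearer ${cfg.apiKey}`) return 401;
  }
  return null;
}
```

Ordering the cheap structural checks before the auth check means a cross-site browser request is rejected without ever reaching the key comparison.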

Working directory (cwd)

Every spawn — proc_start and proc_exec — resolves cwd to an absolute path at spawn time, then validates it exists and is a directory. Behavior:

  • Absolute path (/Users/me/project) — used as-is.
  • Relative path (./src, ../other) — resolved against the supervisor's current working directory (process.cwd()), not against anything else.
  • Not provided — defaults to the supervisor's process.cwd(). Shown in the boot log as default cwd → … so you always know where undirected calls land.
  • Doesn't exist / is a file / not accessible — fails upfront with a clear message (Error: cwd does not exist: /foo), not a cryptic ENOENT from execa.

Stored cwd in the registry is always the resolved absolute path, so proc_list / formatEntrySummary show a canonical copy-pastable location, and proc_restart is immune to the supervisor's own cwd changing between calls.

In Docker the default cwd is the image's WORKDIR (/app). User cwd values must point to paths that exist inside the container — host paths like /Users/... won't work.

Registry

Map<name, ProcEntry>

One entry per process, keyed by user-chosen name. An entry holds the spawn config (so restart is self-contained), the execa Subprocess handle, the LogRing, and lifecycle timestamps (startedAt, lastOutputAt, exitedAt, exitCode, signal).

States: running → exited (one-way). exited entries stick around until grave-TTL or explicit proc_remove.

Serialization. Every state-mutating operation on a name acquires a per-name promise-chain lock (src/lib/lock.ts). This prevents the classic race where two concurrent proc_restart calls orphan a child because both read the old entry, both spawn a new one, only one ends up in the registry.

proc_start(api) ──┐
                  ├── withLock("proc:api") ── sequential
proc_restart(api) ┘

Different names are independent — no lock contention across processes.
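
A promise-chain lock of this kind can be sketched in a few lines. The withLock name matches src/lib/lock.ts above, but the body below is an assumption about how such a lock is commonly written, not a copy of the project's implementation.

```typescript
// Hypothetical sketch: one promise "tail" per key; each caller chains onto it.
const lockTails = new Map<string, Promise<unknown>>();

function withLock<T>(key: string, fn: () => Promise<T>): Promise<T> {
  const prev = lockTails.get(key) ?? Promise.resolve();
  // Run fn once the previous holder settles, whether it resolved or rejected.
  const run = prev.then(fn, fn);
  // The stored tail never rejects, so one failure does not poison the chain.
  const tail = run.catch(() => undefined);
  lockTails.set(key, tail);
  // Forget the key once its queue drains, so the map does not grow forever.
  void tail.then(() => {
    if (lockTails.get(key) === tail) lockTails.delete(key);
  });
  return run;
}
```

Two calls with the key "proc:api" run strictly one after the other, while a call with "proc:worker" would start immediately because it chains onto a different tail.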

Log pipeline

child.stdout  ──┐
                ├── line-splitter (buffers partial, splits on \n, handles CRLF)
child.stderr  ──┘       │
                        ▼
                 strip ANSI (CSI/OSC/SS)
                        │
                        ▼
                 truncate at log_line_max bytes (mark `truncated: true`)
                        │
                        ▼
                 try JSON.parse                        ── if success:
                        │                                  ↓
                        │                              extract `level` (pino numeric, strings, aliases)
                        │                              extract `msg` / `message` / `event`
                        ▼
                 LogRing.push ──► emit("line") ──► subscribers (proc_wait_ready)
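
The ingest stages in the diagram can be sketched roughly as follows. Everything here is illustrative: the function names, the LogLine shape, and the regex are assumptions; the regex covers only CSI and OSC sequences, and truncation counts characters rather than bytes to keep the example short.

```typescript
type Level = "trace" | "debug" | "info" | "warn" | "error" | "fatal";

interface LogLine {
  text: string;
  truncated: boolean;
  json?: Record<string, unknown>;
  level?: Level;
  msg?: string;
}

// CSI (ESC [ ... final byte) and OSC (ESC ] ... terminated by BEL or ESC \).
const ANSI = /\x1b\[[0-?]*[ -\/]*[@-~]|\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)/g;

const LEVELS: readonly string[] = ["trace", "debug", "info", "warn", "error", "fatal"];
const PINO: Record<number, Level> = { 10: "trace", 20: "debug", 30: "info", 40: "warn", 50: "error", 60: "fatal" };
const ALIASES: Record<string, Level> = { warning: "warn", critical: "fatal" };

function normalizeLevel(raw: unknown): Level | undefined {
  if (typeof raw === "number") return PINO[raw];
  if (typeof raw !== "string") return undefined;
  const s = raw.toLowerCase();
  return LEVELS.includes(s) ? (s as Level) : ALIASES[s];
}

// Buffers partial chunks and emits complete lines, dropping a trailing \r (CRLF).
function makeSplitter(onLine: (line: string) => void): (chunk: string) => void {
  let partial = "";
  return (chunk: string): void => {
    partial += chunk;
    const parts = partial.split("\n");
    partial = parts.pop() ?? "";
    for (const p of parts) onLine(p.endsWith("\r") ? p.slice(0, -1) : p);
  };
}

function enrich(raw: string, maxChars: number): LogLine {
  let text = raw.replace(ANSI, "");
  const truncated = text.length > maxChars;
  if (truncated) text = text.slice(0, maxChars);
  const line: LogLine = { text, truncated };
  if (!text.startsWith("{")) return line; // cheap probe before JSON.parse
  try {
    const parsed: unknown = JSON.parse(text);
    if (typeof parsed === "object" && parsed !== null && !Array.isArray(parsed)) {
      const json = parsed as Record<string, unknown>;
      line.json = json;
      const level = normalizeLevel(json["level"]);
      if (level !== undefined) line.level = level;
      const msg = json["msg"] ?? json["message"] ?? json["event"];
      if (typeof msg === "string") line.msg = msg;
    }
  } catch {
    // Not valid JSON: keep the line as plain text.
  }
  return line;
}
```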

Ring buffer

Fixed-size circular buffer. O(1) push; overwrites oldest on wrap. Tracks total (cumulative writes) independently of size (currently buffered), so proc_logs can tell users "X older dropped".

readTail iterates newest→oldest applying since (breaks early — timestamps monotonic in push order), stream, level, where, grep in that order, stopping when tail items are collected. Result reversed to chronological.
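
A minimal sketch of such a ring, assuming each stored item carries a numeric ts field. The class and method names echo the text above, but the code is an illustration; the single keep predicate stands in for the stream, level, where, and grep filters.

```typescript
class LogRing<T extends { ts: number }> {
  private readonly buf: (T | undefined)[];
  private readonly capacity: number;
  private head = 0; // index of the next write
  size = 0;         // items currently buffered
  total = 0;        // items ever written

  constructor(capacity: number) {
    if (!Number.isInteger(capacity) || capacity < 1) throw new Error("capacity must be >= 1");
    this.capacity = capacity;
    this.buf = new Array<T | undefined>(capacity);
  }

  // O(1): overwrite the oldest slot once the ring is full.
  push(item: T): void {
    this.buf[this.head] = item;
    this.head = (this.head + 1) % this.capacity;
    if (this.size < this.capacity) this.size++;
    this.total++;
  }

  get dropped(): number {
    return this.total - this.size;
  }

  // Walk newest to oldest, stop early at the `since` cutoff, return chronological order.
  readTail(tail: number, sinceTs: number, keep: (item: T) => boolean): T[] {
    const out: T[] = [];
    for (let i = 1; i <= this.size && out.length < tail; i++) {
      const item = this.buf[(this.head - i + this.capacity) % this.capacity];
      if (item === undefined) break;
      if (item.ts < sinceTs) break; // everything older is out of range too
      if (keep(item)) out.push(item);
    }
    return out.reverse();
  }
}
```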

JSON where-DSL

Parsed at query time, not at ingest. Per clause:

  • a.b.c=value → String(getPath(json, path)) === value
  • a.b.c~=regex → new RegExp(regex).test(String(getPath(json, path)))
  • a.b.c → getPath(json, path) !== undefined

Paths validated against /^[A-Za-z_$][A-Za-z0-9_$]*(\.…)*$/ — typos give clear errors like clause 2: invalid field path "req..url".
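
The clause forms above could be parsed and evaluated along these lines. This is a sketch, not the project's parser: parseWhere, matchesWhere, and the Clause type are hypothetical, and splitting on commas means a value or regex containing a comma is not supported here.

```typescript
type Clause =
  | { kind: "eq"; path: string[]; value: string }
  | { kind: "re"; path: string[]; regex: RegExp }
  | { kind: "exists"; path: string[] };

const SEGMENT = /^[A-Za-z_$][A-Za-z0-9_$]*$/;

function parsePath(raw: string, n: number): string[] {
  const parts = raw.split(".");
  if (!parts.every((p) => SEGMENT.test(p))) {
    throw new Error(`clause ${n}: invalid field path "${raw}"`);
  }
  return parts;
}

function parseWhere(where: string): Clause[] {
  return where.split(",").map((rawClause, i): Clause => {
    const clause = rawClause.trim();
    const n = i + 1;
    const eq = clause.indexOf("=");
    if (eq === -1) return { kind: "exists", path: parsePath(clause, n) };
    if (eq > 0 && clause[eq - 1] === "~") {
      // field~=regex: the path ends before "~", the pattern starts after "=".
      return { kind: "re", path: parsePath(clause.slice(0, eq - 1), n), regex: new RegExp(clause.slice(eq + 1)) };
    }
    return { kind: "eq", path: parsePath(clause.slice(0, eq), n), value: clause.slice(eq + 1) };
  });
}

function getPath(json: unknown, path: string[]): unknown {
  let cur: unknown = json;
  for (const key of path) {
    if (typeof cur !== "object" || cur === null) return undefined;
    cur = (cur as Record<string, unknown>)[key];
  }
  return cur;
}

// All clauses must hold (AND).
function matchesWhere(json: unknown, clauses: Clause[]): boolean {
  return clauses.every((c) => {
    const v = getPath(json, c.path);
    switch (c.kind) {
      case "exists": return v !== undefined;
      case "eq": return String(v) === c.value;
      case "re": return c.regex.test(String(v));
    }
  });
}
```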

TTL sweeper

One setInterval (every sweep_interval_sec, default 30 s) does three jobs:

  1. Grave-reap exited entries whose exitedAt + grave_ttl has passed.
  2. Idle-kill running entries whose lastOutputAt + idle_ttl_ms has passed (opt-in via proc_start { idle_ttl_ms }).
  3. Log-memory cap. If total bytes across all rings > log_total_mb, shrink the biggest ring by half. Up to 3 passes per tick.
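
One tick of the sweeper can be modeled as a pure planning function. The sketch below is an assumption-laden illustration: the Entry fields beyond those named in the text, the planSweep name, and the choice to model "shrink by half" as halving a byte count are all hypothetical.

```typescript
interface Entry {
  name: string;
  status: "running" | "exited";
  lastOutputAt: number;
  exitedAt?: number;
  idleTtlMs?: number; // opt-in per process
  logBytes: number;
}

interface SweepPlan {
  reap: string[];     // exited entries past the grave TTL
  idleKill: string[]; // running entries silent for longer than their idle TTL
  shrink: string[];   // rings to halve, in order (a name may repeat)
}

function planSweep(entries: Entry[], now: number, graveTtlMs: number, logCapBytes: number): SweepPlan {
  const plan: SweepPlan = { reap: [], idleKill: [], shrink: [] };

  for (const e of entries) {
    if (e.status === "exited") {
      if (e.exitedAt !== undefined && e.exitedAt + graveTtlMs <= now) plan.reap.push(e.name);
    } else if (e.idleTtlMs !== undefined && e.lastOutputAt + e.idleTtlMs <= now) {
      plan.idleKill.push(e.name);
    }
  }

  // Reaped entries release their logs, so they do not count toward the cap.
  const rings = entries
    .filter((e) => !plan.reap.includes(e.name))
    .map((e) => ({ name: e.name, bytes: e.logBytes }));
  let total = rings.reduce((sum, r) => sum + r.bytes, 0);

  // At most 3 passes per tick: halve whichever ring is currently biggest.
  for (let pass = 0; pass < 3 && total > logCapBytes; pass++) {
    let biggest = rings[0];
    if (biggest === undefined) break;
    for (const r of rings) if (r.bytes > biggest.bytes) biggest = r;
    const half = Math.floor(biggest.bytes / 2);
    total -= biggest.bytes - half;
    biggest.bytes = half;
    plan.shrink.push(biggest.name);
  }
  return plan;
}
```

Keeping the decision separate from the side effects (killing, evicting, resizing) makes the three rules easy to test with a fixed now value.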

Shutdown

SIGINT/SIGTERM triggers a controlled sequence:

1. stop accepting new HTTP connections (httpServer.close)
2. stop sweeper + session reaper
3. wait up to 2 s for in-flight HTTP handlers to drain
4. close all sessions
5. tree-kill all running children (stopLocked with 3 s timeout each)
6. process.exit(0)

Step 3 guards against an in-flight proc_start registering its child after step 5 has already collected the kill list, which would leave that child running.

Crash-recovery reaper

On every registry change, a state file is written (debounced 200 ms) to:

  • POSIX: $TMPDIR/process-mcp-<uid>/<port>.json
  • Windows: %TEMP%\process-mcp\<port>.json
  • Override with PROC_MCP_STATE_DIR.

Contents: supervisor PID, our start time, and the list of {pid, name, started, cmd} for every child currently running. On graceful shutdown the file is deleted.

When a supervisor starts, before accepting connections it reads any previous state file at its port:

  1. If the previous supervisor PID is still alive → we refuse to reap anything. Two instances on the same port is a collision; the subsequent listen() will fail with EADDRINUSE, which is the correct surface.
  2. Otherwise, for each recorded child: kill -0 probe; if alive, tree-kill it (SIGTERM, 500 ms grace, SIGKILL). Log a summary: reaper: reaped 2 orphans from previous run.
  3. Delete the state file and proceed.
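
The decision in steps 1 and 2 can be sketched like this. isProcessAlive is named in the source layout, but this body is an assumption based on the kill -0 probe described above; the StateFile field name supervisorPid and the orphansToReap helper are hypothetical.

```typescript
// Signal 0 performs the existence and permission checks without delivering anything.
function isProcessAlive(pid: number): boolean {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM: the PID exists but belongs to another user. ESRCH: no such process.
    return (err as { code?: string }).code === "EPERM";
  }
}

interface StateFile {
  supervisorPid: number;
  children: { pid: number; name: string }[];
}

// Returns the recorded children that should be tree-killed on startup.
function orphansToReap(
  state: StateFile,
  alive: (pid: number) => boolean = isProcessAlive,
): { pid: number; name: string }[] {
  // Previous supervisor still running: reap nothing and let listen() fail.
  if (alive(state.supervisorPid)) return [];
  return state.children.filter((c) => alive(c.pid));
}
```

Injecting the alive probe keeps the decision testable without spawning real processes.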

What this buys you: after kill -9 <supervisor> / OOM / unhandled exception, the next supervisor start cleans up. No more "port 3000 still in use" mystery two minutes after a crash.

What it doesn't buy you: the orphans stay alive in the window between crash and next start. True "die-with-parent" semantics need PR_SET_PDEATHSIG (Linux-only, needs a C shim) or JobObject (Windows-only). We don't ship either for cross-platform simplicity — if you care, run the supervisor under systemd --user with Restart=always so the gap is a couple of seconds.

PID reuse caveat: between crash and reap the OS can recycle a PID to a different process. We don't have a cheap cross-platform way to verify (Linux could read /proc/<pid>/stat start time; macOS/Windows need more work). In practice false-positive reaps require reboot + PID exhaustion + the same PID landing on a different supervisor's unrelated child, all within your chosen TMPDIR lifetime — vanishingly unlikely.

Environment isolation

Before spawning any child, PROC_MCP_* variables are filtered out of the inherited env. This means PROC_MCP_API_KEY, PROC_MCP_CORS_ORIGIN, the port, host, and every cap are NOT visible to your dev servers. They stay inside the supervisor. The user's explicit env: additions pass through unchanged.

We also set extendEnv: false on execa, otherwise it would merge process.env back in and undo the filtering.
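
A minimal sketch of the filtering step, assuming a helper named childEnv (hypothetical). It drops inherited PROC_MCP_* variables and then layers the caller's explicit env on top, so explicit additions pass through unchanged.

```typescript
// Hypothetical helper: build the environment handed to a spawned child.
function childEnv(
  parent: Record<string, string | undefined>,
  extra: Record<string, string> = {},
): Record<string, string> {
  const env: Record<string, string> = {};
  for (const [key, value] of Object.entries(parent)) {
    // Skip unset values and anything that configures the supervisor itself.
    if (value === undefined || key.startsWith("PROC_MCP_")) continue;
    env[key] = value;
  }
  // Explicit per-spawn additions win over inherited values.
  return { ...env, ...extra };
}
```

The result would be passed as the spawn env together with extendEnv: false, since otherwise the parent environment is merged back in.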


Security model

The server executes arbitrary commands on behalf of anyone who reaches /mcp. The defaults are chosen to make that safe; deviating from them requires explicit opt-in.

Refuse-to-start

If host ≠ 127.0.0.1/::1/localhost and no API key is set, the process exits with code 2 and an explanation. The only way past this is --allow-insecure, which is a deliberate opt-in.
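
The rule reduces to a small predicate. hostIsLoopback is named in the source layout, but the body and the startupExitCode wrapper below are assumptions for illustration.

```typescript
function hostIsLoopback(host: string): boolean {
  const h = host.trim().toLowerCase();
  return h === "127.0.0.1" || h === "::1" || h === "localhost";
}

interface StartupConfig {
  host: string;
  apiKey?: string;
  allowInsecure: boolean;
}

// Returns null when startup may continue, or the exit code to terminate with.
function startupExitCode(cfg: StartupConfig): number | null {
  if (hostIsLoopback(cfg.host)) return null;                      // loopback: auth optional
  if (cfg.apiKey !== undefined && cfg.apiKey !== "") return null; // routable but authenticated
  if (cfg.allowInsecure) return null;                             // deliberate opt-in
  return 2;
}
```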

CSRF defense

Three layers prevent a malicious website from invoking /mcp via a victim's browser:

  1. Content-Type: application/json required on POST. This type is not CORS-safelisted, so browsers must send a preflight (OPTIONS) first. We don't answer preflights, so the browser never sends the actual POST.
  2. Origin allowlist. If the request has an Origin header, it must match cors_origin (default null — only no-Origin clients like curl and Claude Code are allowed). Set to an explicit list or * to open up.
  3. Bearer API key compared with crypto.timingSafeEqual so token guessing can't be accelerated by short-circuit string comparison.
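
For the third layer, a constant-time comparison might look like the sketch below. safeStringEq is named in the source layout as a timingSafeEqual wrapper; hashing both sides first is an assumption made here because crypto.timingSafeEqual throws when its inputs differ in length. bearerOk is a hypothetical helper.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

function safeStringEq(a: string, b: string): boolean {
  // SHA-256 digests are always 32 bytes, so the lengths always match.
  const ha = createHash("sha256").update(a, "utf8").digest();
  const hb = createHash("sha256").update(b, "utf8").digest();
  return timingSafeEqual(ha, hb);
}

function bearerOk(header: string | undefined, apiKey: string): boolean {
  const prefix = "Bearer ";
  if (header === undefined || !header.startsWith(prefix)) return false;
  return safeStringEq(header.slice(prefix.length), apiKey);
}
```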

Resource caps

  • max_sessions (default 100) — cap on concurrent MCP sessions.
  • max_processes (default 200) — cap on registry entries.
  • log_total_mb (default 200 MB) — global log memory soft cap.
  • log_line_max (default 10 KB) — per-line truncation.
  • MAX_BODY_BYTES (1 MB) — per-request body cap.
  • Per-tool zod .max(…) on every user-supplied string (cmd 8 KB, cwd 4 KB, env values 8 KB each, grep/pattern 512 B, where 1 KB, input 1 MB).

No shell, ever

proc_start and proc_exec never invoke /bin/sh for you. String cmd is parsed by execa's shell-like splitter (tokens + quotes, no variables/pipes/redirects). Want a shell? Pass it explicitly: ["/bin/sh", "-c", "foo | bar"]. This removes an entire class of injection bugs.

Tree-kill

Children are spawned in their own process group on POSIX (detached: true, PGID = PID). Stop signals the group (kill -pgid). On Windows taskkill /pid <pid> /T walks the child tree.

ReDoS surface (residual risk)

grep, pattern, and where's regex form pass user-supplied regexes to new RegExp. Node's backtracking engine is vulnerable to catastrophic patterns like (a+)+b. Length caps (512 B / 1 KB) reduce the surface but don't eliminate it. Don't expose process-mcp to untrusted callers.

Non-root Docker

The image runs as node:node (UID 1000), not root. tini is PID 1 for zombie reaping. A healthcheck pings /health every 30 s.

Health endpoint

GET /health is unauthenticated and does not reveal command history, log contents, or session IDs. It returns:

{
  "status": "ok",
  "uptime_ms": 123456,
  "sessions": 2,
  "processes": { "total": 5, "running": 4, "exited": 1 },
  "config": { "host": "127.0.0.1", "port": 7778, "auth": "on" }
}

Platform notes

  • macOS, Linux. Full support. proc_port_who uses lsof. Tree-kill via POSIX process groups.
  • Windows. Full support. proc_port_who uses netstat -ano + tasklist. Tree-kill via taskkill /T. No shell is ever invoked — ["cmd", "/c", "…"] if you need one.
  • Docker (node:24-slim base). lsof, procps, tini, ca-certificates preinstalled. Multi-arch (amd64 + arm64). Images at ghcr.io/graph-memory/process-mcp on each version tag.
  • Processes you start in Docker run inside the container, not on the host. If you need host tooling (your project's npm, go, cargo) install it in a derived image or run the server directly on the host.

Docker usage

# Generate a key once, reuse it.
export PROC_MCP_API_KEY=$(openssl rand -hex 32)

docker run --rm -p 7778:7778 -e PROC_MCP_API_KEY \
  ghcr.io/graph-memory/process-mcp:latest

# Or compose. Put PROC_MCP_API_KEY in .env next to docker-compose.yml.
docker compose up

The container binds to 0.0.0.0 and refuses to start without an API key. That's intentional: exposing /mcp unauthenticated on a routable interface is remote-code-execution-as-a-service.


Development

git clone https://github.com/graph-memory/process-mcp
cd process-mcp
npm install

npm run dev                              # tsx, auto-reload
npm run build && npm start               # production build

npx tsc --noEmit                         # typecheck only

Source layout

src/
  index.ts              HTTP server, session mgmt, CSRF, auth, /health, shutdown
  config.ts             commander + env parsing, validation, hostIsLoopback
  log.ts                supervisor's own logger (pino-style stderr)
  registry.ts           Map<name, ProcEntry>, spawn/stop/restart/remove, sweeper
  logs.ts               LogRing, line-splitter, JSON enrichment, where DSL
  lib/
    auth.ts             safeStringEq (timingSafeEqual wrapper)
    kill.ts             killTree (POSIX pgroup + Windows taskkill)
    lock.ts             withLock — per-key promise-chain serialization
    state.ts            on-disk state file (stateFilePath, read/write/remove, isProcessAlive)
    reaper.ts           startup reaper: tree-kill orphans from previous crash
  tools/
    common.ts           ToolResult helpers, formatEntrySummary, uptime
    start.ts            proc_start
    stop.ts             proc_stop
    restart.ts          proc_restart
    list.ts             proc_list (fixed-width table)
    remove.ts           proc_remove
    signal.ts           proc_signal (send arbitrary signal without stopping)
    stdin.ts            proc_stdin (write to stdin of processes started with stdin: "pipe")
    logs.ts             proc_logs (filters, 4 formats, pretty-renderer)
    wait_ready.ts       proc_wait_ready (URL + pattern, death-fast-fail)
    port_who.ts         proc_port_who (lsof / netstat+tasklist)
    exec.ts             proc_exec (BoundedBuffer, tree-kill timeout)

CI

.github/workflows/:

  • ci.yml — matrix {ubuntu, macos, windows} × node 24: npm ci, npm run build, boot the compiled dist, hit initialize via curl.
  • docker.yml — on v* tag, builds & pushes multi-arch image to ghcr.io/graph-memory/process-mcp.
  • publish.yml — on v* tag, publishes @graphmemory/process-mcp to npm with provenance.

Why no tests?

Intentional for 0.1. Integration tests are valuable but expensive to write for a supervisor (need to spawn real processes, exercise platform-specific code). All features were manually smoke-tested end-to-end through HTTP (see docs/testing-notes.md for the script if present). Contributions adding a vitest harness are welcome.


FAQ

Q: Does the registry survive a supervisor restart? A: No. In-memory only. On graceful shutdown (SIGTERM/SIGINT) the supervisor tree-kills every child before exiting. On a crash (SIGKILL / OOM) it doesn't get the chance, but the next supervisor start reaps surviving children via the state file in $TMPDIR/process-mcp-<uid>/ — see "Crash-recovery reaper" in the Architecture section. The gap between crash and next start is the only window where orphans linger; pair process-mcp with systemd --user Restart=always if you want that gap closed to seconds.

Q: Can I run multiple process-mcp instances? A: Yes, on different ports. They share nothing. Useful for isolating projects or separating "stable" and "scratch" namespaces.

Q: Does it support Docker-in-Docker / nested containers? A: Not specifically. But you can proc_start a docker run or docker compose up command — that's a regular foreground process from our perspective. For managing pre-existing containers we'd need dedicated tools (not in V1 — see docs/adr/ when it's written).

Q: What happens when a process emits 100 MB/s of logs? A: Each line hits the ring buffer in O(1); oldest lines are dropped silently and reported in proc_logs' header. The global log_total_mb cap kicks in every 30 s and halves the biggest ring. You won't OOM, but you might lose logs faster than you expect — tune log_lines higher for chatty processes.

Q: My npm run x script spawns grandchildren. Are they killed on proc_stop? A: Yes, as long as they inherit the process group. Scripts that deliberately setsid or detach their own children break this guarantee (by design: a signal sent to the group can't reach processes that have left it). In practice, most dev tools stay in the group and are killed as expected.

Q: Can I stream logs in real time? A: Not as a tool. proc_wait_ready { pattern: "..." } subscribes to new lines internally for the ready-check, but there's no proc_tail_stream tool in V1 — the MCP text-content model doesn't fit streaming well. Poll proc_logs with since instead.

Q: Why is auth off by default? A: The default host is 127.0.0.1 (loopback). Only you can reach it, and you already have the same privileges as the server process. Adding mandatory auth there is friction without a threat model. On non-loopback hosts, auth becomes mandatory (refuse-to-start).

Q: Is this safe to expose over Tailscale / WireGuard? A: Yes. Set PROC_MCP_API_KEY and process-mcp will accept binding to 0.0.0.0 (the refuse-to-start check passes). On the client, use Bearer auth. The tunnel adds transport encryption on top.


License

Elastic License 2.0. See LICENSE.