SaladDay/tui-pilot

tui-pilot

Give your AI agent eyes and hands inside any terminal UI.

macOS only · Node.js >= 20.19 · TypeScript strict · MCP compatible

Chinese documentation

[Image: AI controlling a terminal UI application]

Think Playwright, but for terminal apps instead of browsers.

tui-pilot is an MCP server that lets AI agents launch, observe, and interact with real terminal applications on macOS. It runs the target app inside tmux, renders it in a real terminal window, captures pixel-perfect PNG screenshots via macOS native APIs, and exposes everything through a clean set of MCP tools.

No ANSI re-rendering. No fake terminal emulation. What your agent sees is exactly what a human would see.

Architecture

[Diagram: tui-pilot architecture, MCP Server → tmux / Terminal / macOS]

Three planes work together under a unified MCP interface:

| Plane | Backed by | Responsibility |
| --- | --- | --- |
| Control | tmux | Session lifecycle, key dispatch, text capture |
| Render | WezTerm / Ghostty | Real terminal rendering with GPU acceleration |
| Screenshot | Swift + CoreGraphics | Native window discovery and pixel-perfect PNG capture |

Workflow

Preflight → Launch → Snapshot ↔ Interact → Cleanup

A real session: OpenCode drives cc-switch through tui-pilot, with the agent trace on the left and the live terminal window on the right.

| Step | Tool | What happens |
| --- | --- | --- |
| Preflight | tui_doctor | Verify dependencies & permissions |
| Launch | tui_start | Create a tmux session + attach a terminal window |
| Snapshot | tui_snapshot | Capture plain text + ANSI text + real PNG screenshot |
| Interact | tui_send_keys / tui_type | Send keystrokes or type text |
| Cleanup | tui_stop | Graceful session teardown |

The Snapshot and Interact steps (steps 3 and 4) form an observe-act loop: snapshot the current state, decide on input, send it, then snapshot again, as many times as needed.
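
The observe-act loop can be sketched in TypeScript as follows. This is a minimal sketch, not tui-pilot's implementation: the `TuiClient` interface, its method names, the `decide` callback, and the in-memory fake menu are all assumptions made for illustration. A real agent would call `tui_snapshot` and `tui_send_keys` through its MCP client instead.

```typescript
// Hypothetical client surface; a real agent would invoke the MCP tools instead.
interface Snapshot {
  textView: string;
}

interface TuiClient {
  snapshot(): Promise<Snapshot>; // stands in for tui_snapshot
  sendKeys(keys: string[]): Promise<void>; // stands in for tui_send_keys
}

// Repeat snapshot -> decide -> act until `decide` returns null
// or the step budget is exhausted. Returns the last snapshot seen.
async function observeActLoop(
  client: TuiClient,
  decide: (snap: Snapshot) => string[] | null,
  maxSteps = 20,
): Promise<Snapshot> {
  let snap = await client.snapshot();
  for (let step = 0; step < maxSteps; step++) {
    const keys = decide(snap);
    if (keys === null) break; // goal reached, stop acting
    await client.sendKeys(keys);
    snap = await client.snapshot(); // observe the effect of the input
  }
  return snap;
}

// In-memory fake menu so the sketch runs without tmux or a terminal window.
function makeFakeMenu(items: string[]): TuiClient {
  let selected = 0;
  return {
    async snapshot() {
      const lines = items.map((item, i) => (i === selected ? "> " : "  ") + item);
      return { textView: lines.join("\n") };
    },
    async sendKeys(keys) {
      for (const key of keys) {
        if (key === "Down") selected = Math.min(selected + 1, items.length - 1);
        if (key === "Up") selected = Math.max(selected - 1, 0);
      }
    },
  };
}
```

The `maxSteps` budget is a design choice of this sketch: it keeps a confused agent from sending keys forever when the screen never reaches the expected state.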

Tools

| Tool | Description |
| --- | --- |
| tui_doctor | Inspect dependencies, backend selection, GUI heuristics, and permission checks |
| tui_start | Start a tmux-backed session and attach a new terminal window |
| tui_send_keys | Send named key presses such as Down, Up, Enter, and Escape |
| tui_type | Send literal text via tmux send-keys -l |
| tui_snapshot | Capture plain text, ANSI text, and a PNG screenshot in one call |
| tui_stop | Stop the tmux session and release all resources |
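
The difference between tui_send_keys and tui_type corresponds to running tmux's `send-keys` without and with the `-l` flag. The sketch below only builds the argument lists. The function names and the `target` parameter are hypothetical, and apart from `tui_type` using `send-keys -l`, this README does not show how tui-pilot assembles its tmux commands.

```typescript
// Build argument lists for `tmux send-keys`.
// `target` is a tmux target such as a session name (assumed for illustration).

function namedKeysArgv(target: string, keys: string[]): string[] {
  // Without -l, tmux looks each argument up as a key name (Down, Enter, Escape, ...).
  return ["send-keys", "-t", target, ...keys];
}

function literalTextArgv(target: string, text: string): string[] {
  // With -l, tmux sends the characters literally,
  // so the text "Enter" is typed as five letters rather than pressing Enter.
  return ["send-keys", "-t", target, "-l", text];
}
```

Either list could then be passed to Node's `child_process.execFile("tmux", argv)`. Passing arguments as an array avoids shell quoting problems with typed text.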

Tip

Run tui_doctor first if tui_start or tui_snapshot fails. It will tell you which backend was selected and remind you to grant Screen Recording permission.

Requirements

  • macOS with an active GUI session
  • Node.js 20.19+
  • tmux
  • WezTerm or Ghostty
  • swiftc (ships with Xcode Command Line Tools)
  • Screen Recording permission for the app that launches tui-pilot

Note

If screenshots fail with permission errors, grant Screen Recording to whichever app is spawning the server — Terminal, iTerm, or your MCP client.

Getting started

1. Ask your AI to install it (recommended)

If you use Claude Code, OpenCode, or another MCP-aware agent, paste one of these prompts.

MCP only

Install `tui-pilot` into my MCP client. Use `/absolute/path/to/tui-pilot` as the project path, build anything that is needed, register it as a local stdio MCP server, and then run `tui_doctor`. Do not install the optional skill.

MCP + optional skill

Install `tui-pilot` into my MCP client. Use `/absolute/path/to/tui-pilot` as the project path, build anything that is needed, register it as a local stdio MCP server, and also install the optional local skill `tui-pilot-visual-check` if my client supports skills. After setup, run `tui_doctor` and tell me what still needs manual approval.

Install the optional skill only if your client supports local skills.

Other installation options

Build the server

npm install
./scripts/build-window-helper.sh
npm run build

Register the MCP server

Point your MCP client at the built server:

node /absolute/path/to/tui-pilot/dist/index.js

For development, you can point the client at npm run dev instead. In clients that store the command and its arguments separately, set the command to npm and the args to run and dev.

OpenCode example:

{
  "mcp": {
    "tui-pilot": {
      "type": "local",
      "enabled": true,
      "command": ["node", "/absolute/path/to/tui-pilot/dist/index.js"],
      "timeout": 30000
    }
  }
}

Claude Desktop example:

{
  "mcpServers": {
    "tui-pilot": {
      "command": "node",
      "args": ["/absolute/path/to/tui-pilot/dist/index.js"]
    }
  }
}

If you want to force a render backend for this server process, set TUI_PILOT_TERMINAL_BACKEND to wezterm or ghostty in your client config.
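
As one way to do this, the sketch below generates a Claude Desktop style entry with the variable set. It assumes the client accepts an `env` map on each server entry, as Claude Desktop's `mcpServers` format does. Other clients may use a different key, so check your client's documentation. The helper name `tuiPilotServerEntry` is hypothetical.

```typescript
type BackendSetting = "auto" | "wezterm" | "ghostty";

// Build an `mcpServers` entry that pins the render backend for the server process.
// Assumption: the client passes `env` through to the spawned process.
function tuiPilotServerEntry(serverPath: string, backend: BackendSetting) {
  return {
    mcpServers: {
      "tui-pilot": {
        command: "node",
        args: [serverPath],
        env: { TUI_PILOT_TERMINAL_BACKEND: backend },
      },
    },
  };
}

// Print JSON that can be merged into the client config by hand.
console.log(
  JSON.stringify(
    tuiPilotServerEntry("/absolute/path/to/tui-pilot/dist/index.js", "ghostty"),
    null,
    2,
  ),
);
```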

Optional skill

The repo includes a local skill at .agents/skills/tui-pilot-visual-check. For OpenCode, copy it into the user skills directory:

mkdir -p ~/.config/opencode/skills
cp -R .agents/skills/tui-pilot-visual-check ~/.config/opencode/skills/

Restart the MCP client, or open a new session, so it reloads the MCP config and optional skill.

Verify the install

Run tui_doctor with no arguments.

Confirm:

  • automaticChecksPassed is true
  • backend.selected is the terminal backend you expect
  • manualChecksRequired includes screen-recording

tui_doctor does not verify Screen Recording permission automatically, so do one live check with tui_snapshot after setup.
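
The three checks above can be automated with a small helper. This is a sketch under stated assumptions: the `DoctorReport` shape is inferred from the field names in this section, the full tui_doctor payload may contain more fields, and `checkDoctorReport` is a hypothetical name.

```typescript
// Assumed shape: only the three fields named in this README are modeled.
interface DoctorReport {
  automaticChecksPassed: boolean;
  backend: { selected: string };
  manualChecksRequired: string[];
}

// Return a list of human-readable problems; an empty list means all three checks hold.
function checkDoctorReport(report: DoctorReport, expectedBackend: string): string[] {
  const problems: string[] = [];
  if (!report.automaticChecksPassed) {
    problems.push("automaticChecksPassed is not true");
  }
  if (report.backend.selected !== expectedBackend) {
    problems.push(`backend.selected is ${report.backend.selected}, expected ${expectedBackend}`);
  }
  if (report.manualChecksRequired.indexOf("screen-recording") === -1) {
    problems.push("manualChecksRequired does not include screen-recording");
  }
  return problems;
}
```

An empty result still leaves the Screen Recording check to the live tui_snapshot call, since that permission is only listed as a manual check.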

Usage

Development mode (auto-reloads on save):

npm run dev

Production mode:

npm run build
node dist/index.js

Backend selection

tui-pilot auto-detects a render backend in this order: WezTerm → Ghostty.

Override with an environment variable:

TUI_PILOT_TERMINAL_BACKEND=ghostty npm run dev

Supported values: auto, wezterm, ghostty.
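
The selection rule can be sketched as follows. This is an illustration, not tui-pilot's code. The `resolveBackend` name is hypothetical, the caller supplies the set of installed terminals because detection is not described here, and throwing on an unsupported or missing backend is an assumption about error handling.

```typescript
type RenderBackend = "wezterm" | "ghostty";

// Documented auto-detect order: WezTerm first, then Ghostty.
const AUTO_ORDER: readonly RenderBackend[] = ["wezterm", "ghostty"];

function isRenderBackend(value: string): value is RenderBackend {
  return value === "wezterm" || value === "ghostty";
}

// Resolve the backend from the TUI_PILOT_TERMINAL_BACKEND value and the
// set of terminals the caller has found on the machine.
function resolveBackend(
  envValue: string | undefined,
  installed: ReadonlySet<RenderBackend>,
): RenderBackend {
  const requested = envValue ?? "auto"; // unset behaves like "auto"
  if (requested === "auto") {
    const found = AUTO_ORDER.find((backend) => installed.has(backend));
    if (found === undefined) throw new Error("no supported terminal backend found");
    return found;
  }
  if (!isRenderBackend(requested)) {
    throw new Error(`unsupported TUI_PILOT_TERMINAL_BACKEND: ${requested}`);
  }
  if (!installed.has(requested)) {
    throw new Error(`${requested} was requested but is not installed`);
  }
  return requested;
}
```

For example, `resolveBackend(process.env.TUI_PILOT_TERMINAL_BACKEND, installed)` would pick WezTerm when both terminals are present and the variable is unset.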

Quick example

The repo includes fixtures/mini-tui.ts, a keyboard-driven menu for testing:

1. tui_doctor      → confirm automaticChecksPassed is true
2. tui_start       → launch the fixture app
3. tui_snapshot    → read textView + inspect the PNG
4. tui_send_keys   → send "Down"
5. tui_snapshot    → confirm the selection moved
6. tui_stop        → clean up

Screenshots and helper binaries live under .tui-pilot/.

Testing

npm test              # run all tests
npm run typecheck     # type-check without emitting

Roadmap

  • macOS support (WezTerm / Ghostty): available
  • Linux support (X11 / Wayland screenshot backends): planned
  • Windows support (Windows Terminal + native capture): planned

Note

Cross-platform support for Linux and Windows is planned. Stay tuned!
