Standalone local computer-use MCP server for macOS.
The current capture flow is optimized for MCP clients that can consume inline image attachments. Saved local image paths remain available through capture_metadata when a client wants file-based follow-up such as view_image(imagePath).
This repository implements:
- standalone MCP server
- stdio transport first
- session-owned state
- desktop lock
- permission and app approval coordination
- native host seam with Swift bridge services and a selectable input backend (swift, rust, fake)
- fake mode for development and testing on non-macOS hosts
screenshot and zoom now do two things directly:
- attach the captured image inline in the MCP content array
- return a text item containing captureId=...
Geometry and saved-file metadata are retrieved separately through capture_metadata.
The intended consumer flow is:
- call screenshot or zoom
- inspect the inline MCP image attachment
- if geometry or file metadata is needed, call capture_metadata(captureId)
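A consumer following this flow needs to separate the inline image from the captureId text item. The sketch below shows one way to do that. The helper name parseCaptureResult and the local types are hypothetical, and it assumes the id is a run of non-whitespace characters following captureId= in the text item.

```typescript
// Local stand-ins for MCP text and image content items (not imported from an SDK).
type TextItem = { type: "text"; text: string };
type ImageItem = { type: "image"; data: string; mimeType: string };
type ContentItem = TextItem | ImageItem;

interface ParsedCapture {
  captureId: string | null; // null when no text item carries captureId=...
  image: ImageItem | null;  // null when no inline image was attached
}

// Hypothetical helper: split a screenshot/zoom result into its two parts.
function parseCaptureResult(content: ContentItem[]): ParsedCapture {
  let captureId: string | null = null;
  let image: ImageItem | null = null;
  for (const item of content) {
    if (item.type === "image") {
      // Keep the first image item only.
      if (image === null) image = item;
    } else if (captureId === null) {
      // Assumption: the id is the non-whitespace run after "captureId=".
      const match = /captureId=(\S+)/.exec(item.text);
      if (match) captureId = match[1] ?? null;
    }
  }
  return { captureId, image };
}
```

The returned captureId is what a later capture_metadata call would take when geometry or file metadata is needed.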
For clients that prefer file-based image viewing, the fallback flow is:
- call screenshot or zoom
- extract the captureId from the text item
- call capture_metadata(captureId)
- read structuredContent.imagePath
- call view_image(imagePath) if needed
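The fallback steps above can be sketched as a single async helper. This is illustrative only: CallTool is a hypothetical hook standing in for whatever your MCP client uses to send a tool call, the captureId argument name passed to capture_metadata is an assumption, and the result type models only the fields this flow reads.

```typescript
// Only the fields this flow touches are modeled here.
type ToolResult = {
  content: Array<{ type: string; text?: string }>;
  structuredContent?: { imagePath?: string };
};

// Hypothetical client hook: performs one MCP tool call and resolves with its result.
type CallTool = (name: string, args: Record<string, unknown>) => Promise<ToolResult>;

async function captureToImagePath(
  callTool: CallTool,
  tool: "screenshot" | "zoom",
  args: Record<string, unknown> = {},
): Promise<string> {
  // Step 1: take the capture.
  const capture = await callTool(tool, args);

  // Step 2: extract the captureId from the text item.
  let captureId: string | undefined;
  for (const item of capture.content) {
    if (item.type !== "text") continue;
    const match = /captureId=(\S+)/.exec(item.text ?? "");
    if (match) {
      captureId = match[1];
      break;
    }
  }
  if (captureId === undefined) {
    throw new Error(`${tool} result did not include a captureId text item`);
  }

  // Step 3: fetch metadata (argument name is an assumption).
  const meta = await callTool("capture_metadata", { captureId });

  // Step 4: read the saved file path.
  const imagePath = meta.structuredContent?.imagePath;
  if (!imagePath) {
    throw new Error(`capture_metadata returned no imagePath for ${captureId}`);
  }
  return imagePath;
}
```

The resolved path is what a file-based client would then hand to view_image.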
The server does not expose capture delivery through MCP resources/read. Screenshot image data is delivered as an MCP image content item, not as plain base64 text in structuredContent.
- TypeScript MCP server
- protocol-compatible stdio JSON-RPC transport
- Streamable HTTP transport
- full current tool surface:
  - request_access
  - screenshot
  - list_displays
  - select_display
  - zoom
  - capture_metadata
  - cursor_position
  - mouse_move
  - left_click
  - right_click
  - middle_click
  - double_click
  - triple_click
  - left_click_drag
  - scroll
  - key
  - hold_key
  - type
  - read_clipboard
  - write_clipboard
  - search_applications
  - open_application
  - list_granted_applications
  - wait
  - computer_batch
- file-backed desktop lock
- session store
- fake native mode
- unit tests and transport end-to-end tests using Node's built-in test runner
- Swift native bridge executable (ComputerUseBridge) for:
  - ScreenCaptureKit screenshots
  - TCC checks and System Settings deep links
  - NSWorkspace app operations
  - CGEvent-based mouse and keyboard injection (legacy fallback input path)
  - NSPasteboard clipboard access
  - running app and window/display inspection
- Rust input backend package (@agenai/native-input) for:
  - optional desktop input routing via COMPUTER_USE_INPUT_BACKEND=rust
  - mouse, key, type, and scroll injection through a local N-API addon
- approval UI bridge package (ApprovalUIBridge) for local request prompts
- host SDK stubs
The TypeScript server has a runnable fake mode for development and testing.
The real macOS path requires:
- the Swift bridge (ComputerUseBridge) for screenshots/apps/TCC/clipboard/hotkeys
- the approval helper (ApprovalUIBridge) for local request_access prompts
- optionally, the Rust input addon when selecting COMPUTER_USE_INPUT_BACKEND=rust
agent / MCP client
-> stdio MCP server (TypeScript)
-> tool registry + session store + approval coordinator + desktop lock + native host selector
-> Swift bridge client (screenshots/apps/TCC/clipboard/hotkeys)
-> input backend (swift|rust|fake)
-> ComputerUseBridge (Swift executable) OR @agenai/native-input (Rust addon)
- Swift remains the default integrated native bridge path for broad macOS surface area
- Rust input is available as an additive backend choice for input-focused iteration
- the Node server keeps the MCP surface thin
- Swift helper executable owns AppKit / ScreenCaptureKit / CoreGraphics interactions outside input-addon overrides
- the Node process does not need a Cocoa run-loop pump because native work happens in the helper process
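To make the idea of a selectable input backend concrete, here is a minimal sketch of what such a seam could look like on the TypeScript side. The interface and class names are hypothetical and are not taken from the repository; the method set simply follows the mouse, key, type, and scroll operations listed earlier.

```typescript
// Hypothetical seam: operations an input backend would need to provide.
interface InputBackend {
  readonly name: "swift" | "rust" | "fake";
  mouseMove(x: number, y: number): void;
  key(combo: string): void;
  typeText(text: string): void;
  scroll(dx: number, dy: number): void;
}

// A fake backend that records calls instead of injecting input,
// which is the kind of stand-in a fake mode on non-macOS hosts could use.
class FakeInputBackend implements InputBackend {
  readonly name = "fake" as const;
  readonly events: string[] = [];

  mouseMove(x: number, y: number): void {
    this.events.push(`mouse_move ${x},${y}`);
  }
  key(combo: string): void {
    this.events.push(`key ${combo}`);
  }
  typeText(text: string): void {
    this.events.push(`type ${JSON.stringify(text)}`);
  }
  scroll(dx: number, dy: number): void {
    this.events.push(`scroll ${dx},${dy}`);
  }
}
```

Because callers depend only on the interface, a Swift-backed or Rust-backed implementation could be swapped in without touching the tool registry. A test can assert on the recorded events array rather than on real desktop side effects.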
npm run build
npm test

swift build --package-path packages/native-swift -c release
swift build --package-path packages/approval-ui-macos -c release

npm --prefix packages/native-input run build
npm --prefix packages/native-input test

COMPUTER_USE_FAKE=1 node dist/computer-use-mcp/src/main.js

node dist/computer-use-mcp/src/main.js

Optional environment variables:
- COMPUTER_USE_FAKE=1
- COMPUTER_USE_LOCK_PATH=/custom/path/desktop.lock
- COMPUTER_USE_CAPTURE_ASSET_ROOT=/custom/path/captures
- COMPUTER_USE_SWIFT_BRIDGE_PATH=/absolute/path/to/ComputerUseBridge
- COMPUTER_USE_APPROVAL_UI_PATH=/absolute/path/to/ApprovalUIBridge
- COMPUTER_USE_INPUT_BACKEND=swift|rust|fake
- COMPUTER_USE_RUST_INPUT_PATH=/absolute/path/to/native-input.node
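As an illustration of how these variables might be consumed, the sketch below reads them from an environment map and validates the backend choice. The function and type names are hypothetical. The defaulting rules (swift when nothing is set, fake when COMPUTER_USE_FAKE=1 is set without an explicit backend) are assumptions and may not match the server's actual behavior.

```typescript
type InputBackendName = "swift" | "rust" | "fake";

interface EnvConfig {
  fake: boolean;
  inputBackend: InputBackendName;
  lockPath: string | undefined;
  captureAssetRoot: string | undefined;
  swiftBridgePath: string | undefined;
  approvalUiPath: string | undefined;
  rustInputPath: string | undefined;
}

function isInputBackendName(value: string): value is InputBackendName {
  return value === "swift" || value === "rust" || value === "fake";
}

// Hypothetical reader; pass process.env (or a plain object in tests).
function readEnvConfig(env: Record<string, string | undefined>): EnvConfig {
  const fake = env["COMPUTER_USE_FAKE"] === "1";
  const rawBackend = env["COMPUTER_USE_INPUT_BACKEND"];

  // Assumed defaults: fake mode implies the fake backend, otherwise swift.
  let inputBackend: InputBackendName = fake ? "fake" : "swift";
  if (rawBackend !== undefined) {
    if (!isInputBackendName(rawBackend)) {
      throw new Error(
        `COMPUTER_USE_INPUT_BACKEND must be swift, rust, or fake (got "${rawBackend}")`,
      );
    }
    inputBackend = rawBackend;
  }

  return {
    fake,
    inputBackend,
    lockPath: env["COMPUTER_USE_LOCK_PATH"],
    captureAssetRoot: env["COMPUTER_USE_CAPTURE_ASSET_ROOT"],
    swiftBridgePath: env["COMPUTER_USE_SWIFT_BRIDGE_PATH"],
    approvalUiPath: env["COMPUTER_USE_APPROVAL_UI_PATH"],
    rustInputPath: env["COMPUTER_USE_RUST_INPUT_PATH"],
  };
}
```

Rejecting an unknown backend name at startup surfaces a typo immediately instead of silently falling back to a different input path.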
- docs/: historical planning/spec documents plus current capture contract note
- packages/computer-use-mcp/: TypeScript server
- packages/native-swift/: real macOS bridge executable
- packages/approval-ui-macos/: local approval helper executable
- packages/host-sdk/: host callback contract stubs
- packages/native-input/: Rust N-API input backend package