A Model Context Protocol (MCP) server that enables AI coding agents to capture screenshots of macOS windows and displays on demand.
LLMs are amazing at using images for context. You can feed image files to an LLM and it can do things like analyze a design or read text. I constantly find myself wanting to "show" LLMs what I'm looking at, but it was cumbersome to take a screenshot, find the file, and give the path to the LLM. I also ended up with thousands of screenshots to manage over time. So I thought, why can't the LLM just do this itself? That question led to this project.
- Window Discovery - List all open windows with metadata (title, app, bounds, display)
- Window Capture - Capture screenshots of specific windows by ID
- Display Capture - Capture entire displays (single or all)
- Smart Filtering - Automatically filters out system overlays and utility windows
- Natural Integration - Works seamlessly with any MCP-compatible AI agent
- Privacy First - Runs entirely locally on your Mac
- Professional Logging - Structured logging with timestamps for debugging
- macOS: 12.0+ (Monterey or later)
- Architecture: Intel (x64) or Apple Silicon (arm64)
- Node.js: 16.0.0 or higher
- Permissions: Screen Recording permission required
npm install -g mac-vision-mcp
Or run it directly with npx, without a global install:
npx -y mac-vision-mcp
On first run, macOS will prompt you to grant Screen Recording permission:
- Open System Settings (System Preferences on macOS 12)
- Go to Privacy & Security > Screen Recording (Security & Privacy > Privacy > Screen Recording on macOS 12)
- Enable permission for the application running the MCP server
- Restart the MCP server
Add to .mcp.json in your project:
{
  "mcpServers": {
    "mac-vision": {
      "command": "npx",
      "args": ["-y", "mac-vision-mcp"]
    }
  }
}
Add to ~/.cursor/mcp.json:
{
  "mcpServers": {
    "mac-vision": {
      "command": "npx",
      "args": ["-y", "mac-vision-mcp"]
    }
  }
}
Add to ~/Library/Application Support/Claude/claude_desktop_config.json:
{
  "mcpServers": {
    "mac-vision": {
      "command": "npx",
      "args": ["-y", "mac-vision-mcp"]
    }
  }
}
Once configured, your AI agent can use natural language to capture screenshots:
User: "Show me my Chrome window with the error"
Agent: [calls list_windows]
Agent: [calls capture_window with the Chrome window ID]
Agent: "I can see the 404 error in your browser..."
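In the exchange above, the agent itself chooses which window ID to pass to capture_window. For a scripted client, that step might be sketched as follows. The sketch assumes the list_windows result has the shape documented in this README; `WindowInfo`, `findWindowsByApp`, and the sample data are hypothetical names and values for illustration, not part of mac-vision-mcp.

```typescript
// Shape of one entry in the list_windows result, as documented in this README.
interface WindowInfo {
  id: string;
  title: string;
  app: string;
  bounds: { x: number; y: number; width: number; height: number };
  display: number;
}

// Hypothetical helper: case-insensitive substring match on the app name.
function findWindowsByApp(windows: WindowInfo[], appQuery: string): WindowInfo[] {
  const query = appQuery.toLowerCase();
  return windows.filter((w) => w.app.toLowerCase().includes(query));
}

// Sample data modeled on the README examples (the second entry's bounds and
// display are made up for illustration).
const sampleWindows: WindowInfo[] = [
  {
    id: "12345",
    title: "Chrome - Documentation",
    app: "Google Chrome",
    bounds: { x: 0, y: 23, width: 1920, height: 1057 },
    display: 0,
  },
  {
    id: "67890",
    title: "VS Code",
    app: "Code",
    bounds: { x: 0, y: 23, width: 1280, height: 800 },
    display: 1,
  },
];

// IDs that a client could then pass to capture_window.
const chromeWindows = findWindowsByApp(sampleWindows, "chrome");
console.log(chromeWindows.map((w) => w.id));
```

Matching on the app name rather than the title is a design choice here: titles change as the user navigates, while the app name stays stable.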
Get all open windows with metadata.
Parameters: None
Returns:
{
  "windows": [
    {
      "id": "12345",
      "title": "Chrome - Documentation",
      "app": "Google Chrome",
      "bounds": {
        "x": 0,
        "y": 23,
        "width": 1920,
        "height": 1057
      },
      "display": 0
    }
  ]
}
Capture a screenshot of a specific window.
Parameters:
- window_id (required, string) - Window ID from list_windows
- mode (optional, string) - Capture mode: "full" or "content" (default: "full")
- output_path (optional, string) - Custom output path (must end with .png)
Returns:
{
  "success": true,
  "file_path": "/tmp/screenshot_12345.png",
  "window": {
    "id": "12345",
    "title": "Chrome - Documentation",
    "app": "Google Chrome"
  }
}
Capture screenshots of multiple windows at once. Useful when you need to see several windows simultaneously.
Parameters:
- window_ids (required, string[]) - Array of window IDs from list_windows
- mode (optional, string) - Capture mode: "full" or "content" (default: "full")
- output_dir (optional, string) - Custom output directory (default: temp directory)
Returns:
{
  "success": true,
  "captures": [
    {
      "window_id": "12345",
      "success": true,
      "file_path": "/tmp/screenshot_12345.png",
      "window": {
        "id": "12345",
        "title": "Chrome - Documentation",
        "app": "Google Chrome"
      }
    },
    {
      "window_id": "67890",
      "success": true,
      "file_path": "/tmp/screenshot_67890.png",
      "window": {
        "id": "67890",
        "title": "VS Code",
        "app": "Code"
      }
    }
  ]
}
Capture entire display(s).
Parameters:
display_id(optional, number) - Specific display number (0-indexed), or omit to capture all
Single Display Returns:
{
  "success": true,
  "file_path": "/tmp/display_0.png",
  "display": 0
}
All Displays Returns:
{
  "success": true,
  "captures": [
    {
      "display": 0,
      "file_path": "/tmp/display_0.png"
    },
    {
      "display": 1,
      "file_path": "/tmp/display_1.png"
    }
  ]
}
Error: Screen Recording permission required
Solution:
- Open System Settings > Privacy & Security > Screen Recording (on macOS 12: System Preferences > Security & Privacy > Privacy > Screen Recording)
- Enable permission for your terminal or application
- Restart the MCP server
Error: Window {id} not found. It may have been closed.
Cause: The window was closed between listing and capturing.
Solution: Call list_windows again to get current window IDs.
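The same advice applies to batch captures. As a rough sketch, a client could collect the IDs that failed in a capture_windows result and refresh them with another list_windows call. This assumes a failed entry is reported with "success": false, which the README does not show (its example contains only successful entries); `failedWindowIds` and the sample result are hypothetical.

```typescript
// Per-window entry in the capture_windows result, following the README example.
// file_path is optional here on the assumption that failed captures omit it.
interface WindowCaptureResult {
  window_id: string;
  success: boolean;
  file_path?: string;
}

interface BatchCaptureResult {
  success: boolean;
  captures: WindowCaptureResult[];
}

// Hypothetical helper: IDs whose capture did not succeed, in result order.
function failedWindowIds(result: BatchCaptureResult): string[] {
  return result.captures.filter((c) => !c.success).map((c) => c.window_id);
}

// Illustrative result in which one window was closed before capture.
const batch: BatchCaptureResult = {
  success: false,
  captures: [
    { window_id: "12345", success: true, file_path: "/tmp/screenshot_12345.png" },
    { window_id: "67890", success: false },
  ],
};

// These IDs are stale; call list_windows again before retrying them.
const staleIds = failedWindowIds(batch);
console.log(staleIds);
```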
Error: Output path must end with .png
Solution: Ensure custom output paths have a .png extension.
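A client can catch this error before calling the tool. The sketch below checks capture_window arguments against the parameter rules listed in this README. `CaptureWindowArgs` and `validateCaptureArgs` are hypothetical names, and the case-sensitive extension check is an assumption; the server's own validation may differ.

```typescript
type CaptureMode = "full" | "content";

// Arguments for capture_window, as listed in this README.
interface CaptureWindowArgs {
  window_id: string;
  mode?: CaptureMode;
  output_path?: string;
}

// Hypothetical pre-flight check: returns a list of problems (empty when valid).
function validateCaptureArgs(args: CaptureWindowArgs): string[] {
  const problems: string[] = [];
  if (args.window_id.trim() === "") {
    problems.push("window_id must be a non-empty string");
  }
  // Assumption: the extension check is case-sensitive.
  if (args.output_path !== undefined && !args.output_path.endsWith(".png")) {
    problems.push("Output path must end with .png");
  }
  return problems;
}

console.log(validateCaptureArgs({ window_id: "12345", output_path: "/tmp/shot.jpg" }));
console.log(validateCaptureArgs({ window_id: "12345", mode: "content" }));
```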
Error: Native module compilation errors
Solution:
- Ensure you're on macOS 12.0+
- Verify Node.js version is 16.0.0+
- Try reinstalling:
npm install -g mac-vision-mcp --force
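The Node.js version requirement can also be checked from a script. This is a minimal sketch, not part of mac-vision-mcp: `parseVersion` and `meetsMinimum` are hypothetical helpers that compare only the numeric major, minor, and patch parts and ignore pre-release tags.

```typescript
// Parse "16.13.0" or "v16.13.0" into [major, minor, patch].
// Missing or non-numeric parts count as 0.
function parseVersion(version: string): [number, number, number] {
  const parts = version
    .replace(/^v/, "")
    .split(".")
    .map((p) => parseInt(p, 10) || 0);
  return [parts[0] ?? 0, parts[1] ?? 0, parts[2] ?? 0];
}

// True when `current` is at least `minimum` (default: the 16.0.0 stated above).
function meetsMinimum(current: string, minimum: string = "16.0.0"): boolean {
  const [curMajor, curMinor, curPatch] = parseVersion(current);
  const [minMajor, minMinor, minPatch] = parseVersion(minimum);
  if (curMajor !== minMajor) return curMajor > minMajor;
  if (curMinor !== minMinor) return curMinor > minMinor;
  return curPatch >= minPatch;
}

console.log(meetsMinimum("v18.12.1"));
console.log(meetsMinimum("14.21.3"));
```

In a Node.js script, the running version is available as `process.versions.node` and can be passed as the first argument.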
Issue: list_windows returns an empty array or is missing windows
Cause: Screen Recording permission not granted or windows filtered out
Solution:
- Verify Screen Recording permission is enabled
- Note: System windows and gesture overlays are automatically filtered
- Windows smaller than 50x50 pixels are excluded
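To predict whether a given window will appear in list_windows, the size rule above can be sketched as a simple predicate. This is an illustration, not the server's actual code: `isLargeEnough` is a hypothetical name, and it assumes a window is excluded when either dimension is under 50 pixels, which the README does not spell out.

```typescript
interface WindowBounds {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Minimum size in pixels, per the README.
const MIN_WINDOW_SIZE = 50;

// Assumption: a window is excluded when width OR height is below the minimum.
function isLargeEnough(bounds: WindowBounds, minSize: number = MIN_WINDOW_SIZE): boolean {
  return bounds.width >= minSize && bounds.height >= minSize;
}

console.log(isLargeEnough({ x: 0, y: 23, width: 1920, height: 1057 }));
console.log(isLargeEnough({ x: 0, y: 0, width: 40, height: 300 }));
```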
- Language: TypeScript/Node.js with ESM modules
- MCP SDK: @modelcontextprotocol/sdk (v1.22.0)
- Screenshot Library: node-screenshots (v0.2.4) with native N-API bindings
- Window Metadata: get-windows (v9.2.3)
- Permissions: mac-screen-capture-permissions (v2.1.0)
- Validation: Zod (v3.25.0)
# Clone repository
git clone https://github.com/jasich/mac-vision-mcp.git
cd mac-vision-mcp
# Install dependencies
npm install
# Build
npm run build
# Run locally
node dist/index.js
To test your local development build with Claude Code or another MCP client:
- Build the project (if not already done):
  cd /path/to/mac-vision-mcp
  npm run build
- Configure your other project's .claude.json with the absolute path:
  {
    "mcpServers": {
      "mac-vision": {
        "command": "node",
        "args": ["/path/to/mac-vision-mcp/dist/index.js"]
      }
    }
  }
- Restart Claude Code to load the local build
- Make changes and rebuild as needed:
  npm run build # Rebuild after code changes
Note: Replace /path/to/mac-vision-mcp with the absolute path to your copy of the project.
# Run with MCP Inspector for debugging
npx @modelcontextprotocol/inspector node ./dist/index.js
Contributions are welcome! Please feel free to submit issues or pull requests.
MIT License - see LICENSE file for details.
- Built on the Model Context Protocol
- Uses node-screenshots for native screenshot capture
- Uses get-windows by Sindre Sorhus for window metadata
- Issues: Report bugs or request features via GitHub Issues
- Documentation: Model Context Protocol Docs
- MCP Inspector: Use for testing and debugging MCP tools