OpenSafari

Smart. Fast. Parallel.
iOS Safari automation MCP server via Xcode Simulator.

Headless mobile QA automation — drive real Safari, Flutter, and native iOS apps on Xcode Simulator without stealing your mouse or requiring Simulator.app focus.

How OpenSafari compares

	OpenSafari	Playwright WebKit	BrowserStack	Manual Testing
Engine	Real Safari (Xcode Sim)	Bundled WebKit (approximation)	Real devices (cloud)	Real devices
Protocol	WebKit Remote Debugging (direct)	Playwright API (wrapper)	Proprietary	N/A
iOS Fidelity	exact	close but diverges	exact	exact
Parallel sessions	N simulators	N browsers	limited by plan	1 device
Login persistence	built-in (real Safari cookies)	manual	manual	manual
LLM integration	MCP native	none	none	none
Cost	free (Xcode)	free	$29+/mo	device cost
iOS-specific QA	auto-detect (zoom, safe area, keyboard)	none	manual	manual

tl;dr — OpenSafari controls the real Safari inside Xcode Simulator via WebKit Remote Debugging Protocol — the same way OpenChrome controls real Chrome via CDP. No middleware, no bundled browsers. Just direct protocol access to the actual Safari.app.

Headless Capabilities

OpenSafari runs headless on CI: no display server, no mouse focus, and no Simulator.app window required. Support varies by scenario, as the matrix below shows. See docs/headless-architecture.md for the full technical design.

Scenario	Query (AX Tree)	Input (Tap/Type)	Headless	Backend
Safari (Web)	✅	✅	✅	WebKit Remote Debug
Flutter App	✅	✅	✅	FlutterVMInputBackend
Native iOS App (Xcode ≤ 16)	✅	✅	✅	SimulatorKitHID (Tier 1) / `simctl`
Native iOS App (Xcode 26+)	✅	⚠️ Partial	⚠️	Keys/buttons headless via SimHID; tap/swipe pending — see #491
WebView in Native	✅	⚠️ Partial	⚠️	WebKit + Native

✅ Supported and stable. ⚠️ Partially supported — see linked docs for current status and limitations.

Headless input vs other iOS automation tools

	OpenSafari	Appium	idb	XCUITest
Headless native input (no mouse focus, no `Simulator.app` activation)	⚠️ SimulatorKit HID (keys/buttons on Xcode 26+; tap/swipe pending #491)	❌ (XCUI focus)	✅ `FBSimulatorHID`	❌
Works on Xcode 26+ (after `simctl io input` removal)	⚠️ Safari & Flutter fully; native tap/swipe pending	⚠️ driver-dependent	✅	✅
Flutter native taps (no OS-level input)	✅ Dart VM `PointerDataPacket`	⚠️ 3rd-party plugin	❌	❌
MCP / LLM integration	✅ native	❌	❌	❌
Private API dependency	SimulatorKit (documented, sentinel-guarded)	UIAutomation / XCUI	SimulatorKit	none

See docs/private-apis.md for the SimulatorKit contract, the daily sentinel CI that detects BC breaks, and the rollback plan if Apple changes symbols.

For CI setup recipes (GitHub Actions, Buildkite, GitLab CI), see docs/ci-recipes.md.

What is OpenSafari?

Imagine testing your e-commerce site on iPhone 17e, iPhone 17, iPhone 17 Pro Max, and iPad — all at the same time, already logged in, with an AI agent that automatically finds iOS-specific bugs. That's OpenSafari.

You: Check our checkout flow for mobile issues across all iPhone sizes

AI:  [4 parallel simulators, all devices simultaneously]
     iPhone 17e:      ⚠ Credit card input triggers iOS auto-zoom (font-size: 14px)
     iPhone 17:       ✓ Layout OK
     iPhone 17 PM:    ⚠ "Place Order" button only 38×32px (below 44px touch target)
     iPad:            ⚠ Shipping form hidden behind keyboard when focused

     Time: 8s | All screenshots captured and analyzed.

	Manual QA	OpenSafari
4-device test	~30 min	~10s (parallel)
Login	Each device, each time	Never (persisted)
iOS bug detection	Human eye	Automatic (LLM vision)
Consistency	Varies by tester	Deterministic

Core Architecture

OpenSafari follows the same direct-protocol philosophy as OpenChrome:

OpenChrome:  CDPClient → Chrome DevTools Protocol → Real Chrome
OpenSafari:  SafariClient → WebKit Remote Debugging Protocol → Real Safari in Simulator

No middleware. No bundled browsers. Direct connection.

Claude Code / AI Agent (MCP Client)
    │
    │  JSON-RPC (stdio / HTTP)
    ▼
┌─────────────────────────────────────┐
│         OpenSafari MCP Server       │
│                                     │
│  ┌─────────────┐  ┌──────────────┐  │
│  │ Simulator   │  │ Safari       │  │
│  │ Manager     │  │ Client       │  │
│  │ (simctl)    │  │ (WebKit      │  │
│  │             │  │  Protocol)   │  │
│  └──────┬──────┘  └──────┬───────┘  │
│         │                │          │
│    boot/shutdown    navigate/click   │
│    rotate/appear    screenshot       │
│    multi-device     DOM/JS/cookies   │
│         │                │          │
│  ┌──────▼────────────────▼───────┐  │
│  │     Xcode Simulator(s)       │  │
│  │  ┌────────┐  ┌────────┐      │  │
│  │  │ iPhone │  │ iPhone │ ...  │  │
│  │  │ SE     │  │ 16 PM  │      │  │
│  │  │ Safari │  │ Safari │      │  │
│  │  └────────┘  └────────┘      │  │
│  └───────────────────────────────┘  │
│                                     │
│  ┌─────────────────────────────┐    │
│  │  Shared Infrastructure      │    │
│  │  (from OpenChrome)          │    │
│  │  • Security (sanitizer,     │    │
│  │    domain guard, audit)     │    │
│  │  • Watchdog (event loop,    │    │
│  │    disk, health endpoint)   │    │
│  │  • Orchestration (workflow  │    │
│  │    engine, parallel workers)│    │
│  │  • Session persistence      │    │
│  └─────────────────────────────┘    │
└─────────────────────────────────────┘
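The top edge of the diagram (MCP client to server over JSON-RPC) can be illustrated with a small sketch. It builds a `tools/call` request in the shape defined by the MCP specification and serializes it as one line of JSON for the stdio transport. The helper names `buildToolCall` and `toStdioLine` are hypothetical, and the `url` argument of `navigate` is an assumption; the actual tool schema is whatever the server advertises.

```typescript
// Shape of an MCP "tools/call" request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: wraps a tool name and its arguments in a JSON-RPC envelope.
function buildToolCall(
  id: number,
  toolName: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

// Over stdio, each message is written as a single newline-terminated line of JSON.
function toStdioLine(request: ToolCallRequest): string {
  return JSON.stringify(request) + "\n";
}

// Assumption: `navigate` accepts a `url` argument.
const navigateCall = buildToolCall(1, "navigate", { url: "https://example.com" });
const navigateLine = toStdioLine(navigateCall);
```

In practice an MCP client library produces these messages; the sketch only shows what travels over the wire.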

Key Features

1. Real Safari, Real Bugs

OpenSafari controls the actual Safari.app inside Xcode Simulator via WebKit Remote Debugging Protocol — not a bundled approximation. Every iOS-specific quirk is faithfully reproduced:

iOS auto-zoom on inputs with font-size < 16px
position: fixed behavior with real software keyboard
100vh viewport height inconsistencies with address bar
Safe area insets (notch, home indicator) on real device frames
color-scheme dark mode forced rendering
Touch target minimum size requirements (44×44px)

2. Parallel Multi-Device Testing

Test across multiple devices simultaneously with a single command:

opensafari serve --devices "iphone-17e,iphone-17,iphone-17-pro-max,ipad-pro"

# 4 simulators boot in parallel
# Each gets its own Safari instance + WebKit Protocol connection
# All share login state via cookie injection
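As a rough illustration of the parallel boot described above, the sketch below splits a `--devices` value into names and boots them concurrently with `Promise.all`. The `boot` callback is injected so the example runs without Xcode; `parseDeviceList`, `bootAll`, and `BootedDevice` are hypothetical names, not the server's internals.

```typescript
// Hypothetical result type; the real server may track more state per device.
interface BootedDevice {
  name: string;
  udid: string;
}

// Split the comma-separated --devices value into trimmed, non-empty names.
function parseDeviceList(value: string): string[] {
  return value
    .split(",")
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0);
}

// Boot every device concurrently. Promise.all rejects as soon as any boot fails,
// so a caller that wants partial results would need a different strategy.
async function bootAll(
  deviceNames: string[],
  boot: (deviceName: string) => Promise<BootedDevice>,
): Promise<BootedDevice[]> {
  return Promise.all(deviceNames.map((deviceName) => boot(deviceName)));
}
```

Because the boots overlap, total startup time is bounded by the slowest simulator rather than the sum of all of them.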

3. Persistent Login Sessions

Log in once, test forever. Cookies and localStorage are extracted directly from the real Safari session:

# First time: login captured from real Safari
opensafari auth save --site myapp.com
  → Exports cookies via WebKit Protocol Network.getAllCookies
  → Saves to ~/.opensafari/auth/myapp.json

# Every subsequent run: auto-restored
opensafari serve
  → Injects cookies via Network.setCookie into each simulator's Safari
  → All simulators start already logged in
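One detail the flow above implies is scoping: an export of all Safari cookies has to be narrowed to the site being saved. A minimal sketch of that filter follows, assuming `--site` keeps cookies whose domain is the site itself or one of its subdomains. That scoping rule, and the names `belongsToSite` and `cookiesForSite`, are assumptions for illustration only.

```typescript
// Minimal cookie shape for the sketch; real cookies carry more attributes.
interface SavedCookie {
  name: string;
  value: string;
  domain: string;
  path: string;
}

// True when the cookie's domain is the site or a subdomain of it.
// A leading dot (".myapp.com") is ignored, and matching is case-insensitive.
function belongsToSite(cookieDomain: string, site: string): boolean {
  const domain = cookieDomain.replace(/^\./, "").toLowerCase();
  const target = site.toLowerCase();
  return domain === target || domain.endsWith("." + target);
}

// Keep only the cookies that belong to the given site.
function cookiesForSite(cookies: SavedCookie[], site: string): SavedCookie[] {
  return cookies.filter((cookie) => belongsToSite(cookie.domain, site));
}
```

Matching on a dot boundary avoids accidentally keeping cookies for look-alike domains such as `notmyapp.com`.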

4. iOS-Specific Auto-Detection

Built-in QA checks that run on real Safari — no approximation:

Check	What It Detects
Auto-Zoom Guard	`<input>` elements with font-size < 16px
Safe Area Validator	Content hidden behind notch or home indicator
Keyboard Overlap	Fixed elements covered by real software keyboard
Touch Target Audit	Clickable elements smaller than 44×44px
Dark Mode Diff	Visual differences via `simctl ui appearance` toggle
Viewport Compare	Layout breaks across different screen sizes
Scroll Lock Check	Body scroll not restored after modal close
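Two of the checks in the table reduce to simple threshold comparisons once element metrics are available. The sketch below applies the 16px auto-zoom rule and the 44×44px touch-target rule to already-measured elements. The `MeasuredElement` shape and both function names are hypothetical; how OpenSafari actually collects the metrics is not shown here.

```typescript
// Hypothetical shape for element metrics gathered from the page.
interface MeasuredElement {
  selector: string;
  tag: string;
  fontSizePx: number;
  widthPx: number;
  heightPx: number;
  clickable: boolean;
}

const MIN_INPUT_FONT_PX = 16; // below this, iOS Safari zooms on focus
const MIN_TOUCH_TARGET_PX = 44; // minimum touch target edge

// Auto-Zoom Guard: <input> elements whose font size is under 16px.
function findAutoZoomInputs(elements: MeasuredElement[]): MeasuredElement[] {
  return elements.filter(
    (el) => el.tag === "input" && el.fontSizePx < MIN_INPUT_FONT_PX,
  );
}

// Touch Target Audit: clickable elements smaller than 44px in either dimension.
function findSmallTouchTargets(elements: MeasuredElement[]): MeasuredElement[] {
  return elements.filter(
    (el) =>
      el.clickable &&
      (el.widthPx < MIN_TOUCH_TARGET_PX || el.heightPx < MIN_TOUCH_TARGET_PX),
  );
}
```

Applied to the checkout example earlier in this README, the 14px credit card input would be flagged by the first function and the 38×32px "Place Order" button by the second.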

5. MCP Native

Works with any MCP client — Claude Code, Cursor, VS Code, or custom agents:

// .mcp.json
{
  "mcpServers": {
    "opensafari": {
      "command": "npx",
      "args": ["-y", "opensafari-mcp", "serve"]
    }
  }
}

6. Shared DNA with OpenChrome

OpenSafari shares battle-tested infrastructure with OpenChrome:

Module	Source	Status
MCP Server core	OpenChrome	Shared
Transport (stdio/HTTP)	OpenChrome	Shared
Security (sanitizer, guard, audit)	OpenChrome	Shared
Watchdog (event loop, disk, health)	OpenChrome	Shared
Orchestration (workflow engine)	OpenChrome	Adapted
Simulator Manager	NEW	OpenSafari
Safari Client (WebKit Protocol)	NEW	OpenSafari
iOS QA Engine	NEW	OpenSafari

Tools

Core Tools (Tier 1)

Tool	Description
`navigate`	Open URL in real Safari
`click`	Tap element by CSS selector or coordinates
`type`	Type text into form elements
`scroll`	Scroll page in any direction
`screenshot`	Capture real Safari screen via WebKit Protocol
`read_page`	Extract visible text content
`query_dom`	CSS selector queries with element details
`javascript`	Execute JavaScript in page context via `Runtime.evaluate`
`inspect`	Element CSS, accessibility, and layout inspection
`cookies`	Get/set/clear real Safari cookies via `Network` domain

Device Management (Tier 1)

Tool	Description
`device_list`	List available simulator device types
`device_boot`	Boot a specific device (iPhone SE, 16, iPad, etc.)
`device_shutdown`	Shutdown simulator
`device_rotate`	Toggle portrait/landscape
`appearance_toggle`	Switch light/dark mode via `simctl ui`

App Lifecycle (Tier 2)

Tool	Description
`app_launch`	Launch app by bundle ID with optional args and env vars
`app_terminate`	Terminate a running app by bundle ID
`app_activate`	Bring app to foreground (launches if not running)
`app_list_running`	List running foreground apps with PIDs
`app_reset`	Reset app state: terminate, clear permissions, uninstall

Auth Tools (Tier 3)

Tool	Description
`auth_save`	Capture cookies + localStorage from current session
`auth_restore`	Restore saved auth state into a simulator
`auth_list`	List saved auth profiles

Parallel & Orchestration (Tier 2)

Tool	Description
`batch_screenshot`	Capture same URL across all active devices
`batch_execute`	Run JS across all simulators in parallel
`batch_navigate`	Open same URL on all devices simultaneously
`cross_viewport_compare`	Side-by-side visual comparison across devices

iOS QA Engine (Tier 3)

Tool	Description
`qa_auto_zoom`	Detect inputs triggering iOS auto-zoom
`qa_touch_targets`	Find elements below 44×44px minimum
`qa_safe_area`	Check content behind notch/home indicator
`qa_keyboard_overlap`	Detect fixed elements hidden by real keyboard
`qa_dark_mode`	Compare light vs dark mode rendering
`qa_full_audit`	Run all QA checks and generate report

Flutter QA Workflow

OpenSafari drives Flutter apps running in the iOS Simulator using the same native-app tools used for UIKit/SwiftUI. The only wrinkle is that Flutter's accessibility tree is lazy — it only populates when an assistive technology (VoiceOver, XCTest) connects. OpenSafari auto-activates it so widget labels are queryable.

Auto-activation (no app changes required)

app_tree, app_query, app_inspect, app_tap_element, app_wait_for, and app_assert_element all call ensureSemanticsActive() before reading the accessibility tree. The activator:

1. Quick-checks the tree: if it already has ≥5 nodes, semantics is already active.
2. Toggles com.apple.Accessibility.AccessibilityEnabled via xcrun simctl spawn defaults write, which triggers Flutter's SemanticsBinding without enabling VoiceOver spoken feedback.
3. Polls the tree (up to 3s) until it populates, or falls back gracefully if activation never succeeds.
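The three activator steps above can be sketched as a check, toggle, and poll loop. This is not the project's `ensureSemanticsActive()` implementation: the dependencies are injected so the example runs without a simulator, and the 250ms poll interval is an assumption (only the 3s ceiling and the ≥5 node threshold come from the description above).

```typescript
// Injected dependencies (hypothetical) so the sketch does not need a simulator.
interface SemanticsDeps {
  countNodes: () => Promise<number>; // size of the accessibility tree
  toggleAccessibility: () => Promise<void>; // e.g. the simctl defaults write step
  sleep: (ms: number) => Promise<void>;
}

const MIN_ACTIVE_NODES = 5;

// Returns true once the tree is populated, false if activation never succeeds.
async function ensureSemanticsActiveSketch(
  deps: SemanticsDeps,
  timeoutMs: number = 3000,
  intervalMs: number = 250, // assumption: the real poll interval is not documented
): Promise<boolean> {
  // Step 1: quick check, the tree may already be populated.
  if ((await deps.countNodes()) >= MIN_ACTIVE_NODES) return true;

  // Step 2: flip the accessibility flag to wake up Flutter's semantics.
  await deps.toggleAccessibility();

  // Step 3: poll until the tree populates or the time budget runs out.
  for (let waited = 0; waited < timeoutMs; waited += intervalMs) {
    await deps.sleep(intervalMs);
    if ((await deps.countNodes()) >= MIN_ACTIVE_NODES) return true;
  }
  return false; // caller falls back gracefully
}
```

Returning a boolean instead of throwing keeps the fallback decision with the caller, which matches the "falls back gracefully" behaviour described in step 3.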

Recommended workflow

// 1. Boot simulator and launch the Flutter app
await device_boot('iPhone 16');
await app_launch({ bundleId: 'com.example.flutterApp' });

// 2. Terminate Safari if it's running — its background elements can
//    dominate the macOS AX tree and hide Flutter widgets.
await app_terminate({ bundleId: 'com.apple.mobilesafari' });
await app_switch_app({ bundleId: 'com.example.flutterApp' });

// 3. Read the tree — Flutter Semantics nodes appear automatically.
const tree = await app_tree({ max_depth: 5 });

// 4. Query widgets by label, identifier, or role.
const button = await app_query({ label: 'Login' });
const email  = await app_query({ identifier: 'email-field' });

Making widgets queryable

Flutter API	Queryable via	Notes
`Semantics(label: 'Login')`	`app_query({ label: 'Login' })`	Preferred for human-readable labels
`Semantics(identifier: 'login-btn')`	`app_query({ identifier: 'login-btn' })`	Flutter 3.19+ — stable selector for tests
Plain `Text('Counter: 42')`	`app_query({ text: 'Counter' })`	Works when the text widget auto-synthesizes semantics
`Key('login-btn')`	Not queryable	Keys are a Flutter-internal reference — they do not surface to the native AX tree

Release builds

The auto-activation approach described above (simctl defaults write) works for both debug and release builds. If a release app still reports an empty tree, add the explicit opt-in to main.dart:

void main() {
  WidgetsFlutterBinding.ensureInitialized();
  SemanticsBinding.instance.ensureSemantics(); // enable for OpenSafari QA
  runApp(const MyApp());
}

For debug/profile builds, flutter_connect + flutter_widget_tree additionally expose the Dart VM Service, which returns the full Flutter widget hierarchy (including render-tree nodes that never reach the native AX bridge).

See docs/troubleshooting.md for common failure modes (empty trees, missing labels, Safari shadowing).

Quick Start

# Install
npm install -g opensafari-mcp

# Run (stdio mode — for MCP clients like Claude Code)
opensafari serve

# HTTP mode
opensafari serve --http 3100

# With all tool tiers exposed
opensafari serve --all-tools

# With specific devices auto-booted
opensafari serve --devices "iphone-17e,iphone-17-pro-max"

# With auth state
opensafari serve --auth ~/.opensafari/auth/mysite.json

MCP Client Configuration

// Claude Code: .mcp.json
{
  "mcpServers": {
    "opensafari": {
      "command": "npx",
      "args": ["-y", "opensafari-mcp", "serve"]
    }
  }
}

// Claude Desktop: claude_desktop_config.json
{
  "mcpServers": {
    "opensafari": {
      "command": "npx",
      "args": ["-y", "opensafari-mcp", "serve", "--all-tools"]
    }
  }
}

Tool Tiers

Tools are organized into 3 tiers for progressive disclosure:

Tier	Tools	Access
Tier 1	navigate, screenshot, click, type, scroll, read_page, query_dom, javascript, cookies, device_boot, device_shutdown, device_list	Default
Tier 2	inspect, wait_for, press, swipe, long_press, batch_navigate, batch_screenshot, cross_viewport_compare	`setTier(2)`
Tier 3	auth_save, auth_restore, auth_list, qa_audit, qa_* detectors, workflow_init, appearance_toggle	`--all-tools`
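Progressive disclosure of this kind amounts to filtering the tool registry by a maximum tier. The sketch below shows the idea with a handful of tools taken from the table; the `ToolEntry` registry and `visibleTools` function are hypothetical and do not reflect how the server stores its tools.

```typescript
type Tier = 1 | 2 | 3;

interface ToolEntry {
  name: string;
  tier: Tier;
}

// A small sample of the table above, not the full tool list.
const SAMPLE_TOOLS: ToolEntry[] = [
  { name: "navigate", tier: 1 },
  { name: "screenshot", tier: 1 },
  { name: "swipe", tier: 2 },
  { name: "batch_navigate", tier: 2 },
  { name: "auth_save", tier: 3 },
  { name: "workflow_init", tier: 3 },
];

// Return the names of all tools at or below the requested tier.
function visibleTools(tools: ToolEntry[], maxTier: Tier): string[] {
  return tools.filter((tool) => tool.tier <= maxTier).map((tool) => tool.name);
}
```

With this model, the default corresponds to `maxTier = 1` and `--all-tools` to `maxTier = 3`.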

Programmatic API

import { createServer } from 'opensafari-mcp';

// Create and start the MCP server
const server = createServer({
  tier: 3,          // expose all tool tiers
  auditLog: true,   // enable tool call logging
});

// Start with stdio transport (default)
await server.start();

// Alternatively, start with HTTP transport instead of stdio:
// await server.start({ transport: 'http', port: 3100 });

WebKitClient

Direct WebKit protocol access for custom automation:

import { WebKitClient } from 'opensafari-mcp';

const client = new WebKitClient({ host: 'localhost', port: 9322 });
await client.connect({ retries: 5, retryDelay: 2000 });

// Navigate and evaluate
await client.navigate({ url: 'https://example.com', waitUntil: 'load' });
const title = await client.evaluate<string>('document.title');

// Screenshot (returns PNG buffer)
const png = await client.screenshot();

// Cookies
const cookies = await client.getCookies();
await client.setCookies([{ name: 'key', value: 'val', domain: '.example.com',
  path: '/', expires: -1, httpOnly: false, secure: false }]);

// DOM interaction
await client.click('#submit-btn');
await client.type('#email-input', 'user@example.com');

await client.disconnect();

SimulatorManager

Programmatic simulator lifecycle control:

import { SimulatorManager } from 'opensafari-mcp';

const manager = new SimulatorManager();

// Boot a device
const device = await manager.boot('iPhone 17 Pro');
console.log(device.udid, device.state); // "XXXX-..." "Booted"

// Open Safari
await manager.openUrl(device.udid, 'https://example.com');

// List booted devices
const booted = await manager.listBooted();

// Shutdown
await manager.shutdown(device.udid);

Requirements

macOS (Xcode Simulator is macOS only)
Xcode with iOS Simulator runtime installed
Node.js >= 18
ios-webkit-debug-proxy — brew install ios-webkit-debug-proxy

WebInspector Proxy Configuration

OpenSafari uses ios_webkit_debug_proxy to bridge WebKit Remote Debugging from Xcode Simulator. The proxy is auto-started by the device_boot tool — no manual setup is needed in most cases.

Default Ports

Port	Purpose
9321	Device list (HTML) — serves the proxy's device listing page
9322	Device connection (JSON) — WebKit debugging targets for connected simulators

Port 9322 is deliberately offset from Chrome DevTools (9222) so OpenSafari and OpenChrome can run simultaneously.

Custom Port

Set the OPENSAFARI_PROXY_PORT environment variable to use a different device port:

# Use port 9500 instead of the default 9322
OPENSAFARI_PROXY_PORT=9500 opensafari serve

Port resolution order:

1. Explicit port option (programmatic use)
2. OPENSAFARI_PROXY_PORT environment variable
3. Default: 9322
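The resolution order above can be written as a small pure function. This is a sketch, not the project's code: `resolveProxyPort` is a hypothetical name, and treating an unparseable or out-of-range environment value as absent is an assumption.

```typescript
const DEFAULT_PROXY_PORT = 9322;

// The environment is passed in explicitly so the function is easy to test.
function resolveProxyPort(
  explicitPort: number | undefined,
  env: Record<string, string | undefined>,
): number {
  // 1. An explicit port option wins.
  if (explicitPort !== undefined) return explicitPort;

  // 2. Otherwise use the environment variable when it parses as a valid port.
  //    Assumption: invalid values fall through to the default.
  const raw = env["OPENSAFARI_PROXY_PORT"];
  if (raw !== undefined) {
    const parsed = Number(raw);
    if (Number.isInteger(parsed) && parsed > 0 && parsed < 65536) return parsed;
  }

  // 3. Fall back to the documented default.
  return DEFAULT_PROXY_PORT;
}
```

In a Node.js program the second argument would typically be `process.env`.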

Multi-Session Usage

Multiple Claude Code sessions can share the same proxy. When a session detects a healthy proxy already running on its target port, it reuses it instead of starting a new one. When the owning session exits, only its own proxy is terminated — other sessions' proxies remain unaffected.
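The reuse-or-start behaviour can be modelled by recording, per session, whether that session started the proxy it is using. The sketch below is one possible reading of the paragraph above; `ProxyHandle`, `acquireProxy`, and `releaseProxy` are hypothetical names, and the health probe and process control are injected rather than implemented.

```typescript
interface ProxyHandle {
  port: number;
  ownedByThisSession: boolean; // true only if this session started the proxy
}

// Reuse a healthy proxy on the target port, otherwise start a new one.
function acquireProxy(
  port: number,
  isHealthy: (port: number) => boolean,
  startProxy: (port: number) => void,
): ProxyHandle {
  if (isHealthy(port)) {
    return { port, ownedByThisSession: false };
  }
  startProxy(port);
  return { port, ownedByThisSession: true };
}

// On exit, stop the proxy only if this session was the one that started it.
function releaseProxy(
  handle: ProxyHandle,
  stopProxy: (port: number) => void,
): void {
  if (handle.ownedByThisSession) stopProxy(handle.port);
}
```

Tracking ownership on the handle means a session that merely reused a proxy can never terminate one it did not start.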

Input Backend Selection

OpenSafari dispatches native input (app_tap, app_swipe_native, app_scroll_native, app_double_tap, app_type_text, app_key_input) through a 5-tier fallback chain and surfaces the selected path in each tool result via a backend field.

Tier	Backend	Identifier	Headless?	When used
0	`FlutterVMInputBackend`	`flutter-vm`	Yes	Flutter app reachable over Dart VM Service + DDS
1	`SimulatorKitHIDInputBackend`	`simhid`	Yes	Any app — `sim-hid-bridge` resolves and `SimulatorKit.framework` loads (covers Xcode 26+)
2	`SimctlInputBackend`	`simctl`	Yes	Xcode ≤16 legacy path (where `simctl io input` is still available)
3	`WebKitInputBackend`	`webkit`	Yes	Xcode 26+ with an active Safari / WebView connection
4	`AppleScriptInputBackend`	`applescript`	No	Opt-in only — moves the mouse cursor and activates Simulator.app

See docs/headless-architecture.md for the decision flowchart and the full scenario matrix. Tool responses also include _meta: { backendKind, headless, deviceId } so CI can assert _meta.headless === true.
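The fallback chain in the table can be read as a first-match selection over capability checks. The sketch below mirrors the tier order and identifiers from the table, but the `InputCapabilities` flags and the `pickBackend` function are hypothetical; the project's `getInputBackend()` may probe these conditions differently.

```typescript
type BackendId = "flutter-vm" | "simhid" | "simctl" | "webkit" | "applescript";

// Hypothetical capability flags, one per "When used" condition in the table.
interface InputCapabilities {
  flutterVmReachable: boolean; // Tier 0
  simHidAvailable: boolean; // Tier 1
  simctlInputAvailable: boolean; // Tier 2
  webkitConnected: boolean; // Tier 3
  allowFocusInput: boolean; // Tier 4 opt-in
}

interface BackendChoice {
  backend: BackendId;
  headless: boolean;
}

// Walk the tiers in order and return the first backend whose condition holds.
function pickBackend(caps: InputCapabilities): BackendChoice {
  if (caps.flutterVmReachable) return { backend: "flutter-vm", headless: true };
  if (caps.simHidAvailable) return { backend: "simhid", headless: true };
  if (caps.simctlInputAvailable) return { backend: "simctl", headless: true };
  if (caps.webkitConnected) return { backend: "webkit", headless: true };
  if (caps.allowFocusInput) return { backend: "applescript", headless: false };

  // No headless path and no opt-in: fail loudly rather than steal focus.
  const error = new Error("No headless input backend is available");
  error.name = "HeadlessInputUnavailableError";
  throw error;
}
```

Only the last tier reports `headless: false`, which is why it sits behind an explicit opt-in.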

Example tool result:

{
  "status": "tapped",
  "x": 100,
  "y": 200,
  "deviceId": "…",
  "backend": "simhid",
  "_meta": { "backendKind": "simhid", "headless": true, "deviceId": "…" }
}
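A CI job can guard against accidental focus theft by checking `_meta.headless` on each input result. The following sketch parses a result shaped like the example above and fails when the backend was not headless. `InputToolResult` and `assertHeadless` are hypothetical names, and the interface lists only the fields the check needs.

```typescript
// Subset of the tool result fields shown in the example above.
interface InputToolResult {
  status: string;
  backend: string;
  _meta: { backendKind: string; headless: boolean; deviceId: string };
}

// Throw if the input was dispatched through a non-headless backend.
function assertHeadless(result: InputToolResult): void {
  if (result._meta.headless !== true) {
    throw new Error(
      `Input used non-headless backend "${result._meta.backendKind}"`,
    );
  }
}

// The example result from above, as it would arrive over the wire.
const sampleResult: InputToolResult = JSON.parse(
  '{"status":"tapped","x":100,"y":200,"deviceId":"…","backend":"simhid",' +
    '"_meta":{"backendKind":"simhid","headless":true,"deviceId":"…"}}',
);

assertHeadless(sampleResult); // does not throw: simhid is a headless backend
```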

Focus-theft protection (`OPENSAFARI_ALLOW_FOCUS_INPUT`)

Tier 4 (AppleScriptInputBackend) is default-deny. On Xcode 26+ with no Safari connection, getInputBackend() throws HeadlessInputUnavailableError instead of silently moving the physical mouse cursor. To re-enable the legacy AppleScript/CGEvent fallback (for example, to automate a non-Safari native app), opt in with an environment variable:

# Only set this if you understand the consequences — it WILL move your
# mouse cursor and bring Simulator.app to the foreground.
OPENSAFARI_ALLOW_FOCUS_INPUT=1 opensafari serve

Accepted values are 1 and true; anything else is ignored. When the opt-in is honored, a one-time warning is logged to stderr at the first tool call.
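The accepted-values rule is strict enough to express in a few lines. This sketch assumes an exact, case-sensitive comparison, which the text above does not spell out; `isFocusInputAllowed` is a hypothetical name.

```typescript
// Returns true only for the two documented opt-in values.
// Assumption: comparison is exact, so "TRUE" or " 1" would not count.
function isFocusInputAllowed(
  env: Record<string, string | undefined>,
): boolean {
  const value = env["OPENSAFARI_ALLOW_FOCUS_INPUT"];
  return value === "1" || value === "true";
}
```

An allow-list like this keeps values such as `0`, `yes`, or an empty string from enabling the focus-stealing fallback by accident.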

Relationship to OpenChrome

OpenSafari is the Safari/iOS counterpart to OpenChrome. Same philosophy, same architecture — different browser.

	OpenChrome	OpenSafari
Browser	Real Chrome	Real Safari (in Simulator)
Protocol	Chrome DevTools Protocol (CDP)	WebKit Remote Debugging Protocol
Client	CDPClient (puppeteer-core)	SafariClient (WebKit Protocol)
Execution	`chrome --remote-debugging-port`	`xcrun simctl` + WebKit debug socket
Use Case	Desktop web automation	Mobile web QA & debugging
Parallel	N tabs in 1 Chrome	N simulators, each with Safari
Login	Real Chrome sessions	Real Safari sessions

Together, they provide complete browser coverage — Chrome for desktop, Safari for iOS — both controlled by AI agents through MCP with direct protocol connections. No middleware, no bundled browsers.

Documentation

Document	Description
Getting Started	Setup guide and first steps
API Reference	Programmatic API documentation
Troubleshooting	Common issues and solutions
CI Integration	Using OpenSafari in CI pipelines
CI Recipes	Copy-paste GitHub Actions, Buildkite, and GitLab CI recipes
RFC: Native App Backend	Architecture RFC for native-app automation in Xcode Simulator
Native App Tool Surface	Proposed MCP tool surface for native-app automation
WebKit Protocol Research	WebKit Remote Debugging Protocol research notes

License

MIT

Built for developers who ship mobile-first products.
By the creators of OpenChrome
