A Swift CLI and MCP server for interacting with the iOS Simulator that's optimized for agentic usage. Tap, swipe, type, screenshot, and read the accessibility tree, with a consistent coordinate space and simplified commands.
iosef start --local --device "X" → creates/boots simulator, saves state to .iosef/state.json
iosef tap / type / view → reads state.json, performs action, exits
iosef mcp → long-running MCP server (stdio), same capabilities
iosef stop → shuts down simulator, deletes device, cleans up state
Each CLI invocation is a short-lived process. The simulator runs independently and persists between commands.
This Swift tool can be installed directly from PyPI using pip or uv.
You can run it without installing it first using uvx:
uvx iosef --helpOr install it, then run iosef --help:
uv tool install iosef
# or
pip install iosefYou can also build from source:
swift build -c release
# or
./scripts/build.sh # builds + installs to ~/.local/bin/iosefCompiled binaries are available on the releases page.
Building from source requires Swift 6.1+, macOS 13+, and Xcode with an iOS simulator runtime installed.
iosef start --local --device "my-sim" # Create + boot simulator, local session
iosef start --device "iPhone 16 Pro" --device-type "iPhone 16 Pro" --runtime "iOS 18.4"
iosef connect "iPhone 16" --local # Associate with an existing simulator
iosef status # Show simulator name, UDID, state, session info
iosef status --json # Machine-readable output
iosef stop # Shut down, delete device, remove sessioniosef describe # Dump the full accessibility tree
iosef describe --depth 2 # Limit tree depth
iosef describe --json | jq '.. | objects | select(.role == "button")'
iosef describe --x 200 --y 400 # What's at this coordinate?
iosef describe --x 200 --y 400 --json | jq '.content[0].text'
iosef view # Screenshot to temp file (prints path)
iosef view --output /tmp/screen.png # Screenshot to specific file
iosef view --output /tmp/screen.jpg --type jpegiosef tap --name "Sign In" # Find element + tap (preferred)
iosef tap --role AXButton --name "Submit"
iosef tap --name "Menu" --duration 0.5 # Long-press by selector
iosef tap --x 200 --y 400 # Tap at coordinates
iosef tap --x 100 --y 300 --duration 0.5 # Long-press at coordinates
iosef type --text "Hello World" # Type into the focused field
iosef type --name "Search" --text "query" # Find + tap + type in one step
iosef type --role AXTextField --text "hello"
iosef swipe --x-start 200 --y-start 600 --x-end 200 --y-end 200
iosef swipe --x-start 200 --y-start 600 --x-end 200 --y-end 200 --duration 0.3Search and query by --role, --name, and --identifier. Multiple criteria combine with AND logic.
iosef find --role AXButton # All buttons
iosef find --name "Sign In" # Elements matching label (substring, case-insensitive)
iosef find --role AXStaticText --name "count" # Combine selectors
iosef exists --name "Sign In" # Prints "true"/"false", exit 0/1
iosef count --role AXButton # How many buttons?
iosef text --name "Tap count" # Extract text content from first match
iosef wait --name "Welcome" # Poll until element appears (default 10s)
iosef wait --role AXButton --name "Continue" --timeout 5iosef install_app --app-path /path/to/MyApp.app # Install .app or .ipa bundle
iosef launch_app --bundle-id com.apple.mobilesafari
iosef launch_app --bundle-id com.example.myapp --terminate-runningiosef log_show --process SpringBoard --last 5s
iosef log_show --predicate 'subsystem == "com.apple.UIKit"' --last 3s
iosef log_show --level debug --last 1m
iosef log_stream --process SpringBoard --duration 3
iosef log_stream --predicate 'process == "MyApp"' --duration 10iosef mcp # Start MCP server on stdioEvery CLI subcommand is also available as an MCP tool. Configure in your MCP client:
{
"mcpServers": {
"iosef": {
"command": "iosef",
"args": ["mcp"]
}
}
}All commands use iOS points. Screenshots are coordinate-aligned: 1 pixel = 1 iOS point.
The accessibility tree reports positions as (center±half-size) — the center value is the tap target. For example, an element at (195±39, 420±22) has its center at (195, 420) and spans from x=156 to x=234 and y=398 to y=442. Use the center values directly with tap --x 195 --y 420.
This means visual agents can tap exactly where they see elements in screenshots, with no coordinate translation layer.
By default, iosef stores state globally in ~/.iosef/. You can instead create a session scoped to the current directory with --local:
iosef start --local --device "my-sim" # State stored in ./.iosef/state.json
iosef tap --name "Sign In" # Auto-detects local session
iosef stop # Cleans up local sessionAuto-detection: When neither --local nor --global is specified, iosef checks for ./.iosef/state.json in the current directory. If found, it uses the local session; otherwise it falls back to the global ~/.iosef/ session.
# Force global even when a local session exists
iosef --global tap --x 200 --y 400
# Force local
iosef --local statusDevice resolution (when --device is omitted):
state.jsondevice field (local session, then global)- VCS root directory name (git or jj)
- Any booted simulator
Add .iosef/ to your .gitignore to keep session state out of version control. The skills/ios-simulator-interaction/ directory contains a workflow guide for Claude Code users.
| Exit code | Meaning |
|---|---|
0 |
Success |
1 |
Check failed (exists returned false) or tool error |
2 |
Bad arguments or usage error |
This makes it easy to distinguish "the check didn't pass" from "the command couldn't run" in scripts.
#!/bin/bash
set -euo pipefail
FAIL=0
check() {
if ! "$@"; then
echo "FAIL: $*"
FAIL=1
fi
}
# Boot and launch
iosef start --local --device "test-sim"
iosef install_app --app-path ./build/MyApp.app
iosef launch_app --bundle-id com.example.myapp
# Wait for the app to load
iosef wait --name "Welcome" --timeout 15
# Log in
iosef type --name "Email" --text "user@example.com"
iosef type --name "Password" --text "hunter2"
iosef tap --name "Sign In"
# Verify we landed on the dashboard
iosef wait --name "Dashboard" --timeout 10
check iosef exists --name "Dashboard"
check iosef exists --role AXButton --name "Settings"
# Take a screenshot for the record
iosef view --output /tmp/dashboard.png
iosef stop
if [ "$FAIL" -ne 0 ]; then
echo "Some checks failed"
exit 1
fi
echo "All checks passed"| Environment Variable | Default | Description |
|---|---|---|
IOSEF_DEFAULT_OUTPUT_DIR |
~/Downloads |
Default directory for screenshots |
IOSEF_TIMEOUT |
— | Override default timeout (seconds) |
IOSEF_FILTERED_TOOLS |
(none) | Comma-separated MCP tool names to hide |
State is stored in ~/.iosef/state.json (global) or ./.iosef/state.json (local). See Directory-scoped sessions.
| Command | Arguments | Description |
|---|---|---|
start |
[--device N] [--device-type T] [--runtime R] [--local|--global] |
Create/boot simulator, set up session |
stop |
Shut down, delete simulator, remove session | |
connect |
<name-or-udid> [--local|--global] |
Associate with an existing simulator |
status |
Show simulator and session status | |
install_app |
--app-path <path> |
Install .app or .ipa bundle |
launch_app |
--bundle-id <id> [--terminate-running] |
Launch app by bundle identifier |
describe |
[--depth N] [--x X --y Y] |
Describe accessibility tree or element at point |
view |
[--output <path>] [--type png|jpeg|tiff|bmp|gif] |
Capture screenshot |
tap |
[--role R] [--name N] [--identifier I] [--x X --y Y] [--duration S] |
Tap by selector or at coordinates |
type |
--text <text> [--role R] [--name N] [--identifier I] |
Type text; with selectors: find + tap + type |
swipe |
--x-start X --y-start Y --x-end X --y-end Y [--delta N] [--duration S] |
Swipe between two points |
find |
[--role R] [--name N] [--identifier I] |
Find elements by selector |
exists |
[--role R] [--name N] [--identifier I] |
Check if element exists (exit 1 if not) |
count |
[--role R] [--name N] [--identifier I] |
Count matching elements |
text |
[--role R] [--name N] [--identifier I] |
Extract text from first match |
wait |
[--role R] [--name N] [--identifier I] [--timeout S] |
Wait for element to appear |
log_show |
[--last T] [--process P|--predicate P] [--style S] [--level L] |
Show recent log entries |
log_stream |
[--duration S] [--process P|--predicate P] [--style S] [--level L] |
Stream live log entries |
mcp |
Start MCP server (stdio transport) |
| Flag | Description |
|---|---|
--device <name-or-udid> |
Target simulator (auto-detected if omitted) |
--local |
Use directory-scoped session (./.iosef/) |
--global |
Use global session (~/.iosef/) |
--verbose |
Enable diagnostic logging to stderr |
--json |
Output results as JSON |
--version |
Print version and exit |
-h, --help |
Show help |
This project draws inspiration from:
- facebook/idb — Meta's iOS simulator CLI, where most of the simulator implementation ideas came from. In comparison, iosef trades off performance for a simpler interface (scaling screenshots to match iOS point space; targeting taps via accessibility labels) and deployment model (no companion process).
- simonw/rodney — Simon Willison's Chrome interaction CLI, where most of the interface ideas came from (subcommand names, session model, documentation structure). Iosef is arguably just "rodney, but for iOS".
- joshuayoes/ios-simulator-mcp — the original iOS Simulator MCP server. Got me interested in this space — iosef actually started out as an attempt to port this to swift for performance. Modern agents have a better time using CLI tools, though, leading to a shift in focus.
- ldomaradzki/xctree — another useful reference for working with the simulator's accessibility tree.
MIT — see LICENSE for details.