
feat: Chrome Extension + micro-daemon for browser automation #85

Merged: jackwener merged 10 commits into main from extension on Mar 19, 2026

Conversation

@jackwener
Owner

Summary

Replace @playwright/mcp with a lightweight Chrome Extension + micro-daemon architecture.

Architecture

CLI → daemon (HTTP, port 19825) → WebSocket → Chrome Extension → chrome.debugger CDP → page

What's included

  • Micro-daemon (src/daemon.ts): HTTP + WebSocket bridge, auto-start on cold boot, idle auto-exit
  • Chrome MV3 Extension (extension/): service worker connects to daemon via WebSocket, executes JS via chrome.debugger CDP Runtime.evaluate
  • Page API (src/browser/page.ts): navigate, evaluate, cookies, click, type, wait, snapshot

Tested end-to-end

  • zhihu hot — 14.3s
  • twitter timeline — 9.3s

Key design decisions

  • chrome.debugger over chrome.scripting: no host permission grants needed
  • ESM module service worker (matches bb-browser pattern)
  • IIFE wrapping for function expressions sent to Runtime.evaluate
  • Tab ID tracking across navigate/exec calls
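
A minimal sketch of the IIFE wrapping decision, assuming the helper only has to tell function expressions apart from plain expressions before the source goes to Runtime.evaluate. The name wrapForEval appears in the review fixes of this PR; the regex and body here are illustrative guesses, not the code in src/browser/page.ts.

```typescript
// Hypothetical detection pattern: `function ...`, `(a, b) => ...`, `x => ...`,
// each optionally prefixed with `async`. Parameter lists that themselves
// contain parentheses (e.g. default values that call a function) are not
// detected; that limitation is accepted for this sketch.
const FUNCTION_EXPR =
  /^(async\s+)?(function\b|\([^()]*\)\s*=>|[A-Za-z_$][\w$]*\s*=>)/;

function wrapForEval(source: string): string {
  const code = source.trim();
  // A bare function expression evaluates to the function itself,
  // so it has to be invoked to produce a value.
  if (FUNCTION_EXPR.test(code)) {
    return `(${code})()`;
  }
  // Plain expressions and already-invoked IIFEs pass through unchanged,
  // which is what prevents double wrapping.
  return code;
}
```

Because an existing IIFE starts with `((` or `(function`, it does not match the pattern and is sent as-is.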

@jackwener force-pushed the extension branch 2 times, most recently from 9651a3b to 7cf473d (March 19, 2026 08:16)

Architecture:
- Micro-daemon (HTTP + WebSocket bridge, ~190 lines, auto-start/idle-exit)
- Chrome MV3 Extension using chrome.debugger CDP (10KB build)
- 5 protocol actions: exec, navigate, tabs, cookies, screenshot
- All DOM ops via JS evaluate — no extension update needed for new features
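
One way to picture the five-action protocol is a small validated command envelope. This is a sketch under assumptions: the action names come from the list above, but the `id` and `action` field names, the `Command` shape, and `parseCommand` are hypothetical and may differ from what src/daemon.ts actually accepts.

```typescript
// The five action names are taken from the commit message; everything else
// (field names, error messages) is assumed for illustration.
type Action = "exec" | "navigate" | "tabs" | "cookies" | "screenshot";

const ACTIONS: readonly string[] = ["exec", "navigate", "tabs", "cookies", "screenshot"];

interface Command {
  id: string;
  action: Action;
  [key: string]: unknown; // action-specific parameters
}

function parseCommand(raw: string): Command {
  const body: unknown = JSON.parse(raw);
  if (typeof body !== "object" || body === null) {
    throw new Error("command must be a JSON object");
  }
  const fields = body as Record<string, unknown>;
  const { id, action } = fields;
  // Reject requests without a usable id so responses can always be correlated.
  if (typeof id !== "string" || id.length === 0) {
    throw new Error("command is missing a string id");
  }
  if (typeof action !== "string" || ACTIONS.indexOf(action) === -1) {
    throw new Error(`unknown action: ${String(action)}`);
  }
  return { ...fields, id, action: action as Action };
}
```

Keeping the action set this small is what lets new DOM features ship as JS sent through `exec` rather than as extension updates.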

Key features:
- CDP Runtime.evaluate for JS execution in page context
- Tab management, cookie access via Chrome APIs
- Auto-start daemon on cold boot, idle auto-exit (5min)
- Minimal permissions: debugger, tabs, cookies, activeTab, alarms

Tested: zhihu hot (14.3s), twitter timeline (9.3s)

P0 Critical:
- #1 Fix double IIFE wrapping: unified wrapForEval() replaces
  normalizeEvaluateSource + ad-hoc wrap in page.evaluate()
- #2 Fix navigate race: check tab.status before addListener,
  reduced timeout 30s→15s

P1 Should Fix:
- #8 Remove unused permissions (scripting, host_permissions, content_scripts)
- #10 Add retry (3x, 500ms) + timeout (30s) to sendCommand()

P2 Cleanup:
- #3 Extract isWebUrl() to safely handle undefined tab.url
- #4 Sanitize maxDepth with Math.max/min bounds
- #6 Delete empty src/daemon/ directory
- #7 Remove dead createJsonRpcRequest + its test
- #9 Remove stale IIFE-mode comment
- #11 Validate body.id in daemon request handler
- #12 Guard ensureAttached: detach+re-attach on 'already attached'
- #14 Extract _tabOpt() helper (removes 13x spread duplication)
- #15 Add verbose warning for unsupported consoleMessages()
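
Two of the cleanup items (#3 and #4) are small guards that can be sketched as follows. Only the name isWebUrl comes from this PR; the http/https check, the name clampDepth, and the bounds 1..50 with a fallback of 10 are assumptions chosen for illustration.

```typescript
// Assumed meaning of "web URL": an http(s) URL. tab.url can be undefined,
// so the helper must accept that without throwing.
function isWebUrl(url: string | undefined): boolean {
  return typeof url === "string" && /^https?:\/\//i.test(url);
}

// Hypothetical bounds; the real limits used for maxDepth are not stated in the PR.
function clampDepth(value: unknown, min = 1, max = 50, fallback = 10): number {
  const n = Number(value);
  if (!Number.isFinite(n)) {
    return fallback; // non-numeric input falls back to a default
  }
  // Math.max/Math.min pin the value inside [min, max].
  return Math.max(min, Math.min(max, Math.floor(n)));
}
```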

All 35 unit tests pass.

Exponential backoff:
- Reconnect delay: 2s, 4s, 8s, 16s, ..., capped at 60s
- Resets to base delay on successful connection
- Reduces idle CPU waste vs fixed 3s reconnect
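
The reconnect schedule above can be sketched as a tiny state holder. The 2s base and 60s cap come from the commit message; the class name ReconnectBackoff and its methods are hypothetical.

```typescript
const BASE_DELAY_MS = 2000; // first retry after 2s
const MAX_DELAY_MS = 60000; // never wait longer than 60s

class ReconnectBackoff {
  private attempt = 0;

  // Returns the delay to wait before the next reconnect attempt,
  // doubling each time until the cap is reached.
  next(): number {
    const delay = Math.min(BASE_DELAY_MS * Math.pow(2, this.attempt), MAX_DELAY_MS);
    this.attempt += 1;
    return delay;
  }

  // Call on a successful connection so the next failure starts at the base delay.
  reset(): void {
    this.attempt = 0;
  }
}
```

A caller would pass `backoff.next()` to `setTimeout` in the WebSocket close handler and call `backoff.reset()` in the open handler.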

Screenshot via CDP Page.captureScreenshot:
- New 'screenshot' action in protocol (5th action)
- Supports format (png/jpeg), quality, fullPage
- Full-page: uses Emulation.setDeviceMetricsOverride for scroll height
- CLI-side: page.screenshot() with optional file save
- Extension build: 9.81KB (+1.7KB from 8.11KB)

Inspired by bb-browser's architecture patterns.

Extension side:
- Hook console.log/warn/error → forward via WS as { type: 'log', level, msg, ts }
- Original console output preserved (for chrome://extensions debug)
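
A rough sketch of the console hook, assuming the forwarder is injected as a callback. The message shape `{ type: 'log', level, msg, ts }` is from the commit message; hookConsole, ConsoleLike, and the way arguments are stringified are assumptions.

```typescript
type Level = "log" | "warn" | "error";

interface LogMessage {
  type: "log";
  level: Level;
  msg: string;
  ts: number;
}

// Minimal structural type so the hook can be exercised against a fake console.
type ConsoleLike = Record<Level, (...args: unknown[]) => void>;

const LEVELS: readonly Level[] = ["log", "warn", "error"];

function hookConsole(target: ConsoleLike, send: (m: LogMessage) => void): void {
  for (const level of LEVELS) {
    const original = target[level].bind(target);
    target[level] = (...args: unknown[]) => {
      // Keep the original output so chrome://extensions still shows it.
      original(...args);
      try {
        // Assumed formatting: arguments are stringified and joined with spaces.
        send({ type: "log", level, msg: args.map(String).join(" "), ts: Date.now() });
      } catch {
        // A failing transport must never break the code that called console.*.
      }
    };
  }
}
```

In the service worker, `send` would presumably serialize the message and write it to the daemon WebSocket when it is open.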

Daemon side:
- Ring buffer (200 entries) stores extension logs
- Logs printed to daemon stderr with emoji prefix (📋/⚠️/❌)
- GET /logs — returns buffered logs (optional ?level= filter)
- DELETE /logs — clears log buffer
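
The log storage behind these endpoints could look roughly like the following. The 200-entry capacity and the level filter are from the commit message; the class name LogRingBuffer, the entry shape, and the array-based eviction are assumptions made to keep the sketch short.

```typescript
interface LogEntry {
  level: string;
  msg: string;
  ts: number;
}

class LogRingBuffer {
  private entries: LogEntry[] = [];
  private readonly capacity: number;

  constructor(capacity = 200) {
    this.capacity = capacity;
  }

  // Append an entry, evicting the oldest one once the buffer is full.
  push(entry: LogEntry): void {
    this.entries.push(entry);
    if (this.entries.length > this.capacity) {
      this.entries.shift();
    }
  }

  // Backs GET /logs; an optional level narrows the result (?level=error).
  list(level?: string): LogEntry[] {
    return level ? this.entries.filter((e) => e.level === level) : [...this.entries];
  }

  // Backs DELETE /logs.
  clear(): void {
    this.entries = [];
  }
}
```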

Usage:
  curl localhost:19825/logs              # view all logs
  curl localhost:19825/logs?level=error  # errors only
  curl -X DELETE localhost:19825/logs    # clear buffer

Extension build: 10.48KB

Bug fixes:
- #1 /logs?level=error returned 404 — use pathname for route matching
- #2 Duplicate initialization — added 'initialized' guard flag
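
The 404 in #1 is the classic result of comparing the raw request URL, query string included, against a route. A sketch of matching on the pathname instead, using the standard URL class; routeOf and levelFilter are hypothetical helper names and the placeholder origin exists only so that relative request URLs can be parsed.

```typescript
// Request URLs seen by an HTTP server are relative ("/logs?level=error"),
// so a throwaway base is supplied for parsing.
const PLACEHOLDER_ORIGIN = "http://localhost";

function routeOf(method: string, rawUrl: string): string {
  const { pathname } = new URL(rawUrl, PLACEHOLDER_ORIGIN);
  return `${method.toUpperCase()} ${pathname}`;
}

function levelFilter(rawUrl: string): string | null {
  return new URL(rawUrl, PLACEHOLDER_ORIGIN).searchParams.get("level");
}
```

With this, `/logs` and `/logs?level=error` both resolve to the same route key, and the filter is read separately from the query string.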

Should fix:
- #4 Added screenshot() to IPage interface
- #5 Graceful shutdown rejects pending requests before exit
- #6 Use process.execPath instead of 'npx tsx' for faster daemon spawn
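
For #5, the idea is that callers waiting on the extension get an error instead of hanging when the daemon exits. A minimal sketch, assuming in-flight requests are tracked by id; PendingRequests and its method names are hypothetical.

```typescript
interface Pending {
  resolve: (value: unknown) => void;
  reject: (reason: Error) => void;
}

class PendingRequests {
  private readonly pending = new Map<string, Pending>();

  add(id: string, entry: Pending): void {
    this.pending.set(id, entry);
  }

  // Called when the extension answers; returns false for unknown ids.
  resolve(id: string, value: unknown): boolean {
    const entry = this.pending.get(id);
    if (!entry) {
      return false;
    }
    this.pending.delete(id);
    entry.resolve(value);
    return true;
  }

  // Called from the shutdown path before the process exits.
  // Returns how many requests were rejected.
  rejectAll(reason: string): number {
    const count = this.pending.size;
    this.pending.forEach((entry) => entry.reject(new Error(reason)));
    this.pending.clear();
    return count;
  }
}
```

A shutdown handler would call `rejectAll(...)` first, then close the HTTP and WebSocket servers.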

Cleanup:
- #7 Removed duplicate 'browser' keyword in package.json
- #8 Removed unused normalizeEvaluateSource import from browser.ts
- #9 Changed dynamic import to static import in intercept.ts
- #10 Added explicit throw at end of sendCommand for clarity

61 tests pass (4 test files). Extension: 10.55KB.

tsc --noEmit failed because createMockPage() was missing the
screenshot() method added to IPage in the round 2 review fix.

…chitecture

- Replace all Playwright MCP Bridge references with opencli Browser Bridge
- Remove token setup, MCP config, and manual setup sections
- Simplify prerequisites: just install extension, zero config
- Update troubleshooting: daemon status/logs commands
- Update env vars: add OPENCLI_DAEMON_PORT, OPENCLI_VERBOSE
- Update SKILL.md tags: mcp,playwright → chrome-extension,cdp

Updated 6 files:
- CDP.md, CDP.zh-CN.md: Browser Bridge instead of Playwright MCP Bridge
- CLI-ELECTRON.md: Browser Bridge / IPage abstraction wording
- CLI-EXPLORER.md: browser tools instead of Playwright MCP tools
- TESTING.md: Browser Bridge extension mode, removed token references
- src/clis/chatgpt/README{,.zh-CN}.md: CDP instead of Playwright

Zero Playwright references remaining across all .md files.

…eeded

- Remove 'Chrome Web Store' references (not published yet)
- Add detailed unpacked extension install steps (chrome://extensions)
- Remove 'restart Chrome' advice (Service Worker activates immediately)
- Direct users to chrome://extensions for troubleshooting

@jackwener merged commit 8bb03ec into main on Mar 19, 2026
9 checks passed