Before any action, route the task to the appropriate workflow:
- Check the task type by looking for these keywords:
  - 设计/创意/新功能/brainstorm (design, creative idea, new feature) → use the brainstorming workflow
  - 调试/bug/error/debug (debugging) → use the debugging workflow
  - 并发/并行/agent team/dispatch (concurrent, parallel) → use the parallel dispatch workflow
  - 长任务/harness/multi-step (long task) → use the harness workflow
  - 实现/implement/开发 (implement, develop) → use the implementation workflow
  - 分析/研究/调查 (analysis, research, investigation) → use the analysis workflow
- Invoke the aios-workflow-router skill to get the correct workflow.
- Do NOT skip this step. Even simple tasks benefit from proper routing.
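The keyword check above can be sketched as a small lookup. This is a minimal illustration, not the router skill's actual logic: the workflow identifiers, the first-match-wins rule, and the case-insensitive substring match are all assumptions.

```typescript
// Hypothetical workflow identifiers; the real router skill may use other names.
type Workflow =
  | "brainstorming"
  | "debugging"
  | "parallel-dispatch"
  | "harness"
  | "implementation"
  | "analysis";

// Keyword table copied from the list above, checked in the order listed.
const ROUTES: ReadonlyArray<{ workflow: Workflow; keywords: string[] }> = [
  { workflow: "brainstorming", keywords: ["设计", "创意", "新功能", "brainstorm"] },
  { workflow: "debugging", keywords: ["调试", "bug", "error", "debug"] },
  { workflow: "parallel-dispatch", keywords: ["并发", "并行", "agent team", "dispatch"] },
  { workflow: "harness", keywords: ["长任务", "harness", "multi-step"] },
  { workflow: "implementation", keywords: ["实现", "implement", "开发"] },
  { workflow: "analysis", keywords: ["分析", "研究", "调查"] },
];

// Returns the first workflow whose keyword appears in the task text,
// or undefined when nothing matches (assumption: first match wins).
function routeTask(task: string): Workflow | undefined {
  const text = task.toLowerCase();
  for (const route of ROUTES) {
    if (route.keywords.some((keyword) => text.includes(keyword))) {
      return route.workflow;
    }
  }
  return undefined;
}
```

A task that matches no keyword returns undefined here, which leaves the fallback decision to the caller rather than guessing a workflow.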
This repository is a local-first AI agent workspace centered on browser automation via MCP.
- mcp-server/: TypeScript Playwright MCP server (legacy/compat runtime code).
- scripts/run-browser-use-mcp.sh: default MCP launcher that bridges to /Users/molei/codes/ai-browser-book/mcp-browser-use.
- scripts/browser-use-bootstrap.py: browser-use bootstrap with optional module shims.
- mcp-server/src/index.ts: MCP server entry point and tool routing.
- mcp-server/src/browser/: browser launcher, profile manager, auth checks, and tool actions.
- config/: runtime configuration such as browser-profiles.json and safety-related settings.
- memory/: JSON knowledge/skills/specs used by agent workflows.
- docs/plans/: implementation and design plans.
- tasks/: task templates and execution tracking artifacts.
Do not manually edit mcp-server/dist/; it is generated output.
Run commands from mcp-server/ unless noted:
- npm install: install dependencies.
- npm run dev: run MCP server from TypeScript with tsx.
- npm run typecheck: strict TS type validation (tsc --noEmit).
- npm run build: compile to dist/.
- npm run start: run built server (node dist/index.js).
Typical local flow:
cd mcp-server
npm install
npm run typecheck && npm run build

- Language: TypeScript (ESM, strict mode).
- Indentation: 2 spaces; keep semicolons.
- File names: kebab-case for action modules (for example auth-check.ts), index.ts for module entry points.
- For mcp-server internals, tool names follow browser_*. For default runtime (browser-use), use chrome.launch_cdp / browser.connect_cdp / page.*.
- Keep configuration JSON keys stable; prefer additive changes over renaming.
- Repo-local discoverable skills must live under .codex/skills/ or .claude/skills/ (optionally .agents/skills/ only when the target client actually supports it). Do not invent parallel skill roots such as .baoyu-skills/*/SKILL.md; those are non-discoverable and should be plain docs or extension config only.
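The skill-root rule above could be checked mechanically. This is a minimal sketch, assuming repo-relative paths with forward slashes; the function name and the clientSupportsAgentsRoot flag are hypothetical.

```typescript
// Roots named in the guideline above as always discoverable.
const DISCOVERABLE_ROOTS = [".codex/skills/", ".claude/skills/"];

// Optional root, allowed only when the target client actually supports it.
const OPTIONAL_ROOT = ".agents/skills/";

// Returns true when a repo-relative path sits under a discoverable skill root.
function isDiscoverableSkillPath(path: string, clientSupportsAgentsRoot = false): boolean {
  const roots = clientSupportsAgentsRoot
    ? [...DISCOVERABLE_ROOTS, OPTIONAL_ROOT]
    : DISCOVERABLE_ROOTS;
  return roots.some((root) => path.startsWith(root));
}
```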
Automated suites are available for both root AIOS workflows and mcp-server.
Minimum verification for behavior changes:
- npm run test:scripts (repo root)
- cd mcp-server && npm run typecheck && npm run test && npm run build
- Manual MCP smoke test (chrome.launch_cdp -> browser.connect_cdp -> page.goto -> page.screenshot -> browser.close) when browser-flow behavior changes
Document manual test steps in PRs when behavior changes.
Git history follows Conventional Commit style:
- feat: ..., fix: ..., docs: ..., chore: ...
- Optional scope is common (for example feat(skills): ...).
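A commit subject in this style could be checked with a short pattern. The sketch below covers only the four types listed above, and the allowed scope characters (lowercase letters, digits, hyphens) are an assumption rather than a repository rule.

```typescript
// Types named in the guideline above; other Conventional Commit types are left out on purpose.
const COMMIT_TYPES = ["feat", "fix", "docs", "chore"] as const;

// A type, an optional "(scope)", then ": " and a non-empty subject.
const COMMIT_PATTERN = new RegExp(
  `^(${COMMIT_TYPES.join("|")})(\\([a-z0-9-]+\\))?: \\S.*$`
);

// Returns true when the subject line follows the style described above.
function isConventionalSubject(subject: string): boolean {
  return COMMIT_PATTERN.test(subject);
}
```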
PRs should include:
- concise problem/solution summary,
- affected paths,
- verification evidence (command output or checklist),
- screenshots/log snippets for browser-flow changes,
- linked task/issue when applicable.
- Never commit credentials, cookies, or personal browser profile data.
- Prefer CDP-based profile config in config/browser-profiles.json for stable login reuse.
- Preserve human-in-the-loop checks for auth walls and sensitive outbound actions.
- In this repo, prefer the puppeteer-stealth MCP server alias that now routes to browser-use MCP (scripts/run-browser-use-mcp.sh).
- For interactive browser work, use chrome.launch_cdp {"port":9222,"user_data_dir":"~/.chrome-cdp-profile"} then browser.connect_cdp.
- If multiple browser MCPs are available, do not use chrome-devtools for normal business flows; reserve it for low-level inspection/debugging only.
- Default reasoning order for page understanding: page.extract_text / page.get_html first, page.screenshot as visual fallback.
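The default reasoning order above can be expressed as a small decision function. This is a sketch under stated assumptions: the PageEvidence shape is hypothetical, and "usable" is simplified to "non-blank", whereas a real agent would judge sufficiency against the task at hand.

```typescript
// What has been read from the page so far; undefined means "not tried yet".
interface PageEvidence {
  text?: string;
  html?: string;
}

type ReadStep = "page.extract_text" | "page.get_html" | "page.screenshot" | "done";

// Hypothetical sufficiency check: any non-blank content counts as usable.
const isUsable = (content: string): boolean => content.trim().length > 0;

// Picks the next read tool: text first, then HTML, screenshot only as a fallback.
function nextReadStep(evidence: PageEvidence): ReadStep {
  if (evidence.text === undefined) return "page.extract_text";
  if (isUsable(evidence.text)) return "done";
  if (evidence.html === undefined) return "page.get_html";
  if (isUsable(evidence.html)) return "done";
  return "page.screenshot";
}
```

Keeping the decision separate from the tool calls themselves makes the ordering easy to test without a live browser.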
For substantial user requests, use this route by default:
- Select process skill before coding:
  - Design/new behavior: superpowers:brainstorming
  - Multi-step delivery: superpowers:writing-plans
  - Debug/failure analysis: superpowers:systematic-debugging
- Create a plan artifact in docs/plans/YYYY-MM-DD-<topic>.md.
- Apply long-running controls with aios-long-running-harness:
  - Lock objective, budgets, stop conditions, and required evidence.
  - Persist progress through ContextDB lifecycle (init -> session -> event -> checkpoint -> context:pack).
- Choose execution mode:
  - 2+ independent problem domains: use superpowers:dispatching-parallel-agents.
  - Shared-state or coupled changes: execute sequentially.
  - If real subagents are unavailable in the current runtime, emulate dispatch by splitting domain tasks explicitly and running only safe independent reads/checks in parallel.
  - For repeated multi-agent deliveries, prefer the reusable blueprints in memory/specs/orchestrator-blueprints.json and the shared handoff schema before merging parallel outputs.
- Finish with superpowers:verification-before-completion; do not claim success without checkpoint + artifact evidence.
For long tasks, announce the chosen route in the first progress update.
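Two of the route's decisions are mechanical enough to sketch: naming the plan artifact and choosing the execution mode. The slug rule (lowercase, non-alphanumerics collapsed to hyphens) and the use of the UTC calendar date are assumptions; the route only fixes the docs/plans/YYYY-MM-DD-<topic>.md shape.

```typescript
// Builds docs/plans/YYYY-MM-DD-<topic>.md; the slug rule is an assumption.
function planArtifactPath(topic: string, date: Date): string {
  const day = date.toISOString().slice(0, 10); // UTC calendar date
  const slug = topic
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
  return `docs/plans/${day}-${slug}.md`;
}

type ExecutionMode = "parallel" | "sequential";

// 2+ independent domains with no shared state run in parallel; anything else is sequential.
function chooseExecutionMode(independentDomains: number, sharedState: boolean): ExecutionMode {
  return independentDomains >= 2 && !sharedState ? "parallel" : "sequential";
}
```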
- cap is a repository shortcut for commit + push.
- Trigger: when the user message is exactly cap, execute this flow in the current repo.
- Required flow:
  - git status --short and confirm there are changes.
  - If behavior/commands/workflow changed, sync impacted skill docs first (keep .codex/skills/* and .claude/skills/* aligned).
  - git add -A.
  - Commit with a Conventional Commit message from current task context.
  - If no clear message is available, use fallback chore: cap snapshot <YYYY-MM-DD>.
  - git push (or set upstream once when required).
- If there are no changes, report a no-op instead of creating an empty commit.
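The cap flow above can be sketched as a planner that turns git status --short output into either a no-op or a list of git invocations. The CapPlan type and planCap name are hypothetical. The sketch skips the skill-doc sync and the set-upstream case, and takes the fallback date from the UTC calendar day, which is an assumption.

```typescript
type CapPlan =
  | { kind: "noop"; reason: string }
  | { kind: "run"; commands: string[][] };

// Plans the cap flow from `git status --short` output.
// `message` is the Conventional Commit subject from task context, if one is available.
function planCap(statusShort: string, today: Date, message?: string): CapPlan {
  if (statusShort.trim() === "") {
    // Nothing to commit: report a no-op instead of creating an empty commit.
    return { kind: "noop", reason: "no changes to commit" };
  }
  const day = today.toISOString().slice(0, 10); // UTC calendar date
  const subject = message?.trim() || `chore: cap snapshot ${day}`;
  // Commands are argv arrays so the subject never needs shell quoting.
  return {
    kind: "run",
    commands: [
      ["git", "add", "-A"],
      ["git", "commit", "-m", subject],
      ["git", "push"],
    ],
  };
}
```

Returning argv arrays instead of shell strings avoids quoting problems when the commit subject contains spaces or special characters.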
AIOS native enhancements are active in this repository.
Use repo-local skills, agents, and bootstrap docs before falling back to ad-hoc behavior.
ContextDB remains the shared runtime layer for memory, checkpoints, and execution evidence.
Browser MCP is available through the repo-local AIOS server and should be preferred for browser work.
For browser tasks, use this operating pattern unless the user explicitly asks otherwise:
- Connect to a visible CDP browser first: chrome.launch_cdp then browser.connect_cdp.
- On dense or dynamic pages, prefer page.semantic_snapshot first for compact headings/actions before choosing the next step.
- Before acting, read the page state with page.extract_text; use page.get_html only when text is insufficient.
- Work in short read -> act -> verify loops. Do not chain multiple blind browser actions.
- For clear button/link labels, prefer page.click_text before constructing low-level locators.
- Prefer visible text or role-based targets. If a locator is not unique, inspect again and narrow the target instead of guessing.
- After navigation or major actions, use page.wait when a state transition is expected, then re-read the page.
- Use page.screenshot only as a visual fallback when text/HTML evidence is not enough.
- For complex browser tasks, first summarize the current page, then state the next single action, then execute it.
- When puppeteer-stealth is available, use its browser-use toolchain (chrome.* / browser.* / page.*) for normal business flows instead of chrome-devtools.
- Prefer repo-local .codex/skills and .codex/agents.
- Keep work grounded in the AIOS runtime and verification flow.
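The read -> act -> verify pattern for browser tasks can be sketched as a planner for one click. The tool names come from the guidelines above. The ToolStep shape and the argument key text are illustrative assumptions, not the documented schema of the browser-use MCP tools.

```typescript
// One MCP tool call; argument names here are illustrative assumptions.
interface ToolStep {
  tool: string;
  args: Record<string, unknown>;
  phase: "read" | "act" | "verify";
}

// Plans a single read -> act -> verify loop for clicking a labelled control.
function planClick(label: string, expectsTransition: boolean): ToolStep[] {
  const steps: ToolStep[] = [
    { tool: "page.extract_text", args: {}, phase: "read" },
    { tool: "page.click_text", args: { text: label }, phase: "act" },
  ];
  if (expectsTransition) {
    // Wait only when a state transition is expected.
    steps.push({ tool: "page.wait", args: {}, phase: "verify" });
  }
  // Always re-read so the next decision rests on fresh page state.
  steps.push({ tool: "page.extract_text", args: {}, phase: "verify" });
  return steps;
}
```

Each plan contains exactly one acting step, which keeps the loop short and prevents several blind actions from being chained.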