Sync by metehanozdev · Pull Request #2 · emregucerr/stagehand

metehanozdev · 2025-08-09T01:42:18Z


# why

if no options are provided, we need to provide an empty object

# what changed

when options is empty, we fall back to an empty object

# test plan

- tested locally
- tested on api

---

## Summary by cubic

Default options to {} for agentExecute calls in V3 to prevent errors when no options are provided. This makes API agent execution more resilient.

## Why:

- agentExecute expects an object; undefined options caused failures.

## What:

- Use options ?? {} when calling apiClient.agentExecute in V3.

## Test Plan:

- [x] Call agent without options; verify it executes without error.
- [x] Call agent with options; verify values are passed unchanged.
- [x] Smoke test on API with an active page; confirm normal behavior.

Written for commit 8980e9f. Summary will update automatically on new commits.
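The fallback described above can be sketched as follows. This is a minimal illustration, not the actual Stagehand code: the `AgentExecuteOptions` shape, the `agentExecute` stand-in, and the `runAgent` wrapper are all hypothetical; only the `options ?? {}` pattern comes from the commit.

```typescript
// Hypothetical option shape; the real agentExecute options are not shown in the source.
interface AgentExecuteOptions {
  maxSteps?: number;
}

// Stand-in for apiClient.agentExecute: it reads a property off `options`,
// so passing undefined would throw a TypeError at runtime.
function agentExecute(instruction: string, options: AgentExecuteOptions): string {
  const steps = options.maxSteps ?? 10;
  return `${instruction} (maxSteps=${steps})`;
}

// Caller-side fix: fall back to an empty object when no options are given.
function runAgent(instruction: string, options?: AgentExecuteOptions): string {
  return agentExecute(instruction, options ?? {});
}
```

Using `??` rather than `||` keeps the intent narrow: only `undefined` and `null` trigger the fallback, and any object the caller passes is forwarded unchanged.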

# why

# what changed

# test plan

---

## Summary by cubic

Fixes the release workflow by enabling OIDC (id-token) permissions and removing the extra npm auth env. This unblocks publish and canary runs on main.

## Why:

- Release jobs were failing due to missing OIDC permission and conflicting NODE_AUTH_TOKEN.
- We only need GITHUB_TOKEN and id-token for our publish steps.

## What:

- Add id-token: write to workflow permissions.
- Remove NODE_AUTH_TOKEN from publish and canary steps.
- Keep GITHUB_TOKEN for GitHub operations.

## Test Plan:

- [ ] Push to main and confirm “Publish Canary” job completes.
- [ ] Create a release/tag and confirm “Publish” job completes with no auth errors.
- [ ] Verify logs show an OIDC token request (id-token) during publish.
- [ ] Confirm the package or canary release appears in the registry/releases.

Written for commit 18a81b8. Summary will update automatically on new commits.

# why

# what changed

# test plan

---

## Summary by cubic

Simplified the canary publish step by removing manual pnpm auth setup in the release workflow. The job now relies on NODE_AUTH_TOKEN from the environment.

## Why:

- The pnpm auth config was redundant and could add noisy logs.
- Publishing already uses NODE_AUTH_TOKEN via env.

## What:

- Removed `pnpm config set //registry.npmjs.org/:_authToken=${NODE_AUTH_TOKEN}` from the Publish Canary step.

## Test Plan:

- [ ] Trigger the release workflow on main.
- [ ] Verify canary publish completes successfully.
- [ ] Confirm auth is picked up from NODE_AUTH_TOKEN.
- [ ] Ensure no token value appears in logs.

Written for commit d0908db. Summary will update automatically on new commits.

# why

# what changed

# test plan

---

## Summary by cubic

Enable npm Trusted Publishing in the release workflow by upgrading to setup-node v4 and installing npm 11.5+ in CI. This should allow tokenless canary publishes from CI.

## Why:

- Trusted Publishing requires npm >= 11.5.1.
- Our release job used setup-node v3 and an older npm.

## What:

- Upgrade actions/setup-node from v3 to v4 in release.yml.
- Add step to install latest npm globally (ensures >= 11.5.1).

## Test Plan:

- [ ] Run the canary release workflow on this branch.
- [ ] Verify npm -v in logs shows >= 11.5.1.
- [ ] Confirm publish succeeds without NPM_TOKEN (uses OIDC Trusted Publishing).
- [ ] Check the canary publish on npm (version and tag).

Written for commit 7ab34e4. Summary will update automatically on new commits.

# why

- v2 jump links for `extract` were broken
- v3 jump links for `extract` were also broken + included a redundant section

# what changed

<img width="707" height="797" alt="image" src="https://github.com/user-attachments/assets/9b4bce83-a126-4627-854e-90d2b812b19a" />

---

## Summary by cubic

Fix broken jump links in the v2 and v3 extract docs and remove a redundant section to improve navigation and clarity.

- **Bug Fixes**
  - v2: Updated "prompt-only" anchor to #prompt-only-extraction.
  - v3: Standardized anchors to #instruction-only and #no-parameters; fixed Card link to #basic-schema.
  - v3: Removed duplicate "Using extract()" section; renamed to "Return value".

Written for commit 29acc0d. Summary will update automatically on new commits.

## Summary

Adds callback support to the Stagehand agent for both streaming and non-streaming execution modes, allowing users to hook into various stages of agent execution.

## Changes

### New Types (`lib/v3/types/public/agent.ts`)

Added callback interfaces for agent execution:

- **`AgentCallbacks`** - Base callbacks shared between modes:
  - `prepareStep` - Modify settings before each LLM step
  - `onStepFinish` - Called after each step completes
- **`AgentExecuteCallbacks`** - Non-streaming mode callbacks (extends `AgentCallbacks`)
- **`AgentStreamCallbacks`** - Streaming mode callbacks (extends `AgentCallbacks`):
  - `onChunk` - Called for each streamed chunk
  - `onFinish` - Called when stream completes
  - `onError` - Called on stream errors
  - `onAbort` - Called when stream is aborted
- **`AgentExecuteOptionsBase`** - Base options without callbacks
- **`AgentExecuteOptions`** - Non-streaming options with `AgentExecuteCallbacks`
- **`AgentStreamExecuteOptions`** - Streaming options with `AgentStreamCallbacks`

### Handler Updates (`lib/v3/handlers/v3AgentHandler.ts`)

- Modified `createStepHandler` to accept optional user callback
- Updated `execute()` to pass callbacks to `generateText`
- Updated `stream()` to pass callbacks to `streamText`

### Type Safety

Added compile-time enforcement that streaming-only callbacks (`onChunk`, `onFinish`, `onError`, `onAbort`) can only be used with `stream: true`:

```typescript
// Works - streaming callbacks with stream: true
const agent = stagehand.agent({ stream: true });
await agent.execute({
  instruction: "...",
  callbacks: { onChunk: async (chunk) => console.log(chunk) }
});

// Compile error - streaming callbacks without stream: true
const agent = stagehand.agent({ stream: false });
await agent.execute({
  instruction: "...",
  callbacks: { onChunk: async (chunk) => console.log(chunk) } // Error: "This callback requires 'stream: true' in AgentConfig..."
});
```

## Type Castings Explained

Several type assertions were necessary due to TypeScript's limitations with conditional types:

### 1. Callback extraction in handlers

```typescript
const callbacks = (instructionOrOptions as AgentExecuteOptions).callbacks as
  | AgentExecuteCallbacks
  | undefined;
```

**Why:** `instructionOrOptions` can be `string | AgentExecuteOptions`. When it's a string, there are no callbacks. We cast after the `prepareAgent` call because at that point we know it's been resolved to options.

### 2. Streaming vs non-streaming branch in v3.ts

```typescript
result = await handler.execute(
  instructionOrOptions as string | AgentExecuteOptions,
);
```

**Why:** The implementation signature accepts `string | AgentExecuteOptions | AgentStreamExecuteOptions` to satisfy both overloads, but within the non-streaming branch we know it's the non-streaming type. TypeScript can't narrow based on the `isStreaming` runtime check.

### 3. Error fallback in stream()

```typescript
return {
  textStream: (async function* () {})(),
  result: resultPromise,
} as unknown as AgentStreamResult;
```

**Why:** When `prepareAgent` fails in streaming mode, we return a minimal object with just `textStream` and `result`. This doesn't satisfy all properties of `StreamTextResult`, but the `result` promise will reject with the actual error. The double cast (`as unknown as`) is needed because TypeScript knows this partial object doesn't match the full type.

## Usage Example

```typescript
const agent = stagehand.agent({
  stream: true,
  model: "anthropic/claude-sonnet-4-20250514",
});

const result = await agent.execute({
  instruction: "Search for something",
  maxSteps: 20,
  callbacks: {
    prepareStep: async (ctx) => {
      console.log("Preparing step...");
      return ctx;
    },
    onStepFinish: async (event) => {
      console.log(`Step finished: ${event.finishReason}`);
      if (event.toolCalls) {
        for (const tc of event.toolCalls) {
          console.log(`Tool used: ${tc.toolName}`);
        }
      }
    },
    onChunk: async (chunk) => {
      // Process each chunk
    },
    onFinish: (event) => {
      console.log(`Completed in ${event.steps.length} steps`);
    },
  },
});

for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}

const finalResult = await result.result;
console.log(finalResult.message);
```

## Testing

- Added `agent-callbacks.spec.ts` with tests for:
  - Non-streaming callbacks (`onStepFinish`, `prepareStep`)
  - Streaming callbacks (`onChunk`, `onFinish`, `prepareStep`, `onStepFinish`)
  - Combined callback usage
  - Tool call information in callbacks

---

## Summary by cubic

Adds lifecycle callbacks to the Stagehand agent for both non-streaming and streaming modes, letting users hook into steps, chunks, finish, and errors. Strong type safety and runtime validation prevent using streaming-only callbacks without stream: true; callbacks remain behind experimental.

- **New Features**
  - Added runtime errors in non-streaming mode when onChunk/onFinish/onError/onAbort are provided, with clear messages instructing to set stream: true.

Written for commit 15c08d1. Summary will update automatically on new commits.

---------

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
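The runtime validation mentioned in the summary could look roughly like the sketch below. The helper name `assertCallbacksMatchMode`, the loose `CallbackBag` type, and the error wording are assumptions for illustration; only the four streaming-only callback names come from the PR description.

```typescript
// Callback names the PR lists as streaming-only.
const STREAMING_ONLY_CALLBACKS = ["onChunk", "onFinish", "onError", "onAbort"] as const;

// Loose callback bag for this sketch; the real AgentCallbacks types are richer.
type CallbackBag = Record<string, unknown>;

// Hypothetical helper: throws if a streaming-only callback is supplied
// while the agent is configured without `stream: true`.
function assertCallbacksMatchMode(
  callbacks: CallbackBag | undefined,
  stream: boolean,
): void {
  if (stream || callbacks === undefined) return;
  for (const name of STREAMING_ONLY_CALLBACKS) {
    if (callbacks[name] !== undefined) {
      throw new Error(`"${name}" requires 'stream: true' in AgentConfig.`);
    }
  }
}
```

A runtime check like this complements the compile-time enforcement: it still catches misuse from plain JavaScript callers or from code that bypasses the types with a cast.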

# why

- we currently have `context.addInitScript()`, which adds initialization scripts to all new pages
- we want users to be able to add init scripts to individual pages

# what changed

- adds `page.addInitScript()`, which just wraps & exposes logic that was already in place for `context.addInitScript()`, namely `normalizeInitScriptSources()` and `registerInitScript()`

# test plan

- added unit tests

---

## Summary by cubic

Adds page.addInitScript() to inject initialization scripts for a single page across navigations. Enables per-page scoping and function args, mirroring Playwright behavior.

## Why:

- We only had context.addInitScript() (all pages). Users need per-page control.

## What:

- New API: page.addInitScript(script, arg?)
- Shared helper: normalizeInitScriptSource in lib/v3/understudy/initScripts.ts, used by page and context
- New type: InitScriptSource in internal types
- Docs: page reference updated with usage and semantics
- Changeset: patch release entry

## Test Plan:

- [x] Unit tests cover:
  - Runs on real navigations
  - Scoped to the page only
  - Supports function args
- [ ] Manual: override Math.random via page.addInitScript, navigate, confirm it applies only on that page
- [ ] Docs: follow the example in page.mdx to verify behavior and API shape

Written for commit cf41afd. Summary will update automatically on new commits.
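To illustrate what normalizing an init script with a function argument might involve, here is a minimal sketch. The helper name matches the one mentioned above, but its signature, the `InitScriptSource` union, and the serialization strategy are assumptions, not the code in `lib/v3/understudy/initScripts.ts`.

```typescript
// Hypothetical source union: either raw script text or a function taking one argument.
type InitScriptSource<Arg> = string | ((arg: Arg) => unknown);

// Turns a script source into a string that can be evaluated in the page.
function normalizeInitScriptSource<Arg>(
  script: InitScriptSource<Arg>,
  arg?: Arg,
): string {
  // Raw source is passed through untouched.
  if (typeof script === "string") return script;
  // Functions are serialized into a self-invoking expression, with the
  // argument embedded as JSON (so it must be JSON-serializable).
  const serializedArg = arg === undefined ? "undefined" : JSON.stringify(arg);
  return `(${script.toString()})(${serializedArg})`;
}
```

Because the function is serialized with `toString()`, it cannot close over variables from the calling scope; anything it needs has to arrive through the argument.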

# why

adds support for using claude 4.5 opus with cua

# what changed

added opus to model maps

# test plan

---

## Summary by cubic

Adds support for Anthropic Claude 4.5 Opus in CUA. Registers anthropic/claude-opus-4-5-20251101 and maps it to the Anthropic provider.

Written for commit 2e54c27. Summary will update automatically on new commits.


## 🤖 Installing Claude Code GitHub App This PR adds a GitHub Actions workflow that enables Claude Code integration in our repository. ### What is Claude Code? [Claude Code](https://claude.com/claude-code) is an AI coding agent that can help with: - Bug fixes and improvements - Documentation updates - Implementing new features - Code reviews and suggestions - Writing tests - And more! ### How it works Once this PR is merged, we'll be able to interact with Claude by mentioning @claude in a pull request or issue comment. Once the workflow is triggered, Claude will analyze the comment and surrounding context, and execute on the request in a GitHub action. ### Important Notes - **This workflow won't take effect until this PR is merged** - **@claude mentions won't work until after the merge is complete** - The workflow runs automatically whenever Claude is mentioned in PR or issue comments - Claude gets access to the entire PR or issue context including files, diffs, and previous comments ### Security - Our Anthropic API key is securely stored as a GitHub Actions secret - Only users with write access to the repository can trigger the workflow - All Claude runs are stored in the GitHub Actions run history - Claude's default tools are limited to reading/writing files and interacting with our repo by creating comments, branches, and commits. - We can add more allowed tools by adding them to the workflow file like: ``` allowed_tools: Bash(npm install),Bash(npm run build),Bash(npm run lint),Bash(npm run test) ``` There's more information in the [Claude Code action repo](https://github.com/anthropics/claude-code-action). After merging this PR, let's try mentioning @claude in a comment on any PR to get started!  --- ## Summary by cubic Adds GitHub Actions to integrate Claude Code for automated PR reviews and comment-triggered help. Enables @claude to review code and perform tasks using repository context. 
- **New Features** - claude.yml: Runs when @claude is mentioned in issue/PR comments or reviews, or in issue title/body; uses anthropics/claude-code-action@v1 with actions: read to access CI; requires secrets.ANTHROPIC_API_KEY. - claude-code-review.yml: Auto-reviews PRs on open/sync for quality, bugs, performance, security, and tests; posts feedback via gh; uses claude.md for guidance; includes optional filters and limited allowed tools. - **Migration** - Add ANTHROPIC_API_KEY to repository secrets. - Merge this PR, then mention @claude in a PR or issue to trigger. - Optional: adjust file path filters, author filters, or allowed tools. Written for commit d7e4303. Summary will update automatically on new commits.  --------- Co-authored-by: Miguel <36487034+miguelg719@users.noreply.github.com> Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>


Keeps `@claude` support but drops the auto PR reviews by Claude; we already have plenty of auto-review feedback from greptile and cubic.

---

## Summary by cubic

Removed the Claude PR review GitHub Action to stop automatic reviews. Keeps @claude support and reduces duplicate bot feedback, since greptile and cubic already provide auto-review.

Written for commit 740d927. Summary will update automatically on new commits.

# why

After the transition to v3, the model handling for agent evals was not updated to account for new model formats

# what changed

- added isCua flag and two separate model maps to allow for models that can be run with CUA and without
- adjusted model handling to properly parse CUA models
- added a tag to distinguish whether the run is using CUA or not

# test plan

- tested evals for CUA and non-CUA

---

## Summary by cubic

Updated the agent evals CLI to support and correctly run both CUA and non-CUA agent models in v3. Fixes agent model parsing and enables mixed eval runs.

- **New Features**
  - Split agent models into standard and CUA lists; added getAgentModelEntries with a cua flag.
  - Passed isCUA through EvalInput to initV3 and tasks; selects a safe internal model for handlers when CUA.
  - Improved provider lookup and error messages for CUA models using short names; testcases now tag models as "cua" or "agent".

Written for commit 13b906c. Summary will update automatically on new commits.
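The split into standard and CUA model lists can be pictured with the sketch below. The function name `getAgentModelEntries` and the "cua"/"agent" tags come from the summary above; the entry shape, the list constants, the `tagForEntry` helper, and the choice of example models are assumptions.

```typescript
// Hypothetical entry shape pairing a model name with its CUA flag.
interface AgentModelEntry {
  modelName: string;
  cua: boolean;
}

// Illustrative lists only; the real eval config is not shown in the source.
const STANDARD_AGENT_MODELS: string[] = ["openai/gpt-4.1-mini"];
const CUA_AGENT_MODELS: string[] = ["anthropic/claude-sonnet-4-5-20250929"];

// Merges both lists into one set of entries that a mixed eval run can iterate over.
function getAgentModelEntries(): AgentModelEntry[] {
  return [
    ...STANDARD_AGENT_MODELS.map((modelName) => ({ modelName, cua: false })),
    ...CUA_AGENT_MODELS.map((modelName) => ({ modelName, cua: true })),
  ];
}

// Tag used to distinguish runs in test cases.
function tagForEntry(entry: AgentModelEntry): "cua" | "agent" {
  return entry.cua ? "cua" : "agent";
}
```

Carrying the flag on each entry, rather than inferring it from the model name, lets the same model appear in both lists if it supports both modes.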

# why

- to clean up the actHandler before #1330

---

## Summary by cubic

Refactors actHandler to centralize LLM action parsing and execution, reduce duplication, and improve metrics reporting. Behavior stays the same, with clearer naming and more reliable two-step and fallback flows.

## Why:

- Reduce duplicated LLM calls and normalization logic.
- Improve readability and maintainability.
- Ensure consistent metrics and variable substitution.
- Make the self-heal/fallback path more robust.

## What:

- Renamed actFromObserveResult to takeDeterministicAction and updated all call sites (ActCache, AgentCache, v3).
- Added getActionFromLLM for inference, metrics, normalization, and variable substitution.
- Added recordActMetrics to centralize ACT metrics reporting.
- Extracted normalizeActInferenceElement and substituteVariablesInArguments helpers.
- Simplified two-step act flow and fallback retry using shared helpers.
- Kept existing behavior (selector normalization, variable substitution, retries).

## Test Plan:

- [ ] Run unit tests for actHandler to confirm no regressions.
- [ ] Verify single-step actions execute as before.
- [ ] Verify two-step flow triggers when LLM returns twoStep and executes the second action.
- [ ] Confirm fallback self-heal path updates selector and retries successfully.
- [ ] Check metrics are recorded once per inference call in both steps and fallback.
- [ ] Validate variable substitution replaces %key% tokens in action arguments.
- [ ] Exercise AgentCache and ActCache paths to ensure takeDeterministicAction works end-to-end.
- [ ] Build passes and type checks for all renamed method references.

Written for commit 08d8454. Summary will update automatically on new commits.
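The `%key%` substitution mentioned in the test plan can be sketched as below. The helper name is taken from the summary, but the signature and the decision to leave unknown tokens untouched are assumptions; the real helper may behave differently.

```typescript
// Replaces %key% tokens in each argument with the matching variable value.
// Tokens without a matching variable are left as-is (an assumption of this sketch).
function substituteVariablesInArguments(
  args: string[],
  variables: Record<string, string>,
): string[] {
  return args.map((arg) =>
    arg.replace(/%(\w+)%/g, (token: string, key: string) =>
      Object.prototype.hasOwnProperty.call(variables, key) ? variables[key] : token,
    ),
  );
}
```

For example, `substituteVariablesInArguments(["%username%"], { username: "alice" })` yields `["alice"]`, which keeps secrets such as passwords out of the prompt sent to the LLM while still reaching the executed action.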


great catch from @loic-carbonne

# why

- currently it is not possible to rerun a cached agent run with a different prompt
- therefore, this docs example is misleading

# what changed

- removed misleading example

---

## Summary by cubic

Removed the incorrect docs example that suggested cached agent workflows can be reused with different inputs. This aligns the deterministic agent page with current behavior where each instruction generates a new cache key, so runs cannot be rerun with a different prompt.

Written for commit 4908805. Summary will update automatically on new commits.

Co-authored-by: Loïc Carbonne <loic.carbonne.mail@gmail.com>

…1330)

# why

- async functions invoked by act, extract, and observe all continued to run even after the timeout was reached

# what changed

- this PR introduces a time-remaining check mechanism which runs between each major IO operation inside each of the handlers
- this ensures that user-defined timeouts are actually respected inside of act, extract, and observe

# test plan

- added tests to confirm that internal async functions do not continue running after the timeout is reached

---

## Summary by cubic

Fixes act, extract, and observe to truly honor the timeout parameter with step-wise guards that abort early and return clear errors. Deterministic actions now use the same guard path in v3.

- **Bug Fixes**
  - Added createTimeoutGuard and specific ActTimeoutError, ExtractTimeoutError, and ObserveTimeoutError (exported).
  - Replaced Promise.race with per-step checks across snapshot capture, LLM inference, action execution, and self-heal retries.
  - Enforced per-step timeouts in ActHandler.takeDeterministicAction; metrics unchanged.
  - Wired v3 deterministic actions to pass a timeout guard; shadow DOM and unsupported actions behavior unchanged.

Written for commit d6bbfb8. Summary will update automatically on new commits.

---------

Co-authored-by: miguel <miguelg71921@gmail.com>
Co-authored-by: Miguel <36487034+miguelg719@users.noreply.github.com>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
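A per-step guard of the kind described above might be sketched like this. The name `createTimeoutGuard` appears in the summary, but its signature here, the `StepTimeoutError` class, the injectable clock, and the `runSteps` driver are assumptions made for illustration.

```typescript
// Hypothetical error type; the PR exports ActTimeoutError and friends,
// whose exact shapes are not shown in the source.
class StepTimeoutError extends Error {
  constructor(step: string, timeoutMs: number) {
    super(`Timed out after ${timeoutMs}ms before step "${step}"`);
    this.name = "StepTimeoutError";
  }
}

// Returns a check function that throws once the deadline has passed.
// `now` is injectable so the guard can be tested without real waiting.
function createTimeoutGuard(
  timeoutMs: number | undefined,
  now: () => number = Date.now,
): (step: string) => void {
  if (timeoutMs === undefined) return () => {};
  const deadline = now() + timeoutMs;
  return (step: string) => {
    if (now() >= deadline) throw new StepTimeoutError(step, timeoutMs);
  };
}

// Runs named async steps in order, checking the guard between major IO operations.
async function runSteps(
  steps: Array<[string, () => Promise<void>]>,
  timeoutMs?: number,
): Promise<string[]> {
  const ensureTimeRemaining = createTimeoutGuard(timeoutMs);
  const completed: string[] = [];
  for (const [name, run] of steps) {
    ensureTimeRemaining(name); // abort before starting work that is already past the deadline
    await run();
    completed.push(name);
  }
  return completed;
}
```

Unlike `Promise.race`, which only stops the caller from waiting, a check between steps prevents later steps from starting at all once the deadline has passed. A step that is already in flight still runs to completion in this sketch.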

# why

our slack link expired

# what changed

updated slack invite link

# test plan

---

## Summary by cubic

Replaced the expired Slack invite link with a new working one. Updated the core README and contributing docs so contributors can join the community without broken links.

Written for commit 9f0b262. Summary will update automatically on new commits.

# why

Users don't know about the v2/v3 version toggle in the docs navigation.

# what changed

Added a banner at the top of the v3 docs pages to help users easily discover Stagehand Python (v2).

# test plan

n/a

---

## Summary by cubic

Added a reusable banner to the top of all v3 docs pages to highlight the Stagehand Python (v2) option. Improves discoverability of the v2/v3 toggle and reduces confusion.

- **New Features**
  - Added V3Banner MDX snippet linking to “/v2/first-steps/introduction”.
  - Imported and rendered the banner across v3 Basics, Best Practices, Configuration, First Steps, Integrations, Migrations, and References pages.
  - Minor metadata/formatting updates in v2 docs (e.g., User Data frontmatter) for consistency.

Written for commit 515a13d. Summary will update automatically on new commits.

# why

Anthropic agents in CUA mode are unable to issue key presses (not to be confused with `type` actions)

# what changed

The Anthropic tool `computer_20250124` replies with the following format:

```ts
{ "action":"key", "text":"BackSpace" }
```

This wasn't properly mapped to our internal action abstraction, `keypress`, which accepts the parameter `keys`; the action was issued directly in the Anthropic format. Updated `AnthropicCUAClient.ts` to account for this and map it appropriately.

# test plan

- [x] Tested on sample eval

---

## Summary by cubic

Fixes key action mapping in Anthropic CUA so agents can send key presses (e.g., Backspace) correctly instead of failing on the "key" action.

- **Bug Fixes**
  - Map Anthropic "key" to internal "keypress" and pass keys from input.text.
  - Remove the old "key" path and Playwright key mapping to avoid mismatches.

Written for commit b9716b9. Summary will update automatically on new commits.

---------

Co-authored-by: Sean McGuire <75873287+seanmcguire12@users.noreply.github.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
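The mapping described above can be sketched as follows. The `action: "key"` / `text` input shape and the `keypress` / `keys` names come from the description; the `KeypressAction` interface, the array form of `keys`, and the `mapKeyAction` helper are assumptions rather than the code in `AnthropicCUAClient.ts`.

```typescript
// Shape of the Anthropic computer-tool input shown in the description.
interface AnthropicKeyInput {
  action: "key";
  text: string;
}

// Hypothetical internal action; the source only states that `keypress` accepts `keys`.
interface KeypressAction {
  type: "keypress";
  keys: string[];
}

// Translates the provider-specific "key" action into the internal abstraction
// instead of forwarding the Anthropic format unchanged.
function mapKeyAction(input: AnthropicKeyInput): KeypressAction {
  return { type: "keypress", keys: [input.text] };
}
```

Doing the translation at the client boundary keeps downstream handlers provider-agnostic: they only ever see `keypress`, whichever model produced the action.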

# why

The banner was hard-coded for light mode only

# what changed

<img width="706" height="316" alt="image" src="https://github.com/user-attachments/assets/64fadf31-a96e-43ae-b435-7082db9b6a64" />
<img width="707" height="314" alt="image" src="https://github.com/user-attachments/assets/515ab34a-f040-4574-89bf-7c2d621a63e6" />

# test plan

---

## Summary by cubic

Fixed the V3 docs banner to support light mode while preserving dark mode styling. Added light-theme border, background, and text colors with dark: variants and aligned link hover states to improve readability.

Written for commit 14ab04f. Summary will update automatically on new commits.

…rve/extract, CLICK/HOVER/SCROLL, and CDP (#1283) # why Clarify where the execution flow goes when stagehand runs by showing more detailed logs. <img width="1443" height="529" alt="image" src="https://github.com/user-attachments/assets/1c85f91e-de94-46c3-8226-fe42d4c3e338" /> # what changed Adds a log line printed at the beginning and end of each layer's execution: 1. 🅰 Agent TASK: top-level user intent: when agent.execute('<intent here>') is called (the initial entrypoint) 2. 🆂 Stagehand STEP: any call to .act(...) .extract() or .observe() 3. 🆄 Understudy ACTION: any playwright or browser interaction api action dispatched, e.g. CLICK, HOVER, SCROLL, etc. 4. 🧠 LLM req/resp, 🅲 CDP CALL/Event: any LLM calls or CDP websocket msgs to/from the browser Log lines are written to `./.browserbase/sessions/{sessionId}/{agent,stagehand,understudy,cdp}.log` at runtime, and can be followed in a single unified screen by doing: `tail -f ./.browserbase/sessions/latest/*.log` # test plan Test by running: ```bash # (make sure `OPENAI_API_KEY` and `ANTHROPIC_API_KEY` are both set in env too) export BROWSERBASE_CONFIG_DIR=./.browserbase nano packages/core/examples/flowLoggingJourney.ts # paste in contents (it's just a basic test of the main apis) pnpm tsx packages/core/examples/flowLoggingJourney.ts & tail -f ./.browserbase/sessions/latest/* ``` `flowLoggingJourney.ts`: ```typescript import { Stagehand } from "../lib/v3"; async function run(): Promise<void> { const openaiKey = process.env.OPENAI_API_KEY; const anthropicKey = process.env.ANTHROPIC_API_KEY; if (!openaiKey || !anthropicKey) { throw new Error( "Set both OPENAI_API_KEY and ANTHROPIC_API_KEY before running this demo.", ); } const stagehand = new Stagehand({ env: "LOCAL", verbose: 2, model: { modelName: "openai/gpt-4.1-mini", apiKey: openaiKey }, localBrowserLaunchOptions: { headless: true, args: ["--window-size=1280,720"], }, disablePino: true, }); try { await stagehand.init(); const [page] = stagehand.context.pages(); await 
page.goto("https://example.com/", { waitUntil: "load" }); // Test standard agent path const agent = stagehand.agent({ systemPrompt: "You are a QA assistant. Keep answers short and deterministic. Finish quickly.", }); const agentResult = await agent.execute( "Glance at the Example Domain page and confirm that you see the hero text.", ); console.log("Agent result:", agentResult); // Test CUA (Computer Use Agent) path await page.goto("https://example.com/", { waitUntil: "load" }); const cuaAgent = stagehand.agent({ cua: true, model: { modelName: "anthropic/claude-sonnet-4-5-20250929", apiKey: anthropicKey, }, }); const cuaResult = await cuaAgent.execute({ instruction: "Click on the 'More information...' link on the page.", maxSteps: 3, }); console.log("CUA Agent result:", cuaResult); const observations = await stagehand.observe("Find any links on the page"); console.log("Observe result:", observations); if (observations.length > 0) { await stagehand.act(observations[0]); } else { await stagehand.act("click the link on the page"); } const extraction = await stagehand.extract( "Summarize the current page title and contents in a single sentence", ); console.log("Extraction result:", extraction); } finally { await stagehand.close({ force: true }).catch(() => {}); } } run().catch((error) => { console.error(error); process.exitCode = 1; }); ``` EXPECTED OUTPUT: ```bash 2025-12-08 12:20:26.23300 ⤑ ⤑ [🆄 #694a GOTO] ▷ Page.goto({args:[https://example.com/,{waitUntil:load}]}) 2025-12-08 12:20:26.23401 ⤑ ⤑ [🆄 #694a GOTO] [🅲 #FE7B CDP] ⏵ Page.navigate({url:https://example.com/}) 2025-12-08 12:20:26.26402 ⤑ ⤑ [🆄 #694a GOTO] [🅲 #FE7B CDP] ⏴ Page.frameStartedNavigating({frameId:8A6B…FE7B,u…rId:F41F…7B31,navigationType:differentDocument}) 2025-12-08 12:20:26.26403 ⤑ ⤑ [🆄 #694a GOTO] [🅲 #FE7B CDP] ⏴ Page.frameStartedLoading({frameId:8A6B…FE7B}) 2025-12-08 12:20:26.57304 ⤑ ⤑ [🆄 #694a GOTO] [🅲 #FE7B CDP] ⏵ Page.setLifecycleEventsEnabled({enabled:true}) 2025-12-08 12:20:26.57605 ⤑ ⤑ [🆄 
#694a GOTO] [🅲 #FE7B CDP] ⏴ Page.frameNavigated({frame:{id:8A6B…FE7B,loaderI…tIsolated,gatedAPIFeatures:[]},type:Navigation}) 2025-12-08 12:20:26.57706 ⤑ ⤑ [🆄 #694a GOTO] [🅲 #FE7B CDP] ⏴ Network.policyUpdated({}) 2025-12-08 12:20:26.57807 ⤑ ⤑ [🆄 #694a GOTO] [🅲 #FE7B CDP] ⏴ Runtime.consoleAPICalled({type:info,args:[{type:…ptId:5,url:",lineNumber:0,columnNumber:2837}]}}) 2025-12-08 12:20:26.57908 ⤑ ⤑ [🆄 #694a GOTO] [🅲 #FE7B CDP] ⏴ Page.domContentEventFired({timestamp:545864.312948}) 2025-12-08 12:20:26.58009 ⤑ ⤑ [🆄 #694a GOTO] [🅲 #FE7B CDP] ⏴ Page.loadEventFired({timestamp:545864.313355}) 2025-12-08 12:20:26.58110 ⤑ ⤑ [🆄 #694a GOTO] [🅲 #FE7B CDP] ⏴ Page.frameStoppedLoading({frameId:8A6B…FE7B}) 2025-12-08 12:20:26.58311 ⤑ ⤑ [🆄 #694a GOTO] [🅲 #FE7B CDP] ⏵ Runtime.evaluate({expression:document.readyState,contextId:2,returnByValue:true}) 2025-12-08 12:20:26.58412 ⤑ ⤑ [🆄 #694a GOTO] ✓ GOTO completed in 0.35s 2025-12-08 12:20:26.58513 [🅰 #1d66] ▷ Agent.execute(Glance at the Example Domain page and confirm that you see the hero text.) 2025-12-08 12:20:26.59314 [🅰 #1d66] ⤑ [🧠 #21e1 LLM] gpt-4.1-mini ⏴ user: Glance at the Example Domain page and confirm that you see the hero text. 
+{10 tools} 2025-12-08 12:20:29.44715 [🅰 #1d66] ⤑ [🧠 #21e1 LLM] gpt-4.1-mini ↳ ꜛ688 ꜜ12 | tool call: ariaTree() 2025-12-08 12:20:29.44816 [🅰 #1d66] [🆂 #9ac4 EXTRACT] ▷ Stagehand.extract() 2025-12-08 12:20:29.45317 [🅰 #1d66] [🆂 #9ac4 EXTRACT] ⤑ [🅲 #FE7B CDP] ⏵ DOM.getDocument({depth:-1,pierce:true}) 2025-12-08 12:20:29.46018 [🅰 #1d66] [🆂 #9ac4 EXTRACT] ⤑ [🅲 #FE7B CDP] ⏵ Accessibility.getFullAXTree({frameId:8A6B…FE7B}) 2025-12-08 12:20:29.46419 [🅰 #1d66] [🆂 #9ac4 EXTRACT] ✓ EXTRACT completed in 0.02s 2025-12-08 12:20:29.46520 [🅰 #1d66] ⤑ [🧠 #03a1 LLM] gpt-4.1-mini ⏴ tool result: ariaTree(): Accessibility Tre…7] paragraph [0-18] link: Learn more +{10 tools} 2025-12-08 12:20:32.21321 [🅰 #1d66] ⤑ [🧠 #03a1 LLM] gpt-4.1-mini ↳ ꜛ806 ꜜ34 | tool call: close() 2025-12-08 12:20:32.21422 [🅰 #1d66] ✓ Agent.execute() DONE in 5.6s | 2 LLM calls ꜛ1494 ꜜ46 tokens | 6 CDP msgs 2025-12-08 12:20:32.21523 ⤑ ⤑ [🆄 #cb65 GOTO] ▷ Page.goto({args:[https://example.com/,{waitUntil:load}]}) 2025-12-08 12:20:32.21524 ⤑ ⤑ [🆄 #cb65 GOTO] [🅲 #FE7B CDP] ⏵ Page.navigate({url:https://example.com/}) 2025-12-08 12:20:32.25425 ⤑ ⤑ [🆄 #cb65 GOTO] [🅲 #FE7B CDP] ⏴ Page.frameStartedNavigating({frameId:8A6B…FE7B,u…rId:2130…4BDE,navigationType:differentDocument}) 2025-12-08 12:20:32.25426 ⤑ ⤑ [🆄 #cb65 GOTO] [🅲 #FE7B CDP] ⏴ Page.frameStartedLoading({frameId:8A6B…FE7B}) 2025-12-08 12:20:32.25727 ⤑ ⤑ [🆄 #cb65 GOTO] [🅲 #FE7B CDP] ⏵ Page.setLifecycleEventsEnabled({enabled:true}) 2025-12-08 12:20:32.25828 ⤑ ⤑ [🆄 #cb65 GOTO] [🅲 #FE7B CDP] ⏴ DOM.scrollableFlagUpdated({nodeId:1,isScrollable:false}) 2025-12-08 12:20:32.25929 ⤑ ⤑ [🆄 #cb65 GOTO] [🅲 #FE7B CDP] ⏴ Page.frameNavigated({frame:{id:8A6B…FE7B,loaderI…tIsolated,gatedAPIFeatures:[]},type:Navigation}) 2025-12-08 12:20:32.26030 ⤑ ⤑ [🆄 #cb65 GOTO] [🅲 #FE7B CDP] ⏴ Network.policyUpdated({}) 2025-12-08 12:20:32.26031 ⤑ ⤑ [🆄 #cb65 GOTO] [🅲 #FE7B CDP] ⏴ DOM.documentUpdated({}) 2025-12-08 12:20:32.26032 ⤑ ⤑ [🆄 #cb65 GOTO] [🅲 #FE7B CDP] ⏴ 
Runtime.consoleAPICalled({type:info,args:[{type:…ptId:5,url:",lineNumber:0,columnNumber:2837}]}}) 2025-12-08 12:20:32.26133 ⤑ ⤑ [🆄 #cb65 GOTO] [🅲 #FE7B CDP] ⏴ DOM.documentUpdated({}) 2025-12-08 12:20:32.26134 ⤑ ⤑ [🆄 #cb65 GOTO] [🅲 #FE7B CDP] ⏴ Page.domContentEventFired({timestamp:545869.998129}) 2025-12-08 12:20:32.26135 ⤑ ⤑ [🆄 #cb65 GOTO] [🅲 #FE7B CDP] ⏴ Page.loadEventFired({timestamp:545869.998762}) 2025-12-08 12:20:32.26136 ⤑ ⤑ [🆄 #cb65 GOTO] [🅲 #FE7B CDP] ⏴ Page.frameStoppedLoading({frameId:8A6B…FE7B}) 2025-12-08 12:20:32.26237 ⤑ ⤑ [🆄 #cb65 GOTO] [🅲 #FE7B CDP] ⏵ Runtime.evaluate({expression:document.readyState,contextId:3,returnByValue:true}) 2025-12-08 12:20:32.26338 ⤑ ⤑ [🆄 #cb65 GOTO] ✓ GOTO completed in 0.05s 2025-12-08 12:20:32.26339 [🅰 #c756] ▷ Agent.execute({instruction:Click on the More information... link on the page.,maxSteps:3}) 2025-12-08 12:20:32.26440 [🅰 #c756] ⤑ ⤑ [🅲 #FE7B CDP] ⏵ Page.addScriptToEvaluateOnNewDocument({source:(() => …ue });\n setTimeout(install, 100);\n }\n })();}) 2025-12-08 12:20:32.26441 [🅰 #c756] ⤑ ⤑ [🅲 #FE7B CDP] ⏴ Accessibility.loadComplete({root:{nodeId:23,ignored:f…ds:[24],backendDOMNodeId:23,frameId:8A6B…FE7B}}) 2025-12-08 12:20:32.26542 [🅰 #c756] ⤑ ⤑ [🅲 #FE7B CDP] ⏵ Runtime.evaluate({expression:({ w: window.innerWidth,…ntextId:3,awaitPromise:true,returnByValue:true}) 2025-12-08 12:20:32.26543 [🅰 #c756] ⤑ ⤑ [🅲 #FE7B CDP] ⏵ Runtime.evaluate({expression:(() => {\n const ID = __… 100);\n }\n })();,includeCommandLineAPI:false}) 2025-12-08 12:20:32.26744 [🅰 #c756] ⤑ [🧠 #2798 LLM] claude-sonnet-4-5-20250929 ⏴ Click on the More information... link on the page. 2025-12-08 12:20:36.15745 [🅰 #c756] ⤑ [🧠 #2798 LLM] claude-sonnet-4-5-20250929 ↳ ꜛ1875 ꜜ79 | Ill help you click on the More information... 
l tool_use:computer 2025-12-08 12:20:36.96146 [🅰 #c756] ⤑ [🆄 #f55d SCREENSHOT] ▷ Page.screenshot({args:[{fullPage:false}]}) 2025-12-08 12:20:36.96447 [🅰 #c756] ⤑ [🆄 #f55d SCREENSHOT] [🅲 #FE7B CDP] ⏵ Runtime.evaluate({expression:(() …ntextId:3,awaitPromise:true,returnByValue:true}) 2025-12-08 12:20:36.96648 [🅰 #c756] ⤑ [🆄 #f55d SCREENSHOT] [🅲 #FE7B CDP] ⏵ Page.captureScreenshot({format:png,fromSurface:true,captureBeyondViewport:false}) 2025-12-08 12:20:37.01149 [🅰 #c756] ⤑ [🆄 #f55d SCREENSHOT] [🅲 #FE7B CDP] ⏵ Runtime.evaluate({expression:(() …ntextId:3,awaitPromise:true,returnByValue:true}) 2025-12-08 12:20:37.01250 [🅰 #c756] ⤑ [🆄 #f55d SCREENSHOT] ✓ SCREENSHOT completed in 0.05s 2025-12-08 12:20:37.01251 [🅰 #c756] ⤑ [🆄 #cce8 SCREENSHOT] ▷ Page.screenshot({args:[{fullPage:false}]}) 2025-12-08 12:20:37.01352 [🅰 #c756] ⤑ [🆄 #cce8 SCREENSHOT] [🅲 #FE7B CDP] ⏵ Runtime.evaluate({expression:(() …ntextId:3,awaitPromise:true,returnByValue:true}) 2025-12-08 12:20:37.01453 [🅰 #c756] ⤑ [🆄 #cce8 SCREENSHOT] [🅲 #FE7B CDP] ⏵ Page.captureScreenshot({format:png,fromSurface:true,captureBeyondViewport:false}) 2025-12-08 12:20:37.04054 [🅰 #c756] ⤑ [🆄 #cce8 SCREENSHOT] [🅲 #FE7B CDP] ⏵ Runtime.evaluate({expression:(() …ntextId:3,awaitPromise:true,returnByValue:true}) 2025-12-08 12:20:37.04155 [🅰 #c756] ⤑ [🆄 #cce8 SCREENSHOT] ✓ SCREENSHOT completed in 0.03s 2025-12-08 12:20:37.04156 [🅰 #c756] ⤑ [🧠 #ce80 LLM] claude-sonnet-4-5-20250929 ⏴ Current URL: https://example.com/ +{15.8kb image} 2025-12-08 12:20:44.82757 [🅰 #c756] ⤑ [🧠 #ce80 LLM] claude-sonnet-4-5-20250929 ↳ ꜛ3192 ꜜ192 | I can see a pag…ith Example Domain as the head tool_use:computer 2025-12-08 12:20:45.12958 [🅰 #c756] ⤑ [🆄 #f8c3 V3CUA.SCROLL] ▷ v3CUA.scroll({target:(644, 400),args:[{type:sc…scroll_amount:3,pageUrl:https://example.com/}]}) 2025-12-08 12:20:45.12959 [🅰 #c756] ⤑ [🆄 #3fc9 SCROLL] [🅲 #FE7B CDP] ⏵ Runtime.evaluate({expression:typeof w…"undefined\"&&window.__v3Cursor.move(644, 400)}) 2025-12-08 12:20:45.12960 [🅰 #c756] 
⤑ [🆄 #3fc9 SCROLL] ▷ Page.scroll({args:[644,400,0,300]}) 2025-12-08 12:20:45.13061 [🅰 #c756] ⤑ [🆄 #3fc9 SCROLL] [🅲 #FE7B CDP] ⏵ Input.dispatchMouseEvent({type:mouseMoved,x:644,y:400,button:none}) 2025-12-08 12:20:45.13762 [🅰 #c756] ⤑ [🆄 #3fc9 SCROLL] [🅲 #FE7B CDP] ⏵ Input.dispatchMouseEvent({type:mouseW…el,x:644,y:400,button:none,deltaX:0,deltaY:300}) 2025-12-08 12:20:45.14663 [🅰 #c756] ⤑ [🆄 #3fc9 SCROLL] ✓ SCROLL completed in 0.02s 2025-12-08 12:20:45.64764 [🅰 #c756] ⤑ [🆄 #ccb0 SCREENSHOT] ▷ Page.screenshot({args:[{fullPage:false}]}) 2025-12-08 12:20:45.64965 [🅰 #c756] ⤑ [🆄 #ccb0 SCREENSHOT] [🅲 #FE7B CDP] ⏵ Runtime.evaluate({expression:(() …ntextId:3,awaitPromise:true,returnByValue:true}) 2025-12-08 12:20:45.65266 [🅰 #c756] ⤑ [🆄 #ccb0 SCREENSHOT] [🅲 #FE7B CDP] ⏵ Page.captureScreenshot({format:png,fromSurface:true,captureBeyondViewport:false}) 2025-12-08 12:20:45.68567 [🅰 #c756] ⤑ [🆄 #ccb0 SCREENSHOT] [🅲 #FE7B CDP] ⏵ Runtime.evaluate({expression:(() …ntextId:3,awaitPromise:true,returnByValue:true}) 2025-12-08 12:20:45.68668 [🅰 #c756] ⤑ [🆄 #ccb0 SCREENSHOT] ✓ SCREENSHOT completed in 0.04s 2025-12-08 12:20:45.68769 [🅰 #c756] ⤑ [🆄 #87f4 SCREENSHOT] ▷ Page.screenshot({args:[{fullPage:false}]}) 2025-12-08 12:20:45.68770 [🅰 #c756] ⤑ [🆄 #87f4 SCREENSHOT] [🅲 #FE7B CDP] ⏵ Runtime.evaluate({expression:(() …ntextId:3,awaitPromise:true,returnByValue:true}) 2025-12-08 12:20:45.68971 [🅰 #c756] ⤑ [🆄 #87f4 SCREENSHOT] [🅲 #FE7B CDP] ⏵ Page.captureScreenshot({format:png,fromSurface:true,captureBeyondViewport:false}) 2025-12-08 12:20:45.71372 [🅰 #c756] ⤑ [🆄 #87f4 SCREENSHOT] [🅲 #FE7B CDP] ⏵ Runtime.evaluate({expression:(() …ntextId:3,awaitPromise:true,returnByValue:true}) 2025-12-08 12:20:45.71473 [🅰 #c756] ⤑ [🆄 #87f4 SCREENSHOT] ✓ SCREENSHOT completed in 0.03s 2025-12-08 12:20:45.71474 [🅰 #c756] ⤑ [🧠 #ed51 LLM] claude-sonnet-4-5-20250929 ⏴ Current URL: https://example.com/ +{15.8kb image} ``` --------- Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com> 
Co-authored-by: Nick Sweeting <pirate@users.noreply.github.com> Co-authored-by: Miguel <36487034+miguelg719@users.noreply.github.com>

# why Stand up a Fastify Stagehand server we can reuse for thin-client SDKs across multiple languages. # what changed created a new Fastify server # test plan - Start the Fastify server (pnpm --filter server dev or your usual command). - Local browser smoke: MODEL_API_KEY=... ./scripts/test_local_browser.sh - Browserbase smoke: MODEL_API_KEY=... BROWSERBASE_API_KEY=... BROWSERBASE_PROJECT_ID=... ./scripts/test_remote_browser.sh  --- ## Summary by cubic Adds a new Fastify-based Stagehand API server exposing V3 browser automation over REST with streaming responses and session management. Supports both local Chrome and Browserbase, includes health/readiness endpoints, and ships an OpenAPI spec. - **New Features** - New packages/server with REST routes: start, navigate, observe, act, extract, agentExecute, end (streaming logs/results) - In-memory LRU session store with TTL, lazy V3 init, and cleanup on end - Local and Browserbase browsers; credentials passed via headers - Health (/healthz) and readiness (/readyz), metrics, and structured request logging - OpenAPI v3 spec and README - Removed v2 code and DB dependency; auth currently disabled - **Migration** - Run: pnpm --filter @browserbasehq/stagehand-server dev - Required header: x-model-api-key; for Browserbase also x-bb-api-key and x-bb-project-id Written for commit ed1089b. Summary will update automatically on new commits.
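The "in-memory LRU session store with TTL" mentioned in the summary could look roughly like the sketch below. This is an illustration only: `LruTtlStore`, its constructor parameters, and the injectable `now` clock are hypothetical names, not the code in packages/server. It relies on `Map` preserving insertion order, so the first key is always the least recently used.

```ts
// Hypothetical sketch; not the server's actual session store.
interface Entry<V> {
  value: V;
  expiresAt: number; // absolute expiry time in milliseconds
}

class LruTtlStore<V> {
  private readonly entries = new Map<string, Entry<V>>();
  private readonly maxSize: number;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(maxSize: number, ttlMs: number, now: () => number = Date.now) {
    this.maxSize = maxSize;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.now()) {
      // Expired entries are dropped lazily on access.
      this.entries.delete(key);
      return undefined;
    }
    // Re-insert so the key becomes the most recently used one.
    // Note: this sketch does not extend the TTL on access.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    if (this.entries.size > this.maxSize) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }

  delete(key: string): boolean {
    return this.entries.delete(key);
  }
}
```

A real store would also need to run cleanup (for example closing the browser session) when an entry is evicted or expires; that hook is omitted here.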

# Agent Abort Signal and Message Continuation ## Why Enable users to cancel long-running agent tasks and continue conversations across multiple `execute()` calls. Also ensures graceful shutdown when `stagehand.close()` is called by automatically aborting any running agent tasks. ## What Changed ### New Features (behind `experimental: true`) #### Abort Signal Support - Pass `signal` to `agent.execute()` to cancel execution mid-run - Works with `AbortController` and `AbortSignal.timeout()` - Throws `AgentAbortError` when aborted #### Message Continuation - `execute()` now returns `messages` in the result - Pass previous messages to continue a conversation across calls ### New Utilities | File | Purpose | |---------------------------------|-------------------------------------------------------------------------------------------| | `combineAbortSignals.ts` | Merges multiple signals (uses native `AbortSignal.any()` on Node 20+, fallback for older) | | `errorHandling.ts` | Consolidates abort detection logic—needed because `close()` may cause indirect errors (e.g., null context) that should still be treated as abort | | `validateExperimentalFeatures.ts` | Single place for all experimental/CUA feature validation | ### CUA Limitations Abort signal and message continuation are not supported with CUA mode (throws `StagehandInvalidArgumentError`). This matches existing streaming limitation. ### Tests Added - `agent-abort-signal.spec.ts` (7 tests) - `agent-message-continuation.spec.ts` (4 tests) - `agent-experimental-validation.spec.ts` (17 tests)  --- ## Summary by cubic Adds agent abort support and conversation continuation. You can cancel long runs, auto-abort on close, and carry messages across execute() calls. Feature is gated behind experimental: true and has clear CUA limitations. 
- **New Features** - Abort signal for execute() and stream() with AbortController and AbortSignal.timeout; throws AgentAbortError; stagehand.close() auto-aborts via an internal controller combined with any user signal. - Message continuation: execute() returns messages and accepts previous messages on the next call; tool calls and results are included. - **Refactors** - Centralized experimental/CUA validation via validateExperimentalFeatures: CUA disallows streaming, abort signal, and message continuation; experimental required for integrations, tools, streaming, callbacks, signal, and messages. - Public API updates: re-export ModelMessage; Agent types include messages and signal; AgentAbortError exported for consistent abort typing. Written for commit 5276e41. Summary will update automatically on new commits.  --------- Co-authored-by: Nick Sweeting <github@sweeting.me>
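The signal-merging utility described in the table above can be sketched as follows. This is a minimal illustration of the idea (prefer native `AbortSignal.any()` when it exists, otherwise forward the first abort through a fresh `AbortController`); the function body is an assumption and may differ from the actual `combineAbortSignals.ts`.

```ts
// Hypothetical sketch of merging several abort signals into one.
function combineAbortSignals(
  ...signals: (AbortSignal | undefined)[]
): AbortSignal | undefined {
  const active = signals.filter((s): s is AbortSignal => s !== undefined);
  if (active.length === 0) return undefined;
  if (active.length === 1) return active[0];

  // Use the native helper when the runtime provides it.
  const anyFn = (
    AbortSignal as unknown as { any?: (signals: AbortSignal[]) => AbortSignal }
  ).any;
  if (typeof anyFn === "function") return anyFn.call(AbortSignal, active);

  // Fallback: abort a new controller as soon as any input signal aborts.
  const controller = new AbortController();
  for (const signal of active) {
    if (signal.aborted) {
      controller.abort(signal.reason);
      break;
    }
    signal.addEventListener("abort", () => controller.abort(signal.reason), {
      once: true,
    });
  }
  return controller.signal;
}
```

With a helper like this, an internal controller that `stagehand.close()` aborts can be combined with a user-supplied `signal`, so either one cancels the run.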

# why The click count in CDP's [Input.dispatchMouseEvent](https://chromedevtools.github.io/devtools-protocol/tot/Input/#method-dispatchMouseEvent) does **not** issue multiple click events; it is mainly kept for tracking. Individual `mousePressed`/`mouseReleased` events must be sent for each click. # what changed Added a for loop over the `clickCount` number provided in both `locator.click()` and `page.click()`. Also made the `AnthropicCUAClient` double_click coordinate parsing more robust. # test plan - [x] tested on https://doubleclicktest.com/ - [x] added evals site and unit tests on `click-count.spec.ts`  --- ## Summary by cubic Fixes multiple-click behavior by dispatching individual mousePressed/mouseReleased events per click and normalizes Anthropic CUA doubleClick coordinates. Double-clicks and multi-clicks now work reliably via CDP and CUA. - **Bug Fixes** - locator.click and page.click now loop over clickCount, sending pressed/released pairs for each click. - AnthropicCUAClient parses doubleClick consistently and falls back to coordinate arrays when x/y are missing. - Added tests for single, double, and triple clicks for locator.click and page.click. Written for commit 26b784d. Summary will update automatically on new commits.
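To make the per-click loop concrete, here is a small sketch that builds the list of `Input.dispatchMouseEvent` parameter objects for a multi-click. `buildClickEvents` is a hypothetical helper, not the function in this PR, and the choice to send an incrementing `clickCount` on each pair (1, then 2, and so on) is an assumption borrowed from how other CDP clients handle double-clicks.

```ts
// Hypothetical helper; parameter names follow CDP's Input.dispatchMouseEvent.
type MouseButton = "left" | "right" | "middle";

interface MouseEventParams {
  type: "mousePressed" | "mouseReleased";
  x: number;
  y: number;
  button: MouseButton;
  clickCount: number;
}

function buildClickEvents(
  x: number,
  y: number,
  clickCount: number,
  button: MouseButton = "left",
): MouseEventParams[] {
  const events: MouseEventParams[] = [];
  for (let i = 1; i <= clickCount; i++) {
    // One pressed/released pair per click; a single event with
    // clickCount: 2 would not produce two clicks on its own.
    events.push({ type: "mousePressed", x, y, button, clickCount: i });
    events.push({ type: "mouseReleased", x, y, button, clickCount: i });
  }
  return events;
}
```

Each returned object would then be sent as one `Input.dispatchMouseEvent` call, in order.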

# why - Google CUA agent was crashing with `Cannot read properties of undefined (reading 'parts')` - This can happen when the model's response is blocked due to safety filters, rate limiting, or other API-level issues # what changed - Added a null check for `candidate.content` and `candidate.content.parts` in `GoogleCUAClient.processResponse()` - When content is missing, the agent now gracefully returns with the finishReason logged for debugging  --- ## Summary by cubic Fixed crash in the Google CUA agent when Gemini returns an empty or blocked response. We now guard against missing content, log the finishReason, and return a safe, completed response with no actions or function calls. Written for commit 5309757. Summary will update automatically on new commits.
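The guard described above amounts to optional chaining plus an early return. The sketch below shows the pattern only; the `Candidate` and `StepResult` shapes and the `readCandidateParts` name are hypothetical and much simpler than the real `GoogleCUAClient.processResponse()`.

```ts
// Hypothetical, simplified shapes for illustration.
interface Candidate {
  content?: { parts?: unknown[] };
  finishReason?: string;
}

interface StepResult {
  parts: unknown[];
  completed: boolean;
  finishReason: string | undefined;
}

function readCandidateParts(
  candidate: Candidate | undefined,
  log: (message: string) => void = console.warn,
): StepResult {
  const finishReason = candidate?.finishReason;
  const parts = candidate?.content?.parts;
  if (parts === undefined) {
    // Blocked or empty response: log why and finish without any actions.
    log(`Response had no content parts (finishReason: ${finishReason ?? "unknown"})`);
    return { parts: [], completed: true, finishReason };
  }
  return { parts, completed: false, finishReason };
}
```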

# why A CI test hit its timeout on 1 of 3 CI runs. It is unclear whether this will happen again, but increasing the timeout should help prevent it in the future. # what changed increased timeout from 10s to 20s  --- ## Summary by cubic Increased the test timeout from 10s to 20s in agent-abort-signal.spec to reduce CI flakiness and avoid false timeouts on slower runs. Written for commit 8d3c418. Summary will update automatically on new commits.

# why These dev dependencies don't belong here. Some are no longer used, some should go into their respective packages # what changed Moved dev dependencies to respective packages and removed unused ones # test plan  --- ## Summary by cubic Moved dev dependencies from the workspace root into the packages that use them and removed unused ones to cut install bloat. - **Dependencies** - Removed unused devDependencies from the root; moved required ones into packages/core and packages/evals. - Added missing dev deps to packages/core (@types/adm-zip, @types/node, @types/ws, adm-zip, chalk, esbuild) and packages/evals (braintrust, chalk, string-comparison). - Cleaned pnpm-lock.yaml (large reduction in entries). Written for commit c6f6221. Summary will update automatically on new commits.

# why - writing base64 screenshots to disk is unnecessary: screenshots do not get replayed, so there is no sense in writing them to disk # what changed - added a `pruneAgentResult()` fn which prunes the screenshot entry before it is written to disk # test plan - existing tests & evals should suffice for this one  --- ## Summary by cubic Stop writing base64 screenshots to the agent cache to reduce disk usage and keep cache entries lean. Screenshots aren’t replayed, so pruning them has no impact on behavior. - **Refactors** - Added pruneAgentResult to remove screenshot base64 blobs from actions before persisting. - Prunes only the cached copy; the live AgentResult returned to callers is unchanged. Written for commit 625f982. Summary will update automatically on new commits.
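A copy-then-strip function along these lines would satisfy the "prune only the cached copy" requirement. The shapes here are assumptions: the sketch supposes screenshot actions have `type: "screenshot"` and carry the blob in a `base64` field, which may not match the real `AgentResult`.

```ts
// Hypothetical shapes; the real AgentResult and action types may differ.
interface AgentAction {
  type: string;
  [key: string]: unknown;
}

interface AgentResultLike {
  actions: AgentAction[];
  [key: string]: unknown;
}

function pruneAgentResult<T extends AgentResultLike>(result: T): T {
  const actions = result.actions.map((action) => {
    if (action.type !== "screenshot") return action;
    // Copy first so the caller's live object is never mutated.
    const copy: AgentAction = { ...action };
    delete copy.base64;
    return copy;
  });
  return { ...result, actions };
}
```

The returned object is what would be serialized to the cache, while the original `result` is still handed back to the caller untouched.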

# why - `extract()` was missing from `stagehand.history()` - addresses #1357 # what changed - added a call to `addToHistory()` after `extract()` finishes # test plan  --- ## Summary by cubic Include extract() in stagehand.history() so extract actions and results are tracked with instruction, selector, timeout, and schema details. Fixes missing history entries for extract and addresses #1357. Written for commit 84f95db. Summary will update automatically on new commits.

# why - to enforce best practices in error handling by requiring the cause of a given error to be included in the error that gets thrown # what changed - added the `preserve-caught-error` lint rule - updated calls that were not preserving caught errors - also bumped eslint version & typescript eslint for compatibility with the new rule  --- ## Summary by cubic Adds an ESLint rule to require preserving the original error when rethrowing, updates code to pass error causes, and fixes related linting issues. This improves debugging by linking errors via the cause option. - **Dependencies** - ESLint → 10.0.2 - @eslint/js → 10.0.1 - typescript-eslint → 8.56.1 Written for commit b4cd503. Summary will update on new commits. <a href="https://cubic.dev/pr/browserbase/stagehand/pull/1745">Review in cubic</a>
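For readers unfamiliar with the pattern the rule enforces, the sketch below shows a rethrow that keeps the original error attached via the standard `cause` option of the `Error` constructor. `parseConfig` is a made-up example function, not code from this repository.

```ts
// Illustrative example; parseConfig is not part of Stagehand.
function parseConfig(raw: string): Record<string, unknown> {
  try {
    return JSON.parse(raw) as Record<string, unknown>;
  } catch (error) {
    // Passing the caught error as `cause` keeps its message and stack
    // reachable from the new error instead of discarding them.
    throw new Error("Failed to parse config", { cause: error });
  }
}
```

Without the `{ cause: error }` argument, the underlying `SyntaxError` would be lost, which is the situation the lint rule is meant to flag.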

# why Once API-side caching is deployed, the feature will be opt-out. Users should have a way to disable the cache from the client. # what changed Added a `serverCache` parameter to the constructor which adds the `x-bb-skip-cache` header to all outbound API requests. This parameter is also present on act, extract, and observe functions to allow for further configuration. # test plan  --- ## Summary by cubic Adds global and per-request control of server-side caching for act/extract/observe in Browserbase mode. Caching is on by default; you can opt out. HIT/MISS is logged, and results include cacheStatus (Linear STG-1182). - **New Features** - Added serverCache to V3Options and act/extract/observe options (default true). Sends browserbase-cache-bypass when false, reads browserbase-cache-status and SSE cacheHit, logs HIT/MISS, and sets cacheStatus on ActResult/ExtractResult/ObserveResult. Observe now returns ObserveResult (Action[] with optional cacheStatus). - Updated public types and docs. Expanded tests with server integration checks for bypass header behavior and cache-status semantics (including HIT on repeated identical requests), plus ActCache variable-key handling/replay. - **Bug Fixes** - Fixed cache miss logging when variables are missing during ActCache replay, and suppressed cache status logging when a request explicitly bypasses the cache. Written for commit b35d29e. Summary will update on new commits. <a href="https://cubic.dev/pr/browserbase/stagehand/pull/1581">Review in cubic</a>   --- > [!NOTE] > **Medium Risk** > Touches the Browserbase API request/response path (new headers, SSE parsing behavior, and result type shape), which could affect caching behavior or backwards compatibility for consumers relying on exact response objects. > > **Overview** > Adds **server-side caching support for Browserbase API mode** with a new `serverCache` flag on `V3Options` (instance default) and per-call overrides in `act()`, `extract()`, and `observe()`. 
> > `StagehandAPIClient` now sends `browserbase-cache-bypass: true` when caching is disabled, reads `browserbase-cache-status` (and SSE `cacheHit`) to log HIT/MISS, and attaches `cacheStatus?: "HIT" | "MISS"` onto `ActResult` and `ExtractResult`. > > Updates public types/tests, adds new unit tests for `ActCache` variable-key behavior, and refreshes docs + changeset to announce server-side caching and the opt-out mechanism. > > Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit b5ce917. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).
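The interaction between the instance-level default and the per-call override can be sketched as a small pure function. The header name `browserbase-cache-bypass` comes from the summaries above; the function name `cacheHeaders` and its signature are hypothetical and not part of `StagehandAPIClient`.

```ts
// Hypothetical helper showing how the two serverCache settings could combine.
function cacheHeaders(
  instanceDefault: boolean | undefined,
  perCall: boolean | undefined,
): Record<string, string> {
  // A per-call value wins over the instance default; caching is on
  // when neither is set.
  const cacheEnabled = perCall ?? instanceDefault ?? true;
  return cacheEnabled ? {} : { "browserbase-cache-bypass": "true" };
}
```

For example, an instance created with `serverCache: false` would send the bypass header on every request unless a specific call passes `serverCache: true`.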

@tkattkat

This PR was opened by the [Changesets release](https://github.com/changesets/action) GitHub action. When you're ready to do a release, you can merge this and the packages will be published to npm automatically. If you're not ready to do a release yet, that's fine, whenever you add more changesets to main, this PR will be updated. # Releases ## @browserbasehq/stagehand@3.1.0 ### Minor Changes - [#1681](#1681) [`e3db9aa`](e3db9aa) Thanks [@tkattkat](https://github.com/tkattkat)! - Add cookie management APIs: `context.addCookies()`, `context.clearCookies()`, & `context.cookies()` - [#1672](#1672) [`b65756e`](b65756e) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - add boolean keepAlive parameter to allow for configuring whether the browser should be closed when stagehand.close() is called. - [#1708](#1708) [`176d420`](176d420) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - add context.setExtraHTTPHeaders() - [#1611](#1611) [`8a3c066`](8a3c066) Thanks [@monadoid](https://github.com/monadoid)! - Using `mode` enum instead of old `cua` boolean in openapi spec ### Patch Changes - [#1683](#1683) [`7584f3e`](7584f3e) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - fix: include shadow DOM in .count() & .nth() & support xpath predicates - [#1644](#1644) [`1e1c9c1`](1e1c9c1) Thanks [@monadoid](https://github.com/monadoid)! - Fix unhandled CDP detaches by returning the original sendCDP promise - [#1729](#1729) [`6bef890`](6bef890) Thanks [@shrey150](https://github.com/shrey150)! - fix: support Claude 4.6 (Opus and Sonnet) in CUA mode by using the correct `computer_20251124` tool version and `computer-use-2025-11-24` beta header - [#1647](#1647) [`ffd4b33`](ffd4b33) Thanks [@tkattkat](https://github.com/tkattkat)! - Fix [Agent] - Address bug causing issues with continuing a conversation from past messages in dom mode - [#1614](#1614) [`677bff5`](677bff5) Thanks [@miguelg719](https://github.com/miguelg719)! 
- Enforce <number>-<number> regex validation on act/observe for elementId - [#1580](#1580) [`65ff464`](65ff464) Thanks [@tkattkat](https://github.com/tkattkat)! - Add unified variables support across act and agent with a single VariableValue type - [#1666](#1666) [`101bcf2`](101bcf2) Thanks [@Kylejeong2](https://github.com/Kylejeong2)! - add support for codex models - [#1728](#1728) [`0a94301`](0a94301) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - handle potential race condition on `.close()` when using the Stagehand API - [#1664](#1664) [`b27c04d`](b27c04d) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - fixes issue with context.addInitScript() where scripts were not being applied to out of process iframes (OOPIFs), and popup pages with same process iframes (SPIFs) - [#1632](#1632) [`afbd08b`](afbd08b) Thanks [@pirate](https://github.com/pirate)! - Remove automatic `.env` loading via `dotenv`. If your app relies on `.env` files, install `dotenv` and load it explicitly in your code: ```ts import dotenv from "dotenv"; dotenv.config({ path: ".env" }); ``` - [#1624](#1624) [`0e8d569`](0e8d569) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - fix issue where screenshot masks were not being applied to dialog elements - [#1596](#1596) [`ff0f979`](ff0f979) Thanks [@tkattkat](https://github.com/tkattkat)! - Update usage/metrics handling in agent - [#1631](#1631) [`2d89d2b`](2d89d2b) Thanks [@miguelg719](https://github.com/miguelg719)! - Add right and middle click support to act and observe - [#1697](#1697) [`aac9a19`](aac9a19) Thanks [@shrey150](https://github.com/shrey150)! - fix: support `<frame>` elements in XPath frame boundary detection so `act()` works on legacy `<frameset>` pages - [#1692](#1692) [`06de50f`](06de50f) Thanks [@shrey150](https://github.com/shrey150)! 
- fix: skip piercer injection for chrome-extension:// and other non-HTML targets - [#1613](#1613) [`aa4d981`](aa4d981) Thanks [@miguelg719](https://github.com/miguelg719)! - SupportedUnderstudyAction Enum validation for 'method' on act/observe inference - [#1652](#1652) [`18b1e3b`](18b1e3b) Thanks [@miguelg719](https://github.com/miguelg719)! - Add support for gemini 3 flash and pro in hybrid/cua agent - [#1706](#1706) [`957d82b`](957d82b) Thanks [@chrisreadsf](https://github.com/chrisreadsf)! - Add GLM to prompt-based JSON fallback for models without native structured output support - [#1633](#1633) [`22e371a`](22e371a) Thanks [@tkattkat](https://github.com/tkattkat)! - Add warning when incorrect models are used with agents hybrid mode - [#1673](#1673) [`d29b91f`](d29b91f) Thanks [@miguelg719](https://github.com/miguelg719)! - Add multi-region support for Stagehand API with region-specific endpoints - [#1695](#1695) [`7b4f817`](7b4f817) Thanks [@tkattkat](https://github.com/tkattkat)! - Fix: zod bug when pinning zod to v3 and using structured output in agent - [#1609](#1609) [`3f9ca4d`](3f9ca4d) Thanks [@miguelg719](https://github.com/miguelg719)! - Add SupportedUnderstudyActions to observe system prompt - [#1581](#1581) [`49ead1e`](49ead1e) Thanks [@sameelarif](https://github.com/sameelarif)! - **Server-side caching is now available.** When running `env: "BROWSERBASE"`, Stagehand automatically caches `act()`, `extract()`, and `observe()` results server-side — repeated calls with the same inputs return instantly without consuming LLM tokens. Caching is enabled by default and can be disabled via `serverCache: false` on the Stagehand instance or per individual call. Check out the [browserbase blog](https://www.browserbase.com/blog/stagehand-caching) for more details. - [#1642](#1642) [`3673369`](3673369) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! 
- fix issue where scripts added via context.addInitScripts() were not being injected into new pages that were opened via popups (eg, clicking a link that opens a new page) and/or calling context.newPage(url) - [#1735](#1735) [`c465e87`](c465e87) Thanks [@monadoid](https://github.com/monadoid)! - Supports request header authentication with connectToMCPServer - [#1705](#1705) [`ae533e4`](ae533e4) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - include error cause in UnderstudyCommandException - [#1636](#1636) [`ea33052`](ea33052) Thanks [@miguelg719](https://github.com/miguelg719)! - Include executionModel on the AgentConfigSchema - [#1679](#1679) [`5764ede`](5764ede) Thanks [@shrey150](https://github.com/shrey150)! - fix issue where locator.count() was not working with xpaths that have attribute predicates - [#1646](#1646) [`f09b184`](f09b184) Thanks [@miguelg719](https://github.com/miguelg719)! - Add user-agent to CDP connections - [#1637](#1637) [`a7d29de`](a7d29de) Thanks [@miguelg719](https://github.com/miguelg719)! - Improve error and warning message for legacy model format - [#1685](#1685) [`d334399`](d334399) Thanks [@tkattkat](https://github.com/tkattkat)! - Bump ai sdk & google provider version - [#1662](#1662) [`44416da`](44416da) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - fix issue where locator.fill() was not working on elements that require direct value setting - [#1612](#1612) [`bdd8b4e`](bdd8b4e) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - fix issue where screenshot mask was only being applied to the first element that the locator resolved to. masks now apply to all matching elements. ## @browserbasehq/stagehand-server@3.6.0 ### Minor Changes - [#1611](#1611) [`8a3c066`](8a3c066) Thanks [@monadoid](https://github.com/monadoid)! - Using `mode` enum instead of old `cua` boolean in openapi spec ### Patch Changes - [#1604](#1604) [`4753078`](4753078) Thanks [@miguelg719](https://github.com/miguelg719)! 
- Enable bedrock - [#1636](#1636) [`ea33052`](ea33052) Thanks [@miguelg719](https://github.com/miguelg719)! - Include executionModel on the AgentConfigSchema - [#1602](#1602) [`22a0502`](22a0502) Thanks [@miguelg719](https://github.com/miguelg719)! - Include vertex as a supported provider - Updated dependencies \[[`7584f3e`](7584f3e), [`1e1c9c1`](1e1c9c1), [`6bef890`](6bef890), [`ffd4b33`](ffd4b33), [`677bff5`](677bff5), [`65ff464`](65ff464), [`101bcf2`](101bcf2), [`0a94301`](0a94301), [`b27c04d`](b27c04d), [`afbd08b`](afbd08b), [`e3db9aa`](e3db9aa), [`0e8d569`](0e8d569), [`ff0f979`](ff0f979), [`2d89d2b`](2d89d2b), [`aac9a19`](aac9a19), [`06de50f`](06de50f), [`aa4d981`](aa4d981), [`18b1e3b`](18b1e3b), [`957d82b`](957d82b), [`b65756e`](b65756e), [`22e371a`](22e371a), [`d29b91f`](d29b91f), [`7b4f817`](7b4f817), [`176d420`](176d420), [`3f9ca4d`](3f9ca4d), [`8a3c066`](8a3c066), [`49ead1e`](49ead1e), [`3673369`](3673369), [`c465e87`](c465e87), [`ae533e4`](ae533e4), [`ea33052`](ea33052), [`5764ede`](5764ede), [`f09b184`](f09b184), [`a7d29de`](a7d29de), [`d334399`](d334399), [`44416da`](44416da), [`bdd8b4e`](bdd8b4e)]: - @browserbasehq/stagehand@3.1.0 ## @browserbasehq/stagehand-evals@1.1.8 ### Patch Changes - Updated dependencies \[[`7584f3e`](7584f3e), [`1e1c9c1`](1e1c9c1), [`6bef890`](6bef890), [`ffd4b33`](ffd4b33), [`677bff5`](677bff5), [`65ff464`](65ff464), [`101bcf2`](101bcf2), [`0a94301`](0a94301), [`b27c04d`](b27c04d), [`afbd08b`](afbd08b), [`e3db9aa`](e3db9aa), [`0e8d569`](0e8d569), [`ff0f979`](ff0f979), [`2d89d2b`](2d89d2b), [`aac9a19`](aac9a19), [`06de50f`](06de50f), [`aa4d981`](aa4d981), [`18b1e3b`](18b1e3b), [`957d82b`](957d82b), [`b65756e`](b65756e), [`22e371a`](22e371a), [`d29b91f`](d29b91f), [`7b4f817`](7b4f817), [`176d420`](176d420), [`3f9ca4d`](3f9ca4d), [`8a3c066`](8a3c066), [`49ead1e`](49ead1e), [`3673369`](3673369), [`c465e87`](c465e87), [`ae533e4`](ae533e4), [`ea33052`](ea33052), [`5764ede`](5764ede), [`f09b184`](f09b184), [`a7d29de`](a7d29de), 
[`d334399`](d334399), [`44416da`](44416da), [`bdd8b4e`](bdd8b4e)]: - @browserbasehq/stagehand@3.1.0 Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

# why - adds documentation for the following new functions: - `context.addCookies()` - `context.clearCookies()` - `context.cookies()` ### note: - hold off on merging until after release  --- ## Summary by cubic Add V3 docs for cookie management in context: cookies(), addCookies(), and clearCookies, with examples and type definitions to help users manage auth and state. Connects to Linear STG-1409; merge after the release that includes these APIs. Written for commit 4ad34e1. Summary will update on new commits. <a href="https://cubic.dev/pr/browserbase/stagehand/pull/1748">Review in cubic</a>

# why - adds documentation for the new `keepAlive` param ### note: - hold off on merging until after release  --- ## Summary by cubic Add keepAlive docs to Stagehand, detailing behavior in Browserbase and Local, close() semantics, defaults/overrides, and how to reconnect via browserbaseSessionID. Addresses Linear STG-1380. Written for commit 6eaa9cd. Summary will update on new commits. <a href="https://cubic.dev/pr/browserbase/stagehand/pull/1747">Review in cubic</a>

…k to opening about:blank (#1749) # why Stagehand API failures were caused on all routes `/act`, `/observe`, etc. when session init failed because a page was closed. This adds a fallback to open about:blank so at least the session remains usable instead of just crashing on init. # what changed # test plan  --- ## Summary by cubic Fixes session init when connecting to a browser with zero pages by auto-creating an about:blank tab. Keeps Stagehand routes working after the last page closes and surfaces lint failures via a dedicated cancellation job. Addresses STG-1450. - **Bug Fixes** - Replace waitForFirstTopLevelPage with ensureFirstTopLevelPage: check for an existing page, wait with timeout, then create about:blank if none. - Simplify errors to a TimeoutError with clearer context; improve logging and add an integration test for zero-page init. - **CI** - Run cancellation in a separate job only when lint fails, so failures are visible instead of marked skipped. Written for commit 45ed6dc. Summary will update on new commits. <a href="https://cubic.dev/pr/browserbase/stagehand/pull/1749">Review in cubic</a>
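The three-step fallback in the summary (existing page, then wait with timeout, then create `about:blank`) can be sketched against an abstract page source. The `PageSource` interface and its method names are hypothetical stand-ins for whatever the context actually exposes; only the control flow is meant to be illustrative.

```ts
// Hypothetical interface; the real implementation works on CDP targets.
interface PageSource<P> {
  existingPage(): P | undefined;
  // Resolves to undefined if no page appears within timeoutMs.
  waitForPage(timeoutMs: number): Promise<P | undefined>;
  createPage(url: string): Promise<P>;
}

async function ensureFirstTopLevelPage<P>(
  source: PageSource<P>,
  timeoutMs: number,
): Promise<P> {
  // 1. Use a page that is already open.
  const existing = source.existingPage();
  if (existing !== undefined) return existing;

  // 2. Give the browser a bounded amount of time to open one.
  const awaited = await source.waitForPage(timeoutMs);
  if (awaited !== undefined) return awaited;

  // 3. Fall back to a blank tab so the session stays usable.
  return source.createPage("about:blank");
}
```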

# why Allows java / kotlin publishing # what changed minimal stainless config change as per stainless docs # test plan none  --- ## Summary by cubic Configured Stainless to publish Java and Kotlin packages to Maven Central via Sonatype Portal. Moves STG-1307 forward by enabling publishing for these two languages. - **New Features** - Set maven.sonatype_platform: portal for Java and Kotlin in stainless.yml. Written for commit a2f1a10. Summary will update on new commits. <a href="https://cubic.dev/pr/browserbase/stagehand/pull/1757">Review in cubic</a>  Co-authored-by: ci-test <ci-test@example.com>

…1759) ## Summary - Adds `"bedrock"` to the `provider` enum in `ModelConfigObjectSchema` and `AgentConfigSchema` (Zod schemas in `packages/core/lib/v3/types/public/api.ts`) - Regenerates `packages/server/openapi.v3.yaml` via `pnpm gen:openapi` ## Context Bedrock was added to `AISDK_PROVIDERS` and `LLMProvider` in PRs #1604 and #1617, but the Zod schemas that feed the OpenAPI spec (and ultimately the Stainless-generated SDKs) were never updated. This means the Python/Go/etc. SDK type definitions don't include `"bedrock"` as a valid provider option. Companion PR in bb/core: browserbase/core#7668 ## Test plan - [x] `pnpm gen:openapi` produces updated spec with `bedrock` in all 4 provider enum locations - [ ] Stainless picks up the OpenAPI change and regenerates SDKs with `bedrock` in the `provider` literal type 🤖 Generated with [Claude Code](https://claude.com/claude-code)  --- ## Summary by cubic Add "bedrock" to provider enums in ModelConfigObjectSchema/AgentConfigSchema and regenerate the OpenAPI spec so SDKs accept it and stay in sync with AISDK_PROVIDERS/LLMProvider. Also update AgentType to include "bedrock" to fix a server build type mismatch, and add a patch changeset to publish and unblock Python/Go SDK type generation. Written for commit 88e3372. Summary will update on new commits. <a href="https://cubic.dev/pr/browserbase/stagehand/pull/1759">Review in cubic</a>

# why - this function was from legacy stagehand which only operated on one page - presently, it was only being used to produce a log which: - at best, misinformed users on whether the page had actually navigated, and, - at worst, resulted in a noisy error log - the error log would happen if `clickElement()` triggered page closure. this means that the frame.evaluate() to get the URL would attempt `.evaluate()` on a frame that no longer existed # what changed - removed `handlePossibleNavigation()` # test plan - existing tests are fine here  --- ## Summary by cubic Removed the legacy handlePossibleNavigation() that tried to detect navigation by URL and produced misleading logs. This also prevents errors when clicks close the page and evaluate runs on a non-existent frame, reducing log noise. Written for commit 1d2c3d6. Summary will update on new commits. <a href="https://cubic.dev/pr/browserbase/stagehand/pull/1761">Review in cubic</a>

# what changed - added documentation for the `context.setExtraHTTPHeaders()` function  --- ## Summary by cubic Add v3 docs for context.setExtraHTTPHeaders(), including API, context-wide behavior (applies to all pages, replaces not merges, clear via {}), examples, and error docs. Also updates the V3Context interface to include this method; addresses Linear STG-1414. Written for commit c6f64ee. Summary will update on new commits. <a href="https://cubic.dev/pr/browserbase/stagehand/pull/1762">Review in cubic</a>

Fixes CE-731 ## Summary - Remove Claude 3.5 Sonnet (`claude-3-5-sonnet-latest`, `-20241022`, `-20240620`) and Claude 3.7 Sonnet (`claude-3-7-sonnet-latest`, `-20250219`) from all supported model lists - These models are **retired** by Anthropic — API calls to them will fail - Replace with `claude-sonnet-4-20250514` across evals, CI, docs, and examples ## What changed (27 files) - **Core types**: Removed from `model.ts` type union, `agent.ts` CUA models list - **Provider mappings**: Removed from `LLMProvider.ts`, `AgentProvider.ts`, server `utils.ts`, server `model.ts` - **Evals/CI**: Updated `taskConfig.ts`, `initV3.ts`, `ci.yml`, `.env.example` to use `claude-sonnet-4-20250514` - **Tests**: Updated `model-deprecation.test.ts` and `llm-and-agents.test.ts` (513/513 pass) - **Docs**: Updated all v2 and v3 documentation references (11 `.mdx` files) - **Other**: Issue template, MCP example ## Context Per [Anthropic's model deprecations page](https://docs.anthropic.com/en/docs/resources/model-deprecations): | Model | Retired | |-------|---------| | `claude-3-5-sonnet-20240620` | Oct 28, 2025 | | `claude-3-5-sonnet-20241022` | Oct 28, 2025 | | `claude-3-7-sonnet-20250219` | Feb 19, 2026 | ## Test plan - [x] All 513 unit tests pass (`pnpm exec turbo run test:core`) - [x] `grep` confirms zero remaining references outside CHANGELOG.md (historical) - [ ] Verify CI passes 🤖 Generated with [Claude Code](https://claude.com/claude-code)

## Summary - expose `headers` on `GoogleVertexProviderSettings` in Stagehand public model types - add a public API type test proving model configs with headers are accepted for google/openai/anthropic - add a patch changeset ## Context Runtime already forwards provider options to `createVertex()`, but TypeScript rejected `headers` in model config. This aligns public types with runtime behavior. ## Validation - `pnpm -C packages/core run typecheck` - `pnpm -C packages/core run build:esm` - `pnpm -C packages/core run test:core -- packages/core/dist/esm/tests/unit/public-api/llm-and-agents.test.js` - `pnpm -C packages/core run test:core -- packages/core/dist/esm/tests/unit/llm-provider.test.js`  --- ## Summary by cubic Expose the headers field on GoogleVertexProviderSettings in the public model config types so custom provider headers (e.g., X-Goog-Priority) are accepted without TypeScript errors. Updated the public API type test to cover Vertex headers and align the model config check with the public API style, keeping types consistent with runtime behavior. Written for commit bf4907d. Summary will update on new commits. <a href="https://cubic.dev/pr/browserbase/stagehand/pull/1764">Review in cubic</a>

…l Cache sections (#1770)

## Summary

Restructures the caching best practices docs page into two clear sections:

### Changes

- **Removed** the disclaimer/note about server-side caching only working with `env: "BROWSERBASE"`; this is now naturally conveyed in the section description
- **Renamed** "Server-side Caching" → **"Browserbase Cache"** with a clear description of what it is (managed, server-side, automatic, zero-config)
- **Renamed** "Local Caching" → **"Local Cache"** with a clear description of what it is (file-based, works everywhere, portable)
- **Added** use-case bullets to the Local Cache section explaining when to reach for it (agent workflows, CI/CD, local dev, cross-machine sharing)
- **Preserved** all existing code snippets, configuration examples, and best practices

### What stays the same

- All code examples (disabling on constructor, disabling per call, inspecting cache status, act/agent caching, cache directory organization)
- The limitations section for Browserbase Cache
- The best practices accordion (descriptive dirs, clearing cache, committing for CI/CD)
- The blog link for deeper technical details

Only modifies `packages/docs/v3/best-practices/caching.mdx`.

Linear: https://linear.app/browserbase/issue/STG-1482

…ction time (#1719)

# why

Init script injection sometimes raced with Debugger.resume(), so frames occasionally loaded without init scripts running. This led to flaky init script tests, which were legitimately catching the issue.

- https://github.com/browserbase/stagehand/actions/runs/22233062982/job/64336364420?pr=1580

<img width="1613" height="987" alt="image" src="https://github.com/user-attachments/assets/e836cd65-ed3b-41c8-8f8e-152fd70f30f4" />

# what changed

- queues calls on page load to run before we resume
- catches OOPIFs, lazy frames, and click-triggered popups the same way Playwright does
- removes the flaky timeout/retry-based prior approach

https://deepwiki.com/search/how-does-playwright-guarantee_8cf2339b-c060-4cfc-bc62-f3baaf57b229?mode=deep

# test plan

---

## Summary by cubic

Fixes the init-script race by guaranteeing pre-resume setup and correcting popup attach order. Init scripts now run reliably in same- and cross-process popups, OOPIF iframes, and across reloads; race tests verify addScript is sent before resume per session.

- **Bug Fixes**
  - Enforce pre-resume ordering: per-session dispatch waiters ensure Page/Runtime enables, Target.setAutoAttach(waitForDebuggerOnStart), Network.enable/setExtraHTTPHeaders, and Page.addScriptToEvaluateOnNewDocument(runImmediately) are sent before Runtime.runIfWaitingForDebugger; resume only after dispatch; log ordering issues only for top-level pages.
  - Stabilize attach and evaluation: fix popup attach ordering; fan out Target.* events to root listeners; retry Runtime.evaluate once on stale context ids; pre-register the piercer script before resume and lazy-install if needed.
  - Harden lifecycle: convert detach errors to PageNotFoundError and propagate; treat Page.enable/lifecycle acks as best-effort; never drop top-level Page.create due to local timeouts.
  - Expand tests and deflake: add delayed-CDP-send popup/iframe race repro with real URLs; assert addScript precedes resume per session; cover in-process and cross-process popups, window.open, OOPIF iframes, and reload persistence; update detach expectations and timeouts.

Written for commit 6f464d3. Summary will update on new commits. <a href="https://cubic.dev/pr/browserbase/stagehand/pull/1719">Review in cubic</a>
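The pre-resume ordering described above can be sketched as a small queue that dispatches every setup command before the resume command. This is a simplified, synchronous illustration and not the PR's implementation: `PreResumeQueue`, `CdpCommand`, and `SendFn` are hypothetical names, and the real code works with asynchronous per-session dispatch. The CDP method names are the ones listed in the summary.

```typescript
// A CDP command: method name plus optional params.
type CdpCommand = { method: string; params?: Record<string, unknown> };

// Hypothetical transport: anything that can send a command for one session.
type SendFn = (command: CdpCommand) => void;

class PreResumeQueue {
  private pending: CdpCommand[] = [];
  private resumed = false;

  // Queue a setup command that must reach the target before it resumes.
  enqueue(command: CdpCommand): void {
    if (this.resumed) {
      throw new Error(`Cannot queue ${command.method} after resume`);
    }
    this.pending.push(command);
  }

  // Dispatch every queued command in order, then resume the paused target.
  flushAndResume(send: SendFn): void {
    for (const command of this.pending) {
      send(command);
    }
    this.pending = [];
    this.resumed = true;
    send({ method: "Runtime.runIfWaitingForDebugger" });
  }
}

// Usage: record the order in which methods would be sent for one session.
const sentMethods: string[] = [];
const preResume = new PreResumeQueue();
preResume.enqueue({ method: "Page.enable" });
preResume.enqueue({ method: "Runtime.enable" });
preResume.enqueue({
  method: "Target.setAutoAttach",
  params: { autoAttach: true, waitForDebuggerOnStart: true, flatten: true },
});
preResume.enqueue({
  method: "Page.addScriptToEvaluateOnNewDocument",
  params: { source: "window.__initRan = true;", runImmediately: true },
});
preResume.flushAndResume((command) => {
  sentMethods.push(command.method);
});
```

Because the resume command is only ever sent by `flushAndResume`, there is no window in which the target can start running before the init script has been registered, which is the property the race tests assert per session.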

# why

- to allow for setting HTTP headers at the page level

# what changed

- added new function `page.setExtraHTTPHeaders()`, which sets HTTP headers for the CDP session of the page and all of its child sessions (e.g., iframes)

---

## Summary by cubic

Adds page.setExtraHTTPHeaders() to set per-page HTTP headers on all requests from the page and its iframes. Applies to pipeline sessions immediately and replays on newly adopted child sessions. Addresses STG-1316.

Written for commit cf677c2. Summary will update on new commits. <a href="https://cubic.dev/pr/browserbase/stagehand/pull/1774">Review in cubic</a>
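A minimal sketch of the "apply now, replay on adoption" behavior described above, assuming a fake session that only records calls. `RecordingSession`, `PageHeaderState`, and `adoptChildSession` are hypothetical names for illustration; only `Network.setExtraHTTPHeaders` is a real CDP method, and the actual Stagehand code may be organized quite differently.

```typescript
type HeaderMap = Record<string, string>;

// Hypothetical minimal session: records the CDP calls it receives.
class RecordingSession {
  calls: { method: string; params: { headers: HeaderMap } }[] = [];

  send(method: string, params: { headers: HeaderMap }): void {
    this.calls.push({ method, params });
  }
}

class PageHeaderState {
  private sessions: RecordingSession[] = [];
  private headers: HeaderMap | null = null;

  constructor(mainSession: RecordingSession) {
    this.sessions.push(mainSession);
  }

  // Apply the headers to every session the page currently owns and
  // remember them so later child sessions can receive the same values.
  setExtraHTTPHeaders(headers: HeaderMap): void {
    this.headers = { ...headers };
    for (const session of this.sessions) {
      session.send("Network.setExtraHTTPHeaders", { headers: this.headers });
    }
  }

  // Called when a child session (for example an iframe) joins the page:
  // replay the stored headers so its requests match the parent's.
  adoptChildSession(child: RecordingSession): void {
    this.sessions.push(child);
    if (this.headers !== null) {
      child.send("Network.setExtraHTTPHeaders", { headers: this.headers });
    }
  }
}

// Usage: set headers first, then adopt an iframe session afterwards.
const mainPageSession = new RecordingSession();
const iframeSession = new RecordingSession();
const pageHeaderState = new PageHeaderState(mainPageSession);
pageHeaderState.setExtraHTTPHeaders({ "x-example-header": "1" });
pageHeaderState.adoptChildSession(iframeSession);
```

Storing the last header map on the page object is what lets a frame that attaches later pick up headers that were set before it existed.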

)

## Summary

- Adds support for CDP (Chrome DevTools Protocol) extra HTTP headers when connecting to browser sessions
- Passes `extraHTTPHeaders` from the Stagehand config through to the CDP connection layer
- Warns when `cdpHeaders` are provided without `cdpUrl`
- Includes integration test for the new functionality

Related: #1737

Co-authored-by: aditya-silna <aditya@silnahealth.com>

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: aditya-silna <aditya@silnahealth.com>

# why

After the build migration, `pnpm build:cli` was no longer linking or preserving overridden configs

# what changed

- Added bin field in `package.json` to enable npm linking
- Implemented smart config merging in the build script that updates tasks/benchmarks from source while preserving user-customized defaults
- Added auto-linking via npm link --force at the end of the build process with graceful fallback, for whenever users run `pnpm build:cli`
- Set `serverCache: false` in initV3 for consistent eval behavior on API

# test plan

---------

Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
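One way to read the config merging described above is sketched below. The config shape (`EvalsConfigSketch`) and the function name `mergeBuiltConfig` are assumptions for illustration, as is the rule that user values win key by key within `defaults`; the actual build script may merge differently.

```typescript
// Hypothetical config shape; the real config may have other fields.
interface EvalsConfigSketch {
  tasks: string[];
  benchmarks: string[];
  defaults: Record<string, string | number | boolean>;
}

// Take tasks and benchmarks from the source config, but keep any default
// values the user already customized in the existing built config.
function mergeBuiltConfig(
  source: EvalsConfigSketch,
  existing: EvalsConfigSketch | null,
): EvalsConfigSketch {
  if (existing === null) {
    return source; // first build: nothing to preserve
  }
  return {
    tasks: source.tasks,
    benchmarks: source.benchmarks,
    // Source defaults fill in new keys; user-customized values override them.
    defaults: { ...source.defaults, ...existing.defaults },
  };
}

// Usage: the user previously raised `concurrency`; the source added a task.
const configFromSource: EvalsConfigSketch = {
  tasks: ["task_a", "task_b"],
  benchmarks: ["bench_a"],
  defaults: { concurrency: 4, env: "LOCAL" },
};
const configOnDisk: EvalsConfigSketch = {
  tasks: ["task_a"],
  benchmarks: [],
  defaults: { concurrency: 8 },
};
const mergedBuildConfig = mergeBuiltConfig(configFromSource, configOnDisk);
```

In this sketch the rebuilt config picks up `task_b` and `bench_a` from source while keeping the user's `concurrency` value.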

## Summary

- Server integration tests, evals, and Stainless preview builds require repo secrets that GitHub doesn't expose to fork PRs
- These jobs were running and failing with missing env var errors on every fork PR
- Add the same fork guard (`head.repo.full_name == github.repository`) that e2e tests already use

### Jobs fixed:

- `server-integration-tests` in `ci.yml`
- `run-evals` in `ci.yml`
- `preview` in `stainless.yml`

## Test plan

- [ ] Verify existing (non-fork) PRs still run all CI jobs
- [ ] Verify fork PRs skip the guarded jobs gracefully

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---

## Summary by cubic

Skip CI jobs that require repo secrets on fork PRs to prevent missing env errors. These jobs now run only when the PR comes from this repo.

- **Bug Fixes**
  - Guarded server integration tests in ci.yml.
  - Guarded eval runs in ci.yml.
  - Guarded Stainless preview builds in stainless.yml.

Written for commit f71de8d. Summary will update on new commits. <a href="https://cubic.dev/pr/browserbase/stagehand/pull/1780">Review in cubic</a>

PSA potential hackers: don't get excited, we don't have any real secrets in CI worth stealing, and our CI does not autodeploy anything to prod. All important secrets and CD processes are kept in our closed-source repos.

# why

# what changed

# test plan

---

## Summary by cubic

Add a gating workflow that blocks CI until a maintainer approves running secrets on forked PRs. CI now triggers from that gate, resolves labels and path filters under workflow_run, removes same-repo guards so integration/e2e/evals run on approved forks, and checks out the PR commit consistently across jobs.

Written for commit c682847. Summary will update on new commits. <a href="https://cubic.dev/pr/browserbase/stagehand/pull/1782">Review in cubic</a>

…ed" (#1786)

Reverts #1782

---

## Summary by cubic

Reverts the approval-based CI for external contributors. CI now runs on pull_request and blocks secrets for forked PRs by skipping integration, E2E, and eval jobs.

- **Refactors**
  - Removed the “Ensure Contributor Is Trusted to Run CI” workflow.
  - Switched CI trigger to pull_request; removed workflow_run logic.
  - Read labels from github.event.pull_request; removed API calls.
  - Simplified checkouts; dropped explicit head_sha refs.
  - Updated concurrency group to use github.ref.
  - Ignored docs-only changes in CI.

Written for commit d6ace82. Summary will update on new commits. <a href="https://cubic.dev/pr/browserbase/stagehand/pull/1786">Review in cubic</a>

Reverts #1780  --- ## Summary by cubic Reverts the change that skipped CI on forked PRs. Integration tests, evals, and the Stainless preview now run for all PRs by removing the head-repo equality checks in ci.yml and stainless.yml. Written for commit 18480e8. Summary will update on new commits. <a href="https://cubic.dev/pr/browserbase/stagehand/pull/1787">Review in cubic</a>

# why

cdpHeaders is already plumbed through packages/server correctly; it was just missing from the spec.

- packages/core/lib/v3/types/public/api.ts:15 defines cdpHeaders on LocalBrowserLaunchOptionsSchema.
- packages/server/src/routes/v1/sessions/start.ts:192 forwards browser.launchOptions with a spread into localBrowserLaunchOptions, so cdpHeaders is preserved.
- packages/server/src/lib/InMemorySessionStore.ts:240 passes localBrowserLaunchOptions straight into new V3(...).
- packages/core/lib/v3/v3.ts:750 passes lbo.cdpHeaders into V3Context.create(...).
- packages/core/lib/v3/understudy/context.ts:167 finally uses it in CdpConnection.connect(wsUrl, { headers: opts?.cdpHeaders }).

# what changed

# test plan

---

## Summary by cubic

Added the missing `cdpHeaders` field to the v3 server OpenAPI spec so clients can send custom Chrome DevTools Protocol headers. This aligns the spec with server launch options and prevents client codegen/validation errors.

Written for commit 39ee737. Summary will update on new commits. <a href="https://cubic.dev/pr/browserbase/stagehand/pull/1797">Review in cubic</a>

…and server-v4 dirs (#1796)

# Follow-up Tasks

- [ ] Update stainless SDK custom code for all languages to pull new `stagehand-server-v3-darwin-x64` binary names (`-v3-` added)

---

## Summary by cubic

Split the Stagehand API into `packages/server-v3` and `packages/server-v4`, each with its own builds, tests, SEA binaries, and release workflows. Delivers STG-1536 and lets us keep v3 stable while iterating on v4; CI/test discovery and OpenAPI artifacts are versioned.

- **Refactors**
  - Renamed the original server to `packages/server-v3` (`@browserbasehq/stagehand-server-v3`); updated docs and runtime path helpers (now synced across core/docs/evals and both servers), ESLint globs/ignores, scripts/Turbo filters, tests, and Stainless to read `packages/server-v3/openapi.v3.yaml`; v3 SEA binaries use `stagehand-server-v3-*`.
  - Added `packages/server-v4` (`@browserbasehq/stagehand-server-v4`) with `/v4/**` routes, SSE streaming via `x-stream-response`, LRU/TTL in-memory session store, health/readiness, logging/metrics, `openapi.v4.yaml` + generator, SEA tooling, and v4 integration tests.
  - CI: path filters, test discovery, and artifacts cover both versions; added `stagehand-server-v4-release.yml` and `stagehand-server-v4-sea-build.yml`; renamed v3 workflows; artifacts include `packages/server-v3/**` and `packages/server-v4/**` dists and OAS.
- **Migration**
  - Replace `packages/server/**` refs with `packages/server-v3/**` or `packages/server-v4/**`.
  - Use new package filters and binary names: `@browserbasehq/stagehand-server-v3` / `@browserbasehq/stagehand-server-v4`; `stagehand-server-v3-*` / `stagehand-server-v4-*`.
  - Update OpenAPI consumers to `packages/server-v3/openapi.v3.yaml` or `packages/server-v4/openapi.v4.yaml`.

Written for commit 2b9114c. Summary will update on new commits. <a href="https://cubic.dev/pr/browserbase/stagehand/pull/1796">Review in cubic</a>

## Summary

- Adds the `@browserbasehq/browse-cli` package (`packages/cli`) to the stagehand monorepo, open-sourcing browser automation for AI agents
- CLI provides stateful browser control via a daemon architecture: navigation, clicking, typing, screenshots, accessibility snapshots, multi-tab, network capture, and env switching (local/remote)
- Uses `@browserbasehq/stagehand` as a workspace dependency (bundled into the CLI binary via tsup)
- Includes full test suite and documentation

## Changes

- `packages/cli/`: all CLI source code, config, tests, and docs
- `pnpm-workspace.yaml`: added `packages/cli` to workspace
- `.github/workflows/ci.yml`: added CLI path filters and build artifact uploads
- `.changeset/open-source-browse-cli.md`: changeset for initial release
- `pnpm-lock.yaml`: updated lockfile

## Test plan

- [x] CLI builds successfully (`pnpm --filter @browserbasehq/browse-cli run build`)
- [x] Full monorepo build passes (`turbo run build`, 9/9 tasks)
- [x] `browse --help` and `browse --version` output correctly
- [x] `browse status` returns valid JSON
- [x] Lint passes clean (`pnpm --filter @browserbasehq/browse-cli run lint`)
- [x] Source verified identical to stagent-cli (only import path changed)
- [x] Empirically tested Browserbase credential requirements match core
- [ ] Run `pnpm --filter @browserbasehq/browse-cli run test` (requires Chrome/browser environment)

## Known issues (pre-existing from stagent-cli, not introduced by this PR)

- Network capture `response.json` always writes `status: 0`; response metadata from the `responseReceived` CDP event is not persisted to the `loadingFinished` handler
- Ref-based `click` command silently ignores `--button`/`--count`/`--force` flags (coordinate-based `click_xy` handles them correctly)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…g CI (#1801) # why # what changed # test plan  --- ## Summary by cubic Corrects the changeset package reference from `@browserbasehq/stagehand-server` to `@browserbasehq/stagehand-server-v3` to unblock CI and ensure the correct package receives the patch release. Written for commit 177bc48. Summary will update on new commits. <a href="https://cubic.dev/pr/browserbase/stagehand/pull/1801">Review in cubic</a>

## Summary

- `browse env` showed stale "local" mode after `browse env remote`
- Root cause: `.mode` file was only written during lazy browser init (`ensureBrowserInitialized`), not at daemon startup. Between daemon start and first command, `readCurrentMode()` returned `null` and fell back to hardcoded `"local"`
- Write `.mode` eagerly in `runDaemon()` at startup so it's immediately available
- Fall back to `desiredMode` instead of `"local"` in the `env` display handler as a safety net

## Test plan

- [x] Reproduced bug: `browse env remote` → `browse env` showed `"mode":"local"`
- [x] Verified fix: `browse env remote` → `browse env` now shows `"mode":"remote"`
- [x] `mode.test.ts` passes (3/3)

---

## Summary by cubic

Fixes `browse env` showing stale "local" after `browse env remote` (STG-1547). The daemon now writes `.mode` at startup, the display falls back to `desiredMode` until mode is written, and a patch changeset is added for `@browserbasehq/browse-cli`.

Written for commit 9661d92. Summary will update on new commits. <a href="https://cubic.dev/pr/browserbase/stagehand/pull/1806">Review in cubic</a>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
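The two parts of the fix described above (eager write at startup, plus a `desiredMode` fallback when nothing has been written) can be sketched with an in-memory stand-in for the `.mode` file. `ModeFileSketch`, `startDaemonSketch`, and `displayedMode` are hypothetical names; the real CLI reads and writes a file on disk and its function signatures are not shown in this PR text.

```typescript
type BrowseMode = "local" | "remote";

// In-memory stand-in for the on-disk `.mode` file (hypothetical).
class ModeFileSketch {
  private value: BrowseMode | null = null;

  read(): BrowseMode | null {
    return this.value;
  }

  write(mode: BrowseMode): void {
    this.value = mode;
  }
}

// Mirrors the described fix: record the mode as soon as the daemon starts,
// instead of waiting for lazy browser initialization.
function startDaemonSketch(file: ModeFileSketch, desiredMode: BrowseMode): void {
  file.write(desiredMode);
}

// Safety net: if nothing has been written yet, report the desired mode
// rather than a hardcoded "local".
function displayedMode(file: ModeFileSketch, desiredMode: BrowseMode): BrowseMode {
  return file.read() ?? desiredMode;
}

// Usage: one file that was never written, one written at daemon startup.
const unwrittenModeFile = new ModeFileSketch();
const shownBeforeWrite = displayedMode(unwrittenModeFile, "remote");

const startedModeFile = new ModeFileSketch();
startDaemonSketch(startedModeFile, "remote");
const shownAfterStart = displayedMode(startedModeFile, "local");
```

Either change alone would hide the stale value in the common case; doing both covers the window before the write as well as any path where the write is skipped.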

## Summary

- Stacked on #1800
- Only `BROWSERBASE_API_KEY` is required for remote mode in the CLI
- `BROWSERBASE_PROJECT_ID` is still passed through if set, but no longer checked

## Changes

- `packages/cli/src/index.ts`: `hasBrowserbaseCredentials()` only checks for API key
- `packages/cli/tests/mode.test.ts`: Updated test to match new error message
- `packages/cli/README.md`: Updated docs to reflect optional project ID

## Test plan

- [x] Existing mode test updated
- [x] Manual: `browse env remote` with only `BROWSERBASE_API_KEY` set

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---

## Summary by cubic

Make `BROWSERBASE_PROJECT_ID` optional in the CLI for remote mode, so only `BROWSERBASE_API_KEY` is required. The project ID is still forwarded when provided.

- **Bug Fixes**
  - Updated remote mode check and error message to only require `BROWSERBASE_API_KEY`.
  - Autodetection now defaults to `remote` when the API key is set; otherwise `local`.
  - Updated tests and `@browserbasehq/browse-cli` README to match.

Written for commit 99eb186. Summary will update on new commits. <a href="https://cubic.dev/pr/browserbase/stagehand/pull/1803">Review in cubic</a>

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
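The credential check and autodetection described above might look roughly like the following. The function name `hasBrowserbaseCredentials` comes from the PR text, but its signature here (taking an explicit env map), the helpers `detectMode` and `remoteSessionParams`, and the error message are assumptions for illustration.

```typescript
type EnvVars = Record<string, string | undefined>;

// Only the API key is checked; the project ID is no longer required.
function hasBrowserbaseCredentials(env: EnvVars): boolean {
  return Boolean(env["BROWSERBASE_API_KEY"]);
}

// Autodetection: remote when the API key is set, otherwise local.
function detectMode(env: EnvVars): "local" | "remote" {
  return hasBrowserbaseCredentials(env) ? "remote" : "local";
}

// Hypothetical helper: build remote params, forwarding the project ID only if set.
function remoteSessionParams(env: EnvVars): { apiKey: string; projectId?: string } {
  const apiKey = env["BROWSERBASE_API_KEY"];
  if (!apiKey) {
    // Placeholder wording; the CLI's actual error message is not shown in the PR text.
    throw new Error("BROWSERBASE_API_KEY is required for remote mode");
  }
  const projectId = env["BROWSERBASE_PROJECT_ID"];
  return projectId ? { apiKey, projectId } : { apiKey };
}

// Usage with explicit env maps rather than the real process environment.
const modeWithKeyOnly = detectMode({ BROWSERBASE_API_KEY: "key-123" });
const modeWithoutKey = detectMode({ BROWSERBASE_PROJECT_ID: "proj-1" });
const paramsWithProject = remoteSessionParams({
  BROWSERBASE_API_KEY: "key-123",
  BROWSERBASE_PROJECT_ID: "proj-1",
});
```

Passing the environment in as a parameter keeps the sketch testable without touching real credentials.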

miguelg719 force-pushed the main branch from 4994eab to bd0a799 on October 29, 2025 16:15

tkattkat and others added 29 commits December 2, 2025 16:49

[fix]: dont attach to targets if already attached (#1346)

4e051b2

feat: enabling gpt 5.2 (#1403)

6255e4c

seanmcguire12 and others added 30 commits February 24, 2026 14:03

[STG-1458] server cache docs (#1753)

54ea8ba

[feat]: add configurable timeout to agent tools (#1766)

7817fcc

Make projectId optional for Browserbase sessions (#1800)

2abf5b9


Sync#2

metehanozdev wants to merge 717 commits into emregucerr:main from browserbase:main


15 participants
