[Feature Request] browser commands: cross-origin iframe support #1077

@small1491124-glitch

Summary

The opencli browser commands (state, click, type, eval) currently cannot interact with content inside iframes, particularly cross-origin ones. The state command already detects iframes (shown as |iframe|<iframe />) but cannot enumerate their contents or act on them.

Motivation

Many websites embed their core interactive UI inside cross-origin iframes. Common real-world examples include:

  • Embedded Google Forms on third-party sites: form fields live inside a docs.google.com iframe
  • CodePen / JSFiddle embeds: the preview and editor are in cross-origin iframes
  • OAuth login flows: embedded sign-in widgets such as the "Sign in with Google" button render in cross-origin iframes (full login pages usually open in a popup or redirect instead)
  • Payment gateways: Stripe Elements and PayPal buttons render inside isolated iframes
  • WYSIWYG editors: TinyMCE and CKEditor 4 can use an iframe for the editing surface (usually same-origin, but still a separate document)

Currently, opencli browser cannot automate any of these use cases.

In contrast, Playwright CLI (@playwright/cli) handles this through CDP (Chrome DevTools Protocol), which drives the browser from outside the page and so is not bound by the page's same-origin policy:

// Playwright CLI auto-generates this when interacting with iframe elements:
await page.locator('#my-iframe')
  .contentFrame()
  .getByRole('button', { name: 'Submit' })
  .click();

Current behavior

$ opencli browser state
# ... top-level elements [0]-[108] ...
|iframe|<iframe />
---
interactive: 108 | iframes: 1    # ← iframe detected but contents inaccessible

$ opencli browser eval "document.querySelector('iframe').contentDocument.querySelector('input')"
# ← fails silently (contentDocument is null for a cross-origin iframe, so .querySelector throws)
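
The failure mode above can be checked with a small probe. This is a minimal sketch only: `probeFrame` is a hypothetical helper, not part of opencli. It relies on the fact that page JavaScript reading `contentDocument` on a cross-origin iframe gets `null`.

```javascript
// Hypothetical helper, not part of opencli: report how much of an
// iframe the embedding page's JavaScript can actually reach.
function probeFrame(iframe) {
  let doc = null;
  try {
    // For a cross-origin frame the browser returns null here.
    doc = iframe.contentDocument;
  } catch (err) {
    // Defensive: treat any access error as "inaccessible".
    return { accessible: false, reason: String((err && err.name) || err) };
  }
  if (doc === null || doc === undefined) {
    return { accessible: false, reason: "contentDocument is null" };
  }
  // Same-origin frame: its DOM can be queried directly.
  return { accessible: true, inputs: doc.querySelectorAll("input").length };
}
```

Run from the top-level page, a probe like this can only tell that a cross-origin frame exists. Reading or driving its contents needs code that runs inside the frame or at the protocol level, which is what the options below propose.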

Proposed solution

Option A: all_frames content script injection (recommended)

Update the Browser Bridge extension's manifest.json:

"content_scripts": [{
  "matches": ["<all_urls>"],
  "all_frames": true,
  "js": ["content-script.js"]
}]
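
Injecting the content script into every frame is only half of the work, because the extension also needs a stable way to address each frame. The following is a rough sketch, assuming the background script reads the frame list from `chrome.webNavigation.getAllFrames` (which requires the `webNavigation` permission). `labelFrames` and the `F<n>` labelling are illustrative names taken from this proposal, not existing opencli code.

```javascript
// Hypothetical helper: map Chrome frame IDs to F<n> labels.
// `frames` has the shape returned by chrome.webNavigation.getAllFrames:
//   [{ frameId, parentFrameId, url }, ...]
// frameId 0 is always the top-level document.
function labelFrames(frames) {
  return frames
    .filter((f) => f.frameId !== 0)          // skip the top-level page
    .sort((a, b) => a.frameId - b.frameId)   // deterministic, but not DOM order
    .map((f, i) => ({ label: `F${i}`, frameId: f.frameId, url: f.url }));
}
```

A command aimed at a label could then be routed with `chrome.tabs.sendMessage(tabId, message, { frameId })`, which delivers the message only to the content script running in that frame.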

Then extend the commands to support frame-prefixed indices:

# state: expand iframe contents with frame prefix
$ opencli browser state
  |iframe| [F0]
    [F0:0]<input type=text placeholder=Name />
    [F0:1]<button>Submit</button>

# click/type/eval with frame index
$ opencli browser click F0:1
$ opencli browser type F0:0 "hello"
$ opencli browser eval "document.querySelector('input').value" --frame 0
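
The frame-prefixed indices could be parsed on the CLI side roughly as follows. This is a sketch under the assumption that plain numbers keep referring to the top-level document; `parseElementRef` is a hypothetical name.

```javascript
// Hypothetical parser for the proposed element references.
//   "12"   -> { frame: null, index: 12 }  (top-level document)
//   "F0:1" -> { frame: 0, index: 1 }      (element 1 inside frame F0)
function parseElementRef(ref) {
  const match = /^(?:F(\d+):)?(\d+)$/.exec(ref.trim());
  if (!match) {
    throw new Error(`Invalid element reference: ${ref}`);
  }
  return {
    frame: match[1] === undefined ? null : Number(match[1]),
    index: Number(match[2]),
  };
}
```

Keeping bare indices valid means existing `click 12` invocations would continue to work unchanged.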

Option B: Dedicated browser frame subcommand

$ opencli browser frame list              # list all iframes
$ opencli browser frame 0 state           # state of iframe 0
$ opencli browser frame 0 click 1         # click inside iframe 0
$ opencli browser frame 0 eval "..."      # eval JS inside iframe 0

Option C: chrome.debugger API

Use chrome.debugger in the extension for CDP-level frame access, the same protocol Playwright uses. This is more powerful, but the extension must declare the debugger permission, and Chrome shows a notification bar while the debugger is attached.

Why this matters for AI agents

  1. Coverage gap: Many sites use iframes for critical UI. Without iframe support, agents cannot automate these workflows.
  2. Competitive parity: Playwright CLI and Playwright MCP both handle iframes natively.
  3. Existing detection: opencli browser state already detects and counts iframes (iframes: 1), so the infrastructure is partially there.

Environment

  • OpenCLI: v1.7.4
  • OS: Windows 11
  • Browser Bridge Extension: v1.5.5
