diff --git a/.agents/skills/agent-browser/SKILL.md b/.agents/skills/agent-browser/SKILL.md new file mode 100644 index 0000000..00f97aa --- /dev/null +++ b/.agents/skills/agent-browser/SKILL.md @@ -0,0 +1,750 @@ +--- +name: agent-browser +description: Browser automation CLI for AI agents. Use when the user needs to interact with websites, including navigating pages, filling forms, clicking buttons, taking screenshots, extracting data, testing web apps, or automating any browser task. Triggers include requests to "open a website", "fill out a form", "click a button", "take a screenshot", "scrape data from a page", "test this web app", "login to a site", "automate browser actions", or any task requiring programmatic web interaction. +allowed-tools: Bash(npx agent-browser:*), Bash(agent-browser:*) +--- + +# Browser Automation with agent-browser + +The CLI uses Chrome/Chromium via CDP directly. Install via `npm i -g agent-browser`, `brew install agent-browser`, or `cargo install agent-browser`. Run `agent-browser install` to download Chrome. Run `agent-browser upgrade` to update to the latest version. + +## Core Workflow + +Every browser automation follows this pattern: + +1. **Navigate**: `agent-browser open ` +2. **Snapshot**: `agent-browser snapshot -i` (get element refs like `@e1`, `@e2`) +3. **Interact**: Use refs to click, fill, select +4. **Re-snapshot**: After navigation or DOM changes, get fresh refs + +```bash +agent-browser open https://example.com/form +agent-browser snapshot -i +# Output: @e1 [input type="email"], @e2 [input type="password"], @e3 [button] "Submit" + +agent-browser fill @e1 "user@example.com" +agent-browser fill @e2 "password123" +agent-browser click @e3 +agent-browser wait --load networkidle +agent-browser snapshot -i # Check result +``` + +## Command Chaining + +Commands can be chained with `&&` in a single shell invocation. 
The browser persists between commands via a background daemon, so chaining is safe and more efficient than separate calls. + +```bash +# Chain open + wait + snapshot in one call +agent-browser open https://example.com && agent-browser wait --load networkidle && agent-browser snapshot -i + +# Chain multiple interactions +agent-browser fill @e1 "user@example.com" && agent-browser fill @e2 "password123" && agent-browser click @e3 + +# Navigate and capture +agent-browser open https://example.com && agent-browser wait --load networkidle && agent-browser screenshot page.png +``` + +**When to chain:** Use `&&` when you don't need to read the output of an intermediate command before proceeding (e.g., open + wait + screenshot). Run commands separately when you need to parse the output first (e.g., snapshot to discover refs, then interact using those refs). + +## Handling Authentication + +When automating a site that requires login, choose the approach that fits: + +**Option 1: Import auth from the user's browser (fastest for one-off tasks)** + +```bash +# Connect to the user's running Chrome (they're already logged in) +agent-browser --auto-connect state save ./auth.json +# Use that auth state +agent-browser --state ./auth.json open https://app.example.com/dashboard +``` + +State files contain session tokens in plaintext -- add to `.gitignore` and delete when no longer needed. Set `AGENT_BROWSER_ENCRYPTION_KEY` for encryption at rest. + +**Option 2: Persistent profile (simplest for recurring tasks)** + +```bash +# First run: login manually or via automation +agent-browser --profile ~/.myapp open https://app.example.com/login +# ... fill credentials, submit ... + +# All future runs: already authenticated +agent-browser --profile ~/.myapp open https://app.example.com/dashboard +``` + +**Option 3: Session name (auto-save/restore cookies + localStorage)** + +```bash +agent-browser --session-name myapp open https://app.example.com/login +# ... login flow ... 
+agent-browser close # State auto-saved + +# Next time: state auto-restored +agent-browser --session-name myapp open https://app.example.com/dashboard +``` + +**Option 4: Auth vault (credentials stored encrypted, login by name)** + +```bash +echo "$PASSWORD" | agent-browser auth save myapp --url https://app.example.com/login --username user --password-stdin +agent-browser auth login myapp +``` + +`auth login` navigates with `load` and then waits for login form selectors to appear before filling/clicking, which is more reliable on delayed SPA login screens. + +**Option 5: State file (manual save/load)** + +```bash +# After logging in: +agent-browser state save ./auth.json +# In a future session: +agent-browser state load ./auth.json +agent-browser open https://app.example.com/dashboard +``` + +See [references/authentication.md](references/authentication.md) for OAuth, 2FA, cookie-based auth, and token refresh patterns. + +## Essential Commands + +```bash +# Navigation +agent-browser open # Navigate (aliases: goto, navigate) +agent-browser close # Close browser +agent-browser close --all # Close all active sessions + +# Snapshot +agent-browser snapshot -i # Interactive elements with refs (recommended) +agent-browser snapshot -s "#selector" # Scope to CSS selector + +# Interaction (use @refs from snapshot) +agent-browser click @e1 # Click element +agent-browser click @e1 --new-tab # Click and open in new tab +agent-browser fill @e2 "text" # Clear and type text +agent-browser type @e2 "text" # Type without clearing +agent-browser select @e1 "option" # Select dropdown option +agent-browser check @e1 # Check checkbox +agent-browser press Enter # Press key +agent-browser keyboard type "text" # Type at current focus (no selector) +agent-browser keyboard inserttext "text" # Insert without key events +agent-browser scroll down 500 # Scroll page +agent-browser scroll down 500 --selector "div.content" # Scroll within a specific container + +# Get information +agent-browser get 
text @e1 # Get element text +agent-browser get url # Get current URL +agent-browser get title # Get page title +agent-browser get cdp-url # Get CDP WebSocket URL + +# Wait +agent-browser wait @e1 # Wait for element +agent-browser wait --load networkidle # Wait for network idle +agent-browser wait --url "**/page" # Wait for URL pattern +agent-browser wait 2000 # Wait milliseconds +agent-browser wait --text "Welcome" # Wait for text to appear (substring match) +agent-browser wait --fn "!document.body.innerText.includes('Loading...')" # Wait for text to disappear +agent-browser wait "#spinner" --state hidden # Wait for element to disappear + +# Downloads +agent-browser download @e1 ./file.pdf # Click element to trigger download +agent-browser wait --download ./output.zip # Wait for any download to complete +agent-browser --download-path ./downloads open # Set default download directory + +# Network +agent-browser network requests # Inspect tracked requests +agent-browser network requests --type xhr,fetch # Filter by resource type +agent-browser network requests --method POST # Filter by HTTP method +agent-browser network requests --status 2xx # Filter by status (200, 2xx, 400-499) +agent-browser network request # View full request/response detail +agent-browser network route "**/api/*" --abort # Block matching requests +agent-browser network har start # Start HAR recording +agent-browser network har stop ./capture.har # Stop and save HAR file + +# Viewport & Device Emulation +agent-browser set viewport 1920 1080 # Set viewport size (default: 1280x720) +agent-browser set viewport 1920 1080 2 # 2x retina (same CSS size, higher res screenshots) +agent-browser set device "iPhone 14" # Emulate device (viewport + user agent) + +# Capture +agent-browser screenshot # Screenshot to temp dir +agent-browser screenshot --full # Full page screenshot +agent-browser screenshot --annotate # Annotated screenshot with numbered element labels +agent-browser screenshot --screenshot-dir 
./shots # Save to custom directory +agent-browser screenshot --screenshot-format jpeg --screenshot-quality 80 +agent-browser pdf output.pdf # Save as PDF + +# Live preview / streaming +agent-browser stream enable # Start runtime WebSocket streaming on an auto-selected port +agent-browser stream enable --port 9223 # Bind a specific localhost port +agent-browser stream status # Inspect enabled state, port, connection, and screencasting +agent-browser stream disable # Stop runtime streaming and remove the .stream metadata file + +# Clipboard +agent-browser clipboard read # Read text from clipboard +agent-browser clipboard write "Hello, World!" # Write text to clipboard +agent-browser clipboard copy # Copy current selection +agent-browser clipboard paste # Paste from clipboard + +# Dialogs (alert, confirm, prompt, beforeunload) +# By default, alert and beforeunload dialogs are auto-accepted so they never block the agent. +# confirm and prompt dialogs still require explicit handling. +# Use --no-auto-dialog to disable automatic handling. +agent-browser dialog accept # Accept dialog +agent-browser dialog accept "my input" # Accept prompt dialog with text +agent-browser dialog dismiss # Dismiss/cancel dialog +agent-browser dialog status # Check if a dialog is currently open + +# Diff (compare page states) +agent-browser diff snapshot # Compare current vs last snapshot +agent-browser diff snapshot --baseline before.txt # Compare current vs saved file +agent-browser diff screenshot --baseline before.png # Visual pixel diff +agent-browser diff url # Compare two pages +agent-browser diff url --wait-until networkidle # Custom wait strategy +agent-browser diff url --selector "#main" # Scope to element +``` + +## Streaming + +Every session automatically starts a WebSocket stream server on an OS-assigned port. Use `agent-browser stream status` to see the bound port and connection state. 
Use `stream disable` to tear it down, and `stream enable --port ` to re-enable on a specific port. + +## Batch Execution + +Execute multiple commands in a single invocation by piping a JSON array of string arrays to `batch`. This avoids per-command process startup overhead when running multi-step workflows. + +```bash +echo '[ + ["open", "https://example.com"], + ["snapshot", "-i"], + ["click", "@e1"], + ["screenshot", "result.png"] +]' | agent-browser batch --json + +# Stop on first error +agent-browser batch --bail < commands.json +``` + +Use `batch` when you have a known sequence of commands that don't depend on intermediate output. Use separate commands or `&&` chaining when you need to parse output between steps (e.g., snapshot to discover refs, then interact). + +## Common Patterns + +### Form Submission + +```bash +agent-browser open https://example.com/signup +agent-browser snapshot -i +agent-browser fill @e1 "Jane Doe" +agent-browser fill @e2 "jane@example.com" +agent-browser select @e3 "California" +agent-browser check @e4 +agent-browser click @e5 +agent-browser wait --load networkidle +``` + +### Authentication with Auth Vault (Recommended) + +```bash +# Save credentials once (encrypted with AGENT_BROWSER_ENCRYPTION_KEY) +# Recommended: pipe password via stdin to avoid shell history exposure +echo "pass" | agent-browser auth save github --url https://github.com/login --username user --password-stdin + +# Login using saved profile (LLM never sees password) +agent-browser auth login github + +# List/show/delete profiles +agent-browser auth list +agent-browser auth show github +agent-browser auth delete github +``` + +`auth login` waits for username/password/submit selectors before interacting, with a timeout tied to the default action timeout. 
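
Because the selector wait is tied to the default action timeout, a slow login page may need a longer timeout for that one call. The following is a minimal sketch, assuming `AGENT_BROWSER_DEFAULT_TIMEOUT` (described in this skill as a millisecond value) also governs the `auth login` wait. `AB_CMD`, `seconds_to_ms`, and `login_with_timeout` are hypothetical names used only for this illustration; they are not part of agent-browser.

```shell
# Hypothetical helpers for illustration only.
# AB_CMD (an assumption of this sketch) lets you swap in another binary or a stub.

# Convert whole seconds to the millisecond value the env var expects.
seconds_to_ms() {
  echo $(( $1 * 1000 ))
}

# login_with_timeout PROFILE SECONDS
# Runs `auth login PROFILE` with the default timeout raised for this one call.
# Assumption: AGENT_BROWSER_DEFAULT_TIMEOUT applies to the auth login selector wait.
login_with_timeout() {
  AGENT_BROWSER_DEFAULT_TIMEOUT="$(seconds_to_ms "$2")" \
    "${AB_CMD:-agent-browser}" auth login "$1"
}
```

For example, `login_with_timeout github 60` would attempt the saved `github` login with a 60-second timeout, leaving the timeout for later commands unchanged.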
+ +### Authentication with State Persistence + +```bash +# Login once and save state +agent-browser open https://app.example.com/login +agent-browser snapshot -i +agent-browser fill @e1 "$USERNAME" +agent-browser fill @e2 "$PASSWORD" +agent-browser click @e3 +agent-browser wait --url "**/dashboard" +agent-browser state save auth.json + +# Reuse in future sessions +agent-browser state load auth.json +agent-browser open https://app.example.com/dashboard +``` + +### Session Persistence + +```bash +# Auto-save/restore cookies and localStorage across browser restarts +agent-browser --session-name myapp open https://app.example.com/login +# ... login flow ... +agent-browser close # State auto-saved to ~/.agent-browser/sessions/ + +# Next time, state is auto-loaded +agent-browser --session-name myapp open https://app.example.com/dashboard + +# Encrypt state at rest +export AGENT_BROWSER_ENCRYPTION_KEY=$(openssl rand -hex 32) +agent-browser --session-name secure open https://app.example.com + +# Manage saved states +agent-browser state list +agent-browser state show myapp-default.json +agent-browser state clear myapp +agent-browser state clean --older-than 7 +``` + +### Working with Iframes + +Iframe content is automatically inlined in snapshots. Refs inside iframes carry frame context, so you can interact with them directly. 
+ +```bash +agent-browser open https://example.com/checkout +agent-browser snapshot -i +# @e1 [heading] "Checkout" +# @e2 [Iframe] "payment-frame" +# @e3 [input] "Card number" +# @e4 [input] "Expiry" +# @e5 [button] "Pay" + +# Interact directly — no frame switch needed +agent-browser fill @e3 "4111111111111111" +agent-browser fill @e4 "12/28" +agent-browser click @e5 + +# To scope a snapshot to one iframe: +agent-browser frame @e2 +agent-browser snapshot -i # Only iframe content +agent-browser frame main # Return to main frame +``` + +### Data Extraction + +```bash +agent-browser open https://example.com/products +agent-browser snapshot -i +agent-browser get text @e5 # Get specific element text +agent-browser get text body > page.txt # Get all page text + +# JSON output for parsing +agent-browser snapshot -i --json +agent-browser get text @e1 --json +``` + +### Parallel Sessions + +```bash +agent-browser --session site1 open https://site-a.com +agent-browser --session site2 open https://site-b.com + +agent-browser --session site1 snapshot -i +agent-browser --session site2 snapshot -i + +agent-browser session list +``` + +### Connect to Existing Chrome + +```bash +# Auto-discover running Chrome with remote debugging enabled +agent-browser --auto-connect open https://example.com +agent-browser --auto-connect snapshot + +# Or with explicit CDP port +agent-browser --cdp 9222 snapshot +``` + +Auto-connect discovers Chrome via `DevToolsActivePort`, common debugging ports (9222, 9229), and falls back to a direct WebSocket connection if HTTP-based CDP discovery fails. 
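
The same fallback idea can be made explicit in a script: try auto-discovery first, then retry against a known CDP port. This is a minimal sketch, assuming `agent-browser` exits non-zero when it cannot connect (this document does not state its exit codes). `AB_CMD` and `snapshot_existing_chrome` are hypothetical names.

```shell
# snapshot_existing_chrome [PORT]
# Try auto-discovery first; if that fails, retry against an explicit CDP port.
# Assumption: agent-browser exits non-zero when it cannot connect.
# AB_CMD is a hypothetical override for the binary name (useful for stubbing).
snapshot_existing_chrome() {
  port="${1:-9222}"
  "${AB_CMD:-agent-browser}" --auto-connect snapshot -i && return 0
  "${AB_CMD:-agent-browser}" --cdp "$port" snapshot -i
}
```

Calling `snapshot_existing_chrome` with no argument falls back to port 9222; pass another port (for example `9229`) if Chrome was started with a different `--remote-debugging-port`.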
+ +### Color Scheme (Dark Mode) + +```bash +# Persistent dark mode via flag (applies to all pages and new tabs) +agent-browser --color-scheme dark open https://example.com + +# Or via environment variable +AGENT_BROWSER_COLOR_SCHEME=dark agent-browser open https://example.com + +# Or set during session (persists for subsequent commands) +agent-browser set media dark +``` + +### Viewport & Responsive Testing + +```bash +# Set a custom viewport size (default is 1280x720) +agent-browser set viewport 1920 1080 +agent-browser screenshot desktop.png + +# Test mobile-width layout +agent-browser set viewport 375 812 +agent-browser screenshot mobile.png + +# Retina/HiDPI: same CSS layout at 2x pixel density +# Screenshots stay at logical viewport size, but content renders at higher DPI +agent-browser set viewport 1920 1080 2 +agent-browser screenshot retina.png + +# Device emulation (sets viewport + user agent in one step) +agent-browser set device "iPhone 14" +agent-browser screenshot device.png +``` + +The `scale` parameter (3rd argument) sets `window.devicePixelRatio` without changing CSS layout. Use it when testing retina rendering or capturing higher-resolution screenshots. + +### Visual Browser (Debugging) + +```bash +agent-browser --headed open https://example.com +agent-browser highlight @e1 # Highlight element +agent-browser inspect # Open Chrome DevTools for the active page +agent-browser record start demo.webm # Record session +agent-browser profiler start # Start Chrome DevTools profiling +agent-browser profiler stop trace.json # Stop and save profile (path optional) +``` + +Use `AGENT_BROWSER_HEADED=1` to enable headed mode via environment variable. Browser extensions work in both headed and headless mode. 
+ +### Local Files (PDFs, HTML) + +```bash +# Open local files with file:// URLs +agent-browser --allow-file-access open file:///path/to/document.pdf +agent-browser --allow-file-access open file:///path/to/page.html +agent-browser screenshot output.png +``` + +### iOS Simulator (Mobile Safari) + +```bash +# List available iOS simulators +agent-browser device list + +# Launch Safari on a specific device +agent-browser -p ios --device "iPhone 16 Pro" open https://example.com + +# Same workflow as desktop - snapshot, interact, re-snapshot +agent-browser -p ios snapshot -i +agent-browser -p ios tap @e1 # Tap (alias for click) +agent-browser -p ios fill @e2 "text" +agent-browser -p ios swipe up # Mobile-specific gesture + +# Take screenshot +agent-browser -p ios screenshot mobile.png + +# Close session (shuts down simulator) +agent-browser -p ios close +``` + +**Requirements:** macOS with Xcode, Appium (`npm install -g appium && appium driver install xcuitest`) + +**Real devices:** Works with physical iOS devices if pre-configured. Use `--device ""` where UDID is from `xcrun xctrace list devices`. + +## Security + +All security features are opt-in. By default, agent-browser imposes no restrictions on navigation, actions, or output. + +### Content Boundaries (Recommended for AI Agents) + +Enable `--content-boundaries` to wrap page-sourced output in markers that help LLMs distinguish tool output from untrusted page content: + +```bash +export AGENT_BROWSER_CONTENT_BOUNDARIES=1 +agent-browser snapshot +# Output: +# --- AGENT_BROWSER_PAGE_CONTENT nonce= origin=https://example.com --- +# [accessibility tree] +# --- END_AGENT_BROWSER_PAGE_CONTENT nonce= --- +``` + +### Domain Allowlist + +Restrict navigation to trusted domains. Wildcards like `*.example.com` also match the bare domain `example.com`. Sub-resource requests, WebSocket, and EventSource connections to non-allowed domains are also blocked. 
Include CDN domains your target pages depend on: + +```bash +export AGENT_BROWSER_ALLOWED_DOMAINS="example.com,*.example.com" +agent-browser open https://example.com # OK +agent-browser open https://malicious.com # Blocked +``` + +### Action Policy + +Use a policy file to gate destructive actions: + +```bash +export AGENT_BROWSER_ACTION_POLICY=./policy.json +``` + +Example `policy.json`: + +```json +{ "default": "deny", "allow": ["navigate", "snapshot", "click", "scroll", "wait", "get"] } +``` + +Auth vault operations (`auth login`, etc.) bypass action policy but domain allowlist still applies. + +### Output Limits + +Prevent context flooding from large pages: + +```bash +export AGENT_BROWSER_MAX_OUTPUT=50000 +``` + +## Diffing (Verifying Changes) + +Use `diff snapshot` after performing an action to verify it had the intended effect. This compares the current accessibility tree against the last snapshot taken in the session. + +```bash +# Typical workflow: snapshot -> action -> diff +agent-browser snapshot -i # Take baseline snapshot +agent-browser click @e2 # Perform action +agent-browser diff snapshot # See what changed (auto-compares to last snapshot) +``` + +For visual regression testing or monitoring: + +```bash +# Save a baseline screenshot, then compare later +agent-browser screenshot baseline.png +# ... time passes or changes are made ... +agent-browser diff screenshot --baseline baseline.png + +# Compare staging vs production +agent-browser diff url https://staging.example.com https://prod.example.com --screenshot +``` + +`diff snapshot` output uses `+` for additions and `-` for removals, similar to git diff. `diff screenshot` produces a diff image with changed pixels highlighted in red, plus a mismatch percentage. + +## Timeouts and Slow Pages + +The default timeout is 25 seconds. This can be overridden with the `AGENT_BROWSER_DEFAULT_TIMEOUT` environment variable (value in milliseconds). 
For slow websites or large pages, use explicit waits instead of relying on the default timeout: + +```bash +# Wait for network activity to settle (best for slow pages) +agent-browser wait --load networkidle + +# Wait for a specific element to appear +agent-browser wait "#content" +agent-browser wait @e1 + +# Wait for a specific URL pattern (useful after redirects) +agent-browser wait --url "**/dashboard" + +# Wait for a JavaScript condition +agent-browser wait --fn "document.readyState === 'complete'" + +# Wait a fixed duration (milliseconds) as a last resort +agent-browser wait 5000 +``` + +When dealing with consistently slow websites, use `wait --load networkidle` after `open` to ensure the page is fully loaded before taking a snapshot. If a specific element is slow to render, wait for it directly with `wait ` or `wait @ref`. + +## JavaScript Dialogs (alert / confirm / prompt) + +When a page opens a JavaScript dialog (`alert()`, `confirm()`, or `prompt()`), it blocks all other browser commands (snapshot, screenshot, click, etc.) until the dialog is dismissed. If commands start timing out unexpectedly, check for a pending dialog: + +```bash +# Check if a dialog is blocking +agent-browser dialog status + +# Accept the dialog (dismiss the alert / click OK) +agent-browser dialog accept + +# Accept a prompt dialog with input text +agent-browser dialog accept "my input" + +# Dismiss the dialog (click Cancel) +agent-browser dialog dismiss +``` + +When a dialog is pending, all command responses include a `warning` field indicating the dialog type and message. In `--json` mode this appears as a `"warning"` key in the response object. 
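
A script can use that key to decide when to run `dialog status`. The sketch below is deliberately naive: it assumes the key appears literally as `"warning":` in the `--json` output (the exact response shape is not shown in this document), and it matches text rather than parsing JSON. `has_dialog_warning` is a hypothetical helper name.

```shell
# has_dialog_warning: read a JSON response on stdin; succeed if it contains a "warning" key.
# Textual check only: it does not parse JSON, so a nested "warning" key would also match.
# Escaped quotes inside string values (\"warning\") do not match the pattern.
has_dialog_warning() {
  grep -q '"warning"[[:space:]]*:'
}
```

One possible use is `agent-browser snapshot -i --json | has_dialog_warning && agent-browser dialog status`. A real JSON parser would be more robust if the response format is known.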
+ +## Session Management and Cleanup + +When running multiple agents or automations concurrently, always use named sessions to avoid conflicts: + +```bash +# Each agent gets its own isolated session +agent-browser --session agent1 open site-a.com +agent-browser --session agent2 open site-b.com + +# Check active sessions +agent-browser session list +``` + +Always close your browser session when done to avoid leaked processes: + +```bash +agent-browser close # Close default session +agent-browser --session agent1 close # Close specific session +agent-browser close --all # Close all active sessions +``` + +If a previous session was not closed properly, the daemon may still be running. Use `agent-browser close` to clean it up, or `agent-browser close --all` to shut down every session at once. + +To auto-shutdown the daemon after a period of inactivity (useful for ephemeral/CI environments): + +```bash +AGENT_BROWSER_IDLE_TIMEOUT_MS=60000 agent-browser open example.com +``` + +## Ref Lifecycle (Important) + +Refs (`@e1`, `@e2`, etc.) are invalidated when the page changes. Always re-snapshot after: + +- Clicking links or buttons that navigate +- Form submissions +- Dynamic content loading (dropdowns, modals) + +```bash +agent-browser click @e5 # Navigates to new page +agent-browser snapshot -i # MUST re-snapshot +agent-browser click @e1 # Use new refs +``` + +## Annotated Screenshots (Vision Mode) + +Use `--annotate` to take a screenshot with numbered labels overlaid on interactive elements. Each label `[N]` maps to ref `@eN`. This also caches refs, so you can interact with elements immediately without a separate snapshot. 
+ +```bash +agent-browser screenshot --annotate +# Output includes the image path and a legend: +# [1] @e1 button "Submit" +# [2] @e2 link "Home" +# [3] @e3 textbox "Email" +agent-browser click @e2 # Click using ref from annotated screenshot +``` + +Use annotated screenshots when: + +- The page has unlabeled icon buttons or visual-only elements +- You need to verify visual layout or styling +- Canvas or chart elements are present (invisible to text snapshots) +- You need spatial reasoning about element positions + +## Semantic Locators (Alternative to Refs) + +When refs are unavailable or unreliable, use semantic locators: + +```bash +agent-browser find text "Sign In" click +agent-browser find label "Email" fill "user@test.com" +agent-browser find role button click --name "Submit" +agent-browser find placeholder "Search" type "query" +agent-browser find testid "submit-btn" click +``` + +## JavaScript Evaluation (eval) + +Use `eval` to run JavaScript in the browser context. **Shell quoting can corrupt complex expressions** -- use `--stdin` or `-b` to avoid issues. + +```bash +# Simple expressions work with regular quoting +agent-browser eval 'document.title' +agent-browser eval 'document.querySelectorAll("img").length' + +# Complex JS: use --stdin with heredoc (RECOMMENDED) +agent-browser eval --stdin <<'EVALEOF' +JSON.stringify( + Array.from(document.querySelectorAll("img")) + .filter(i => !i.alt) + .map(i => ({ src: i.src.split("/").pop(), width: i.width })) +) +EVALEOF + +# Alternative: base64 encoding (avoids all shell escaping issues) +agent-browser eval -b "$(echo -n 'Array.from(document.querySelectorAll("a")).map(a => a.href)' | base64)" +``` + +**Why this matters:** When the shell processes your command, inner double quotes, `!` characters (history expansion), backticks, and `$()` can all corrupt the JavaScript before it reaches agent-browser. The `--stdin` and `-b` flags bypass shell interpretation entirely. 
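
For generated scripts, the base64 route can be wrapped in a small helper. This is a sketch rather than a definitive implementation: `eval_b64` and `AB_CMD` are hypothetical names, and only the `eval -b` flag shown above is assumed. The `tr` step removes the line wrapping that some `base64` implementations add to long output.

```shell
# eval_b64: read JavaScript on stdin, base64-encode it, and pass it to `eval -b`.
# tr strips newlines that some base64 implementations insert into long output.
# AB_CMD is a hypothetical override for the binary name (useful for stubbing).
eval_b64() {
  encoded="$(base64 | tr -d '\n')"
  "${AB_CMD:-agent-browser}" eval -b "$encoded"
}
```

For example: `printf '%s' 'Array.from(document.querySelectorAll("a")).map(a => a.href)' | eval_b64`. The script only ever travels through a pipe, so the shell never re-interprets its quotes.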
+ +**Rules of thumb:** + +- Single-line, no nested quotes -> regular `eval 'expression'` with single quotes is fine +- Nested quotes, arrow functions, template literals, or multiline -> use `eval --stdin <<'EVALEOF'` +- Programmatic/generated scripts -> use `eval -b` with base64 + +## Configuration File + +Create `agent-browser.json` in the project root for persistent settings: + +```json +{ + "headed": true, + "proxy": "http://localhost:8080", + "profile": "./browser-data" +} +``` + +Priority (lowest to highest): `~/.agent-browser/config.json` < `./agent-browser.json` < env vars < CLI flags. Use `--config ` or `AGENT_BROWSER_CONFIG` env var for a custom config file (exits with error if missing/invalid). All CLI options map to camelCase keys (e.g., `--executable-path` -> `"executablePath"`). Boolean flags accept `true`/`false` values (e.g., `--headed false` overrides config). Extensions from user and project configs are merged, not replaced. + +## Deep-Dive Documentation + +| Reference | When to Use | +| -------------------------------------------------------------------- | --------------------------------------------------------- | +| [references/commands.md](references/commands.md) | Full command reference with all options | +| [references/snapshot-refs.md](references/snapshot-refs.md) | Ref lifecycle, invalidation rules, troubleshooting | +| [references/session-management.md](references/session-management.md) | Parallel sessions, state persistence, concurrent scraping | +| [references/authentication.md](references/authentication.md) | Login flows, OAuth, 2FA handling, state reuse | +| [references/video-recording.md](references/video-recording.md) | Recording workflows for debugging and documentation | +| [references/profiling.md](references/profiling.md) | Chrome DevTools profiling for performance analysis | +| [references/proxy-support.md](references/proxy-support.md) | Proxy configuration, geo-testing, rotating proxies | + +## Browser Engine Selection + +Use 
`--engine` to choose a local browser engine. The default is `chrome`. + +```bash +# Use Lightpanda (fast headless browser, requires separate install) +agent-browser --engine lightpanda open example.com + +# Via environment variable +export AGENT_BROWSER_ENGINE=lightpanda +agent-browser open example.com + +# With custom binary path +agent-browser --engine lightpanda --executable-path /path/to/lightpanda open example.com +``` + +Supported engines: +- `chrome` (default) -- Chrome/Chromium via CDP +- `lightpanda` -- Lightpanda headless browser via CDP (10x faster, 10x less memory than Chrome) + +Lightpanda does not support `--extension`, `--profile`, `--state`, or `--allow-file-access`. Install Lightpanda from https://lightpanda.io/docs/open-source/installation. + +## Observability Dashboard + +The dashboard is a standalone background server that shows live browser viewports, command activity, and console output for all sessions. + +```bash +# Install the dashboard once +agent-browser dashboard install + +# Start the dashboard server (background, port 4848) +agent-browser dashboard start + +# All sessions are automatically visible in the dashboard +agent-browser open example.com + +# Stop the dashboard +agent-browser dashboard stop +``` + +The dashboard runs independently of browser sessions on port 4848 (configurable with `--port`). All sessions automatically stream to the dashboard. 
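
Since the dashboard outlives individual sessions, a script that starts it should also stop it. A minimal sketch, assuming `agent-browser dashboard install` has already been run; `with_dashboard` and `AB_CMD` are hypothetical names for this illustration.

```shell
# with_dashboard COMMAND [ARGS...]
# Start the dashboard, run the given command, then stop the dashboard
# whether or not the command succeeded. Returns the command's exit status.
# AB_CMD is a hypothetical override for the binary name (useful for stubbing).
with_dashboard() {
  "${AB_CMD:-agent-browser}" dashboard start || return 1
  wd_status=0
  "$@" || wd_status=$?
  "${AB_CMD:-agent-browser}" dashboard stop
  return "$wd_status"
}
```

For example, `with_dashboard ./templates/capture-workflow.sh https://example.com ./output` would keep the dashboard up for the duration of that workflow and stop it afterwards.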
+ +## Ready-to-Use Templates + +| Template | Description | +| ------------------------------------------------------------------------ | ----------------------------------- | +| [templates/form-automation.sh](templates/form-automation.sh) | Form filling with validation | +| [templates/authenticated-session.sh](templates/authenticated-session.sh) | Login once, reuse state | +| [templates/capture-workflow.sh](templates/capture-workflow.sh) | Content extraction with screenshots | + +```bash +./templates/form-automation.sh https://example.com/form +./templates/authenticated-session.sh https://app.example.com/login +./templates/capture-workflow.sh https://example.com ./output +``` diff --git a/.agents/skills/agent-browser/references/authentication.md b/.agents/skills/agent-browser/references/authentication.md new file mode 100644 index 0000000..89f4788 --- /dev/null +++ b/.agents/skills/agent-browser/references/authentication.md @@ -0,0 +1,303 @@ +# Authentication Patterns + +Login flows, session persistence, OAuth, 2FA, and authenticated browsing. + +**Related**: [session-management.md](session-management.md) for state persistence details, [SKILL.md](../SKILL.md) for quick start. + +## Contents + +- [Import Auth from Your Browser](#import-auth-from-your-browser) +- [Persistent Profiles](#persistent-profiles) +- [Session Persistence](#session-persistence) +- [Basic Login Flow](#basic-login-flow) +- [Saving Authentication State](#saving-authentication-state) +- [Restoring Authentication](#restoring-authentication) +- [OAuth / SSO Flows](#oauth--sso-flows) +- [Two-Factor Authentication](#two-factor-authentication) +- [HTTP Basic Auth](#http-basic-auth) +- [Cookie-Based Auth](#cookie-based-auth) +- [Token Refresh Handling](#token-refresh-handling) +- [Security Best Practices](#security-best-practices) + +## Import Auth from Your Browser + +The fastest way to authenticate is to reuse cookies from a Chrome session you are already logged into. 
+ +**Step 1: Start Chrome with remote debugging** + +```bash +# macOS +"/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" --remote-debugging-port=9222 + +# Linux +google-chrome --remote-debugging-port=9222 + +# Windows +"C:\Program Files\Google\Chrome\Application\chrome.exe" --remote-debugging-port=9222 +``` + +Log in to your target site(s) in this Chrome window as you normally would. + +> **Security note:** `--remote-debugging-port` exposes full browser control on localhost. Any local process can connect and read cookies, execute JS, etc. Only use on trusted machines and close Chrome when done. + +**Step 2: Grab the auth state** + +```bash +# Auto-discover the running Chrome and save its cookies + localStorage +agent-browser --auto-connect state save ./my-auth.json +``` + +**Step 3: Reuse in automation** + +```bash +# Load auth at launch +agent-browser --state ./my-auth.json open https://app.example.com/dashboard + +# Or load into an existing session +agent-browser state load ./my-auth.json +agent-browser open https://app.example.com/dashboard +``` + +This works for any site, including those with complex OAuth flows, SSO, or 2FA -- as long as Chrome already has valid session cookies. + +> **Security note:** State files contain session tokens in plaintext. Add them to `.gitignore`, delete when no longer needed, and set `AGENT_BROWSER_ENCRYPTION_KEY` for encryption at rest. See [Security Best Practices](#security-best-practices). + +**Tip:** Combine with `--session-name` so the imported auth auto-persists across restarts: + +```bash +agent-browser --session-name myapp state load ./my-auth.json +# From now on, state is auto-saved/restored for "myapp" +``` + +## Persistent Profiles + +Use `--profile` to point agent-browser at a Chrome user data directory. 
This persists everything (cookies, IndexedDB, service workers, cache) across browser restarts without explicit save/load: + +```bash +# First run: login once +agent-browser --profile ~/.myapp-profile open https://app.example.com/login +# ... complete login flow ... + +# All subsequent runs: already authenticated +agent-browser --profile ~/.myapp-profile open https://app.example.com/dashboard +``` + +Use different paths for different projects or test users: + +```bash +agent-browser --profile ~/.profiles/admin open https://app.example.com +agent-browser --profile ~/.profiles/viewer open https://app.example.com +``` + +Or set via environment variable: + +```bash +export AGENT_BROWSER_PROFILE=~/.myapp-profile +agent-browser open https://app.example.com/dashboard +``` + +## Session Persistence + +Use `--session-name` to auto-save and restore cookies + localStorage by name, without managing files: + +```bash +# Auto-saves state on close, auto-restores on next launch +agent-browser --session-name twitter open https://twitter.com +# ... login flow ... 
+agent-browser close # state saved to ~/.agent-browser/sessions/ + +# Next time: state is automatically restored +agent-browser --session-name twitter open https://twitter.com +``` + +Encrypt state at rest: + +```bash +export AGENT_BROWSER_ENCRYPTION_KEY=$(openssl rand -hex 32) +agent-browser --session-name secure open https://app.example.com +``` + +## Basic Login Flow + +```bash +# Navigate to login page +agent-browser open https://app.example.com/login +agent-browser wait --load networkidle + +# Get form elements +agent-browser snapshot -i +# Output: @e1 [input type="email"], @e2 [input type="password"], @e3 [button] "Sign In" + +# Fill credentials +agent-browser fill @e1 "user@example.com" +agent-browser fill @e2 "password123" + +# Submit +agent-browser click @e3 +agent-browser wait --load networkidle + +# Verify login succeeded +agent-browser get url # Should be dashboard, not login +``` + +## Saving Authentication State + +After logging in, save state for reuse: + +```bash +# Login first (see above) +agent-browser open https://app.example.com/login +agent-browser snapshot -i +agent-browser fill @e1 "user@example.com" +agent-browser fill @e2 "password123" +agent-browser click @e3 +agent-browser wait --url "**/dashboard" + +# Save authenticated state +agent-browser state save ./auth-state.json +``` + +## Restoring Authentication + +Skip login by loading saved state: + +```bash +# Load saved auth state +agent-browser state load ./auth-state.json + +# Navigate directly to protected page +agent-browser open https://app.example.com/dashboard + +# Verify authenticated +agent-browser snapshot -i +``` + +## OAuth / SSO Flows + +For OAuth redirects: + +```bash +# Start OAuth flow +agent-browser open https://app.example.com/auth/google + +# Handle redirects automatically +agent-browser wait --url "**/accounts.google.com**" +agent-browser snapshot -i + +# Fill Google credentials +agent-browser fill @e1 "user@gmail.com" +agent-browser click @e2 # Next button 
+agent-browser wait 2000 +agent-browser snapshot -i +agent-browser fill @e3 "password" +agent-browser click @e4 # Sign in + +# Wait for redirect back +agent-browser wait --url "**/app.example.com**" +agent-browser state save ./oauth-state.json +``` + +## Two-Factor Authentication + +Handle 2FA with manual intervention: + +```bash +# Login with credentials +agent-browser open https://app.example.com/login --headed # Show browser +agent-browser snapshot -i +agent-browser fill @e1 "user@example.com" +agent-browser fill @e2 "password123" +agent-browser click @e3 + +# Wait for user to complete 2FA manually +echo "Complete 2FA in the browser window..." +agent-browser wait --url "**/dashboard" --timeout 120000 + +# Save state after 2FA +agent-browser state save ./2fa-state.json +``` + +## HTTP Basic Auth + +For sites using HTTP Basic Authentication: + +```bash +# Set credentials before navigation +agent-browser set credentials username password + +# Navigate to protected resource +agent-browser open https://protected.example.com/api +``` + +## Cookie-Based Auth + +Manually set authentication cookies: + +```bash +# Set auth cookie +agent-browser cookies set session_token "abc123xyz" + +# Navigate to protected page +agent-browser open https://app.example.com/dashboard +``` + +## Token Refresh Handling + +For sessions with expiring tokens: + +```bash +#!/bin/bash +# Wrapper that handles token refresh + +STATE_FILE="./auth-state.json" + +# Try loading existing state +if [[ -f "$STATE_FILE" ]]; then + agent-browser state load "$STATE_FILE" + agent-browser open https://app.example.com/dashboard + + # Check if session is still valid + URL=$(agent-browser get url) + if [[ "$URL" == *"/login"* ]]; then + echo "Session expired, re-authenticating..." 
+ # Perform fresh login + agent-browser snapshot -i + agent-browser fill @e1 "$USERNAME" + agent-browser fill @e2 "$PASSWORD" + agent-browser click @e3 + agent-browser wait --url "**/dashboard" + agent-browser state save "$STATE_FILE" + fi +else + # First-time login + agent-browser open https://app.example.com/login + # ... login flow ... +fi +``` + +## Security Best Practices + +1. **Never commit state files** - They contain session tokens + ```bash + echo "*auth-state.json" >> .gitignore + ``` + +2. **Use environment variables for credentials** + ```bash + agent-browser fill @e1 "$APP_USERNAME" + agent-browser fill @e2 "$APP_PASSWORD" + ``` + +3. **Clean up after automation** + ```bash + agent-browser cookies clear + rm -f ./auth-state.json + ``` + +4. **Use short-lived sessions for CI/CD** + ```bash + # Don't persist state in CI + agent-browser open https://app.example.com/login + # ... login and perform actions ... + agent-browser close # Session ends, nothing persisted + ``` diff --git a/.agents/skills/agent-browser/references/commands.md b/.agents/skills/agent-browser/references/commands.md new file mode 100644 index 0000000..8fbfe36 --- /dev/null +++ b/.agents/skills/agent-browser/references/commands.md @@ -0,0 +1,295 @@ +# Command Reference + +Complete reference for all agent-browser commands. For quick start and common patterns, see SKILL.md.
+ +## Navigation + +```bash +agent-browser open # Navigate to URL (aliases: goto, navigate) + # Supports: https://, http://, file://, about:, data:// + # Auto-prepends https:// if no protocol given +agent-browser back # Go back +agent-browser forward # Go forward +agent-browser reload # Reload page +agent-browser close # Close browser (aliases: quit, exit) +agent-browser connect 9222 # Connect to browser via CDP port +``` + +## Snapshot (page analysis) + +```bash +agent-browser snapshot # Full accessibility tree +agent-browser snapshot -i # Interactive elements only (recommended) +agent-browser snapshot -c # Compact output +agent-browser snapshot -d 3 # Limit depth to 3 +agent-browser snapshot -s "#main" # Scope to CSS selector +``` + +## Interactions (use @refs from snapshot) + +```bash +agent-browser click @e1 # Click +agent-browser click @e1 --new-tab # Click and open in new tab +agent-browser dblclick @e1 # Double-click +agent-browser focus @e1 # Focus element +agent-browser fill @e2 "text" # Clear and type +agent-browser type @e2 "text" # Type without clearing +agent-browser press Enter # Press key (alias: key) +agent-browser press Control+a # Key combination +agent-browser keydown Shift # Hold key down +agent-browser keyup Shift # Release key +agent-browser hover @e1 # Hover +agent-browser check @e1 # Check checkbox +agent-browser uncheck @e1 # Uncheck checkbox +agent-browser select @e1 "value" # Select dropdown option +agent-browser select @e1 "a" "b" # Select multiple options +agent-browser scroll down 500 # Scroll page (default: down 300px) +agent-browser scrollintoview @e1 # Scroll element into view (alias: scrollinto) +agent-browser drag @e1 @e2 # Drag and drop +agent-browser upload @e1 file.pdf # Upload files +``` + +## Get Information + +```bash +agent-browser get text @e1 # Get element text +agent-browser get html @e1 # Get innerHTML +agent-browser get value @e1 # Get input value +agent-browser get attr @e1 href # Get attribute +agent-browser get 
title # Get page title +agent-browser get url # Get current URL +agent-browser get cdp-url # Get CDP WebSocket URL +agent-browser get count ".item" # Count matching elements +agent-browser get box @e1 # Get bounding box +agent-browser get styles @e1 # Get computed styles (font, color, bg, etc.) +``` + +## Check State + +```bash +agent-browser is visible @e1 # Check if visible +agent-browser is enabled @e1 # Check if enabled +agent-browser is checked @e1 # Check if checked +``` + +## Screenshots and PDF + +```bash +agent-browser screenshot # Save to temporary directory +agent-browser screenshot path.png # Save to specific path +agent-browser screenshot --full # Full page +agent-browser pdf output.pdf # Save as PDF +``` + +## Video Recording + +```bash +agent-browser record start ./demo.webm # Start recording +agent-browser click @e1 # Perform actions +agent-browser record stop # Stop and save video +agent-browser record restart ./take2.webm # Stop current + start new +``` + +## Wait + +```bash +agent-browser wait @e1 # Wait for element +agent-browser wait 2000 # Wait milliseconds +agent-browser wait --text "Success" # Wait for text (or -t) +agent-browser wait --url "**/dashboard" # Wait for URL pattern (or -u) +agent-browser wait --load networkidle # Wait for network idle (or -l) +agent-browser wait --fn "window.ready" # Wait for JS condition (or -f) +``` + +## Mouse Control + +```bash +agent-browser mouse move 100 200 # Move mouse +agent-browser mouse down left # Press button +agent-browser mouse up left # Release button +agent-browser mouse wheel 100 # Scroll wheel +``` + +## Semantic Locators (alternative to refs) + +```bash +agent-browser find role button click --name "Submit" +agent-browser find text "Sign In" click +agent-browser find text "Sign In" click --exact # Exact match only +agent-browser find label "Email" fill "user@test.com" +agent-browser find placeholder "Search" type "query" +agent-browser find alt "Logo" click +agent-browser find title "Close" 
click +agent-browser find testid "submit-btn" click +agent-browser find first ".item" click +agent-browser find last ".item" click +agent-browser find nth 2 "a" hover +``` + +## Browser Settings + +```bash +agent-browser set viewport 1920 1080 # Set viewport size +agent-browser set viewport 1920 1080 2 # 2x retina (same CSS size, higher res screenshots) +agent-browser set device "iPhone 14" # Emulate device +agent-browser set geo 37.7749 -122.4194 # Set geolocation (alias: geolocation) +agent-browser set offline on # Toggle offline mode +agent-browser set headers '{"X-Key":"v"}' # Extra HTTP headers +agent-browser set credentials user pass # HTTP basic auth (alias: auth) +agent-browser set media dark # Emulate color scheme +agent-browser set media light reduced-motion # Light mode + reduced motion +``` + +## Cookies and Storage + +```bash +agent-browser cookies # Get all cookies +agent-browser cookies set name value # Set cookie +agent-browser cookies clear # Clear cookies +agent-browser storage local # Get all localStorage +agent-browser storage local key # Get specific key +agent-browser storage local set k v # Set value +agent-browser storage local clear # Clear all +``` + +## Network + +```bash +agent-browser network route # Intercept requests +agent-browser network route --abort # Block requests +agent-browser network route --body '{}' # Mock response +agent-browser network unroute [url] # Remove routes +agent-browser network requests # View tracked requests +agent-browser network requests --filter api # Filter requests +``` + +## Tabs and Windows + +```bash +agent-browser tab # List tabs +agent-browser tab new [url] # New tab +agent-browser tab 2 # Switch to tab by index +agent-browser tab close # Close current tab +agent-browser tab close 2 # Close tab by index +agent-browser window new # New window +``` + +## Frames + +```bash +agent-browser frame "#iframe" # Switch to iframe by CSS selector +agent-browser frame @e3 # Switch to iframe by element ref 
+agent-browser frame main # Back to main frame +``` + +### Iframe support + +Iframes are detected automatically during snapshots. When the main-frame snapshot runs, `Iframe` nodes are resolved and their content is inlined beneath the iframe element in the output (one level of nesting; iframes within iframes are not expanded). + +```bash +agent-browser snapshot -i +# @e3 [Iframe] "payment-frame" +# @e4 [input] "Card number" +# @e5 [button] "Pay" + +# Interact directly — refs inside iframes already work +agent-browser fill @e4 "4111111111111111" +agent-browser click @e5 + +# Or switch frame context for scoped snapshots +agent-browser frame @e3 # Switch using element ref +agent-browser snapshot -i # Snapshot scoped to that iframe +agent-browser frame main # Return to main frame +``` + +The `frame` command accepts: +- **Element refs** — `frame @e3` resolves the ref to an iframe element +- **CSS selectors** — `frame "#payment-iframe"` finds the iframe by selector +- **Frame name/URL** — matches against the browser's frame tree + +## Dialogs + +By default, `alert` and `beforeunload` dialogs are automatically accepted so they never block the agent. `confirm` and `prompt` dialogs still require explicit handling. Use `--no-auto-dialog` to disable this behavior. + +```bash +agent-browser dialog accept [text] # Accept dialog +agent-browser dialog dismiss # Dismiss dialog +agent-browser dialog status # Check if a dialog is currently open +``` + +## JavaScript + +```bash +agent-browser eval "document.title" # Simple expressions only +agent-browser eval -b "" # Any JavaScript (base64 encoded) +agent-browser eval --stdin # Read script from stdin +``` + +Use `-b`/`--base64` or `--stdin` for reliable execution. Shell escaping with nested quotes and special characters is error-prone. 
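One way to produce the base64 argument is sketched here, assuming a standard `base64` utility is on the PATH. The helper name `encode_js` is illustrative and not part of agent-browser.

```shell
#!/bin/bash
# encode_js is a hypothetical helper, not an agent-browser command.
# It base64-encodes a script and strips the newlines that GNU base64
# inserts every 76 characters, so the result is a single shell argument.
encode_js() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

ENCODED=$(encode_js 'document.title')
echo "$ENCODED"

# Pass the result to eval (left commented so the sketch runs without a browser):
# agent-browser eval -b "$ENCODED"
```

Using `printf '%s'` rather than `echo` keeps a trailing newline out of the encoded script.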
+ +```bash +# Base64 encode your script, then: +agent-browser eval -b "ZG9jdW1lbnQucXVlcnlTZWxlY3RvcignW3NyYyo9Il9uZXh0Il0nKQ==" + +# Or use stdin with heredoc for multiline scripts: +cat <<'EOF' | agent-browser eval --stdin +const links = document.querySelectorAll('a'); +Array.from(links).map(a => a.href); +EOF +``` + +## State Management + +```bash +agent-browser state save auth.json # Save cookies, storage, auth state +agent-browser state load auth.json # Restore saved state +``` + +## Global Options + +```bash +agent-browser --session ... # Isolated browser session +agent-browser --json ... # JSON output for parsing +agent-browser --headed ... # Show browser window (not headless) +agent-browser --full ... # Full page screenshot (-f) +agent-browser --cdp ... # Connect via Chrome DevTools Protocol +agent-browser -p ... # Cloud browser provider (--provider) +agent-browser --proxy ... # Use proxy server +agent-browser --proxy-bypass # Hosts to bypass proxy +agent-browser --headers ... # HTTP headers scoped to URL's origin +agent-browser --executable-path

# Custom browser executable +agent-browser --extension ... # Load browser extension (repeatable) +agent-browser --ignore-https-errors # Ignore SSL certificate errors +agent-browser --help # Show help (-h) +agent-browser --version # Show version (-V) +agent-browser --help # Show detailed help for a command +``` + +## Debugging + +```bash +agent-browser --headed open example.com # Show browser window +agent-browser --cdp 9222 snapshot # Connect via CDP port +agent-browser connect 9222 # Alternative: connect command +agent-browser console # View console messages +agent-browser console --clear # Clear console +agent-browser errors # View page errors +agent-browser errors --clear # Clear errors +agent-browser highlight @e1 # Highlight element +agent-browser inspect # Open Chrome DevTools for this session +agent-browser trace start # Start recording trace +agent-browser trace stop trace.zip # Stop and save trace +agent-browser profiler start # Start Chrome DevTools profiling +agent-browser profiler stop trace.json # Stop and save profile +``` + +## Environment Variables + +```bash +AGENT_BROWSER_SESSION="mysession" # Default session name +AGENT_BROWSER_EXECUTABLE_PATH="/path/chrome" # Custom browser path +AGENT_BROWSER_EXTENSIONS="/ext1,/ext2" # Comma-separated extension paths +AGENT_BROWSER_PROVIDER="browserbase" # Cloud browser provider +AGENT_BROWSER_STREAM_PORT="9223" # Override WebSocket streaming port (default: OS-assigned) +AGENT_BROWSER_HOME="/path/to/agent-browser" # Custom install location +``` diff --git a/.agents/skills/agent-browser/references/profiling.md b/.agents/skills/agent-browser/references/profiling.md new file mode 100644 index 0000000..bd47eaa --- /dev/null +++ b/.agents/skills/agent-browser/references/profiling.md @@ -0,0 +1,120 @@ +# Profiling + +Capture Chrome DevTools performance profiles during browser automation for performance analysis. 
+ +**Related**: [commands.md](commands.md) for full command reference, [SKILL.md](../SKILL.md) for quick start. + +## Contents + +- [Basic Profiling](#basic-profiling) +- [Profiler Commands](#profiler-commands) +- [Categories](#categories) +- [Use Cases](#use-cases) +- [Output Format](#output-format) +- [Viewing Profiles](#viewing-profiles) +- [Limitations](#limitations) + +## Basic Profiling + +```bash +# Start profiling +agent-browser profiler start + +# Perform actions +agent-browser navigate https://example.com +agent-browser click "#button" +agent-browser wait 1000 + +# Stop and save +agent-browser profiler stop ./trace.json +``` + +## Profiler Commands + +```bash +# Start profiling with default categories +agent-browser profiler start + +# Start with custom trace categories +agent-browser profiler start --categories "devtools.timeline,v8.execute,blink.user_timing" + +# Stop profiling and save to file +agent-browser profiler stop ./trace.json +``` + +## Categories + +The `--categories` flag accepts a comma-separated list of Chrome trace categories. Default categories include: + +- `devtools.timeline` -- standard DevTools performance traces +- `v8.execute` -- time spent running JavaScript +- `blink` -- renderer events +- `blink.user_timing` -- `performance.mark()` / `performance.measure()` calls +- `latencyInfo` -- input-to-latency tracking +- `renderer.scheduler` -- task scheduling and execution +- `toplevel` -- broad-spectrum basic events + +Several `disabled-by-default-*` categories are also included for detailed timeline, call stack, and V8 CPU profiling data. 
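When the category list is assembled by a script, it may be easier to keep the names as separate words and join them with commas. The sketch below assumes bash. `join_categories` is an illustrative helper name, and the categories are taken from the list above.

```shell
#!/bin/bash
# join_categories is a hypothetical helper, not part of agent-browser.
# It joins its arguments with commas, the format --categories accepts.
join_categories() {
  local IFS=,
  echo "$*"
}

CATEGORIES=$(join_categories devtools.timeline v8.execute blink.user_timing)
echo "$CATEGORIES"

# Pass the result to the profiler (left commented so the sketch runs without a browser):
# agent-browser profiler start --categories "$CATEGORIES"
```

Setting `IFS` with `local` keeps the separator change inside the function, so the rest of the script is unaffected.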
+ +## Use Cases + +### Diagnosing Slow Page Loads + +```bash +agent-browser profiler start +agent-browser navigate https://app.example.com +agent-browser wait --load networkidle +agent-browser profiler stop ./page-load-profile.json +``` + +### Profiling User Interactions + +```bash +agent-browser navigate https://app.example.com +agent-browser profiler start +agent-browser click "#submit" +agent-browser wait 2000 +agent-browser profiler stop ./interaction-profile.json +``` + +### CI Performance Regression Checks + +```bash +#!/bin/bash +agent-browser profiler start +agent-browser navigate https://app.example.com +agent-browser wait --load networkidle +agent-browser profiler stop "./profiles/build-${BUILD_ID}.json" +``` + +## Output Format + +The output is a JSON file in Chrome Trace Event format: + +```json +{ + "traceEvents": [ + { "cat": "devtools.timeline", "name": "RunTask", "ph": "X", "ts": 12345, "dur": 100, ... }, + ... + ], + "metadata": { + "clock-domain": "LINUX_CLOCK_MONOTONIC" + } +} +``` + +The `metadata.clock-domain` field is set based on the host platform (Linux or macOS). On Windows it is omitted. + +## Viewing Profiles + +Load the output JSON file in any of these tools: + +- **Chrome DevTools**: Performance panel > Load profile (Ctrl+Shift+I > Performance) +- **Perfetto UI**: https://ui.perfetto.dev/ -- drag and drop the JSON file +- **Trace Viewer**: `chrome://tracing` in any Chromium browser + +## Limitations + +- Only works with Chromium-based browsers (Chrome, Edge). Not supported on Firefox or WebKit. +- Trace data accumulates in memory while profiling is active (capped at 5 million events). Stop profiling promptly after the area of interest. +- Data collection on stop has a 30-second timeout. If the browser is unresponsive, the stop command may fail. 
diff --git a/.agents/skills/agent-browser/references/proxy-support.md b/.agents/skills/agent-browser/references/proxy-support.md new file mode 100644 index 0000000..e86a8fe --- /dev/null +++ b/.agents/skills/agent-browser/references/proxy-support.md @@ -0,0 +1,194 @@ +# Proxy Support + +Proxy configuration for geo-testing, rate limiting avoidance, and corporate environments. + +**Related**: [commands.md](commands.md) for global options, [SKILL.md](../SKILL.md) for quick start. + +## Contents + +- [Basic Proxy Configuration](#basic-proxy-configuration) +- [Authenticated Proxy](#authenticated-proxy) +- [SOCKS Proxy](#socks-proxy) +- [Proxy Bypass](#proxy-bypass) +- [Common Use Cases](#common-use-cases) +- [Verifying Proxy Connection](#verifying-proxy-connection) +- [Troubleshooting](#troubleshooting) +- [Best Practices](#best-practices) + +## Basic Proxy Configuration + +Use the `--proxy` flag or set proxy via environment variable: + +```bash +# Via CLI flag +agent-browser --proxy "http://proxy.example.com:8080" open https://example.com + +# Via environment variable +export HTTP_PROXY="http://proxy.example.com:8080" +agent-browser open https://example.com + +# HTTPS proxy +export HTTPS_PROXY="https://proxy.example.com:8080" +agent-browser open https://example.com + +# Both +export HTTP_PROXY="http://proxy.example.com:8080" +export HTTPS_PROXY="http://proxy.example.com:8080" +agent-browser open https://example.com +``` + +## Authenticated Proxy + +For proxies requiring authentication: + +```bash +# Include credentials in URL +export HTTP_PROXY="http://username:password@proxy.example.com:8080" +agent-browser open https://example.com +``` + +## SOCKS Proxy + +```bash +# SOCKS5 proxy +export ALL_PROXY="socks5://proxy.example.com:1080" +agent-browser open https://example.com + +# SOCKS5 with auth +export ALL_PROXY="socks5://user:pass@proxy.example.com:1080" +agent-browser open https://example.com +``` + +## Proxy Bypass + +Skip proxy for specific domains using 
`--proxy-bypass` or `NO_PROXY`: + +```bash +# Via CLI flag +agent-browser --proxy "http://proxy.example.com:8080" --proxy-bypass "localhost,*.internal.com" open https://example.com + +# Via environment variable +export NO_PROXY="localhost,127.0.0.1,.internal.company.com" +agent-browser open https://internal.company.com # Direct connection +agent-browser open https://external.com # Via proxy +``` + +## Common Use Cases + +### Geo-Location Testing + +```bash +#!/bin/bash +# Test site from different regions using geo-located proxies + +PROXIES=( + "http://us-proxy.example.com:8080" + "http://eu-proxy.example.com:8080" + "http://asia-proxy.example.com:8080" +) + +for proxy in "${PROXIES[@]}"; do + export HTTP_PROXY="$proxy" + export HTTPS_PROXY="$proxy" + + region=$(echo "$proxy" | sed -E 's#^[a-z]+://([^.]+)\..*#\1#') + echo "Testing from: $region" + + agent-browser --session "$region" open https://example.com + agent-browser --session "$region" screenshot "./screenshots/$region.png" + agent-browser --session "$region" close +done +``` + +### Rotating Proxies for Scraping + +```bash +#!/bin/bash +# Rotate through proxy list to avoid rate limiting + +PROXY_LIST=( + "http://proxy1.example.com:8080" + "http://proxy2.example.com:8080" + "http://proxy3.example.com:8080" +) + +URLS=( + "https://site.com/page1" + "https://site.com/page2" + "https://site.com/page3" +) + +for i in "${!URLS[@]}"; do + proxy_index=$((i % ${#PROXY_LIST[@]})) + export HTTP_PROXY="${PROXY_LIST[$proxy_index]}" + export HTTPS_PROXY="${PROXY_LIST[$proxy_index]}" + + agent-browser open "${URLS[$i]}" + agent-browser get text body > "output-$i.txt" + agent-browser close + + sleep 1 # Polite delay +done +``` + +### Corporate Network Access + +```bash +#!/bin/bash +# Access internal sites via corporate proxy + +export HTTP_PROXY="http://corpproxy.company.com:8080" +export HTTPS_PROXY="http://corpproxy.company.com:8080" +export NO_PROXY="localhost,127.0.0.1,.company.com" + +# External sites go through proxy +agent-browser
open https://external-vendor.com + +# Internal sites bypass proxy +agent-browser open https://intranet.company.com +``` + +## Verifying Proxy Connection + +```bash +# Check your apparent IP +agent-browser open https://httpbin.org/ip +agent-browser get text body +# Should show proxy's IP, not your real IP +``` + +## Troubleshooting + +### Proxy Connection Failed + +```bash +# Test proxy connectivity first +curl -x http://proxy.example.com:8080 https://httpbin.org/ip + +# Check if proxy requires auth +export HTTP_PROXY="http://user:pass@proxy.example.com:8080" +``` + +### SSL/TLS Errors Through Proxy + +Some proxies perform SSL inspection. If you encounter certificate errors: + +```bash +# For testing only - not recommended for production +agent-browser open https://example.com --ignore-https-errors +``` + +### Slow Performance + +```bash +# Use proxy only when necessary +export NO_PROXY="*.cdn.com,*.static.com" # Direct CDN access +``` + +## Best Practices + +1. **Use environment variables** - Don't hardcode proxy credentials +2. **Set NO_PROXY appropriately** - Avoid routing local traffic through proxy +3. **Test proxy before automation** - Verify connectivity with simple requests +4. **Handle proxy failures gracefully** - Implement retry logic for unstable proxies +5. **Rotate proxies for large scraping jobs** - Distribute load and avoid bans diff --git a/.agents/skills/agent-browser/references/session-management.md b/.agents/skills/agent-browser/references/session-management.md new file mode 100644 index 0000000..bb5312d --- /dev/null +++ b/.agents/skills/agent-browser/references/session-management.md @@ -0,0 +1,193 @@ +# Session Management + +Multiple isolated browser sessions with state persistence and concurrent browsing. + +**Related**: [authentication.md](authentication.md) for login patterns, [SKILL.md](../SKILL.md) for quick start. 
+ +## Contents + +- [Named Sessions](#named-sessions) +- [Session Isolation Properties](#session-isolation-properties) +- [Session State Persistence](#session-state-persistence) +- [Common Patterns](#common-patterns) +- [Default Session](#default-session) +- [Session Cleanup](#session-cleanup) +- [Best Practices](#best-practices) + +## Named Sessions + +Use `--session` flag to isolate browser contexts: + +```bash +# Session 1: Authentication flow +agent-browser --session auth open https://app.example.com/login + +# Session 2: Public browsing (separate cookies, storage) +agent-browser --session public open https://example.com + +# Commands are isolated by session +agent-browser --session auth fill @e1 "user@example.com" +agent-browser --session public get text body +``` + +## Session Isolation Properties + +Each session has independent: +- Cookies +- LocalStorage / SessionStorage +- IndexedDB +- Cache +- Browsing history +- Open tabs + +## Session State Persistence + +### Save Session State + +```bash +# Save cookies, storage, and auth state +agent-browser state save /path/to/auth-state.json +``` + +### Load Session State + +```bash +# Restore saved state +agent-browser state load /path/to/auth-state.json + +# Continue with authenticated session +agent-browser open https://app.example.com/dashboard +``` + +### State File Contents + +```json +{ + "cookies": [...], + "localStorage": {...}, + "sessionStorage": {...}, + "origins": [...] 
+} +``` + +## Common Patterns + +### Authenticated Session Reuse + +```bash +#!/bin/bash +# Save login state once, reuse many times + +STATE_FILE="/tmp/auth-state.json" + +# Check if we have saved state +if [[ -f "$STATE_FILE" ]]; then + agent-browser state load "$STATE_FILE" + agent-browser open https://app.example.com/dashboard +else + # Perform login + agent-browser open https://app.example.com/login + agent-browser snapshot -i + agent-browser fill @e1 "$USERNAME" + agent-browser fill @e2 "$PASSWORD" + agent-browser click @e3 + agent-browser wait --load networkidle + + # Save for future use + agent-browser state save "$STATE_FILE" +fi +``` + +### Concurrent Scraping + +```bash +#!/bin/bash +# Scrape multiple sites concurrently + +# Start all sessions +agent-browser --session site1 open https://site1.com & +agent-browser --session site2 open https://site2.com & +agent-browser --session site3 open https://site3.com & +wait + +# Extract from each +agent-browser --session site1 get text body > site1.txt +agent-browser --session site2 get text body > site2.txt +agent-browser --session site3 get text body > site3.txt + +# Cleanup +agent-browser --session site1 close +agent-browser --session site2 close +agent-browser --session site3 close +``` + +### A/B Testing Sessions + +```bash +# Test different user experiences +agent-browser --session variant-a open "https://app.com?variant=a" +agent-browser --session variant-b open "https://app.com?variant=b" + +# Compare +agent-browser --session variant-a screenshot /tmp/variant-a.png +agent-browser --session variant-b screenshot /tmp/variant-b.png +``` + +## Default Session + +When `--session` is omitted, commands use the default session: + +```bash +# These use the same default session +agent-browser open https://example.com +agent-browser snapshot -i +agent-browser close # Closes default session +``` + +## Session Cleanup + +```bash +# Close specific session +agent-browser --session auth close + +# List active sessions 
+agent-browser session list +``` + +## Best Practices + +### 1. Name Sessions Semantically + +```bash +# GOOD: Clear purpose +agent-browser --session github-auth open https://github.com +agent-browser --session docs-scrape open https://docs.example.com + +# AVOID: Generic names +agent-browser --session s1 open https://github.com +``` + +### 2. Always Clean Up + +```bash +# Close sessions when done +agent-browser --session auth close +agent-browser --session scrape close +``` + +### 3. Handle State Files Securely + +```bash +# Don't commit state files (contain auth tokens!) +echo "*auth-state.json" >> .gitignore + +# Delete after use +rm /tmp/auth-state.json +``` + +### 4. Timeout Long Sessions + +```bash +# Set timeout for automated scripts +timeout 60 agent-browser --session long-task get text body +``` diff --git a/.agents/skills/agent-browser/references/snapshot-refs.md b/.agents/skills/agent-browser/references/snapshot-refs.md new file mode 100644 index 0000000..3cc0fea --- /dev/null +++ b/.agents/skills/agent-browser/references/snapshot-refs.md @@ -0,0 +1,219 @@ +# Snapshot and Refs + +Compact element references that reduce context usage dramatically for AI agents. + +**Related**: [commands.md](commands.md) for full command reference, [SKILL.md](../SKILL.md) for quick start.
+ +## Contents + +- [How Refs Work](#how-refs-work) +- [Snapshot Command](#the-snapshot-command) +- [Using Refs](#using-refs) +- [Ref Lifecycle](#ref-lifecycle) +- [Best Practices](#best-practices) +- [Ref Notation Details](#ref-notation-details) +- [Troubleshooting](#troubleshooting) + +## How Refs Work + +Traditional approach: +``` +Full DOM/HTML → AI parses → CSS selector → Action (~3000-5000 tokens) +``` + +agent-browser approach: +``` +Compact snapshot → @refs assigned → Direct interaction (~200-400 tokens) +``` + +## The Snapshot Command + +```bash +# Basic snapshot (shows page structure) +agent-browser snapshot + +# Interactive snapshot (-i flag) - RECOMMENDED +agent-browser snapshot -i +``` + +### Snapshot Output Format + +``` +Page: Example Site - Home +URL: https://example.com + +@e1 [header] + @e2 [nav] + @e3 [a] "Home" + @e4 [a] "Products" + @e5 [a] "About" + @e6 [button] "Sign In" + +@e7 [main] + @e8 [h1] "Welcome" + @e9 [form] + @e10 [input type="email"] placeholder="Email" + @e11 [input type="password"] placeholder="Password" + @e12 [button type="submit"] "Log In" + +@e13 [footer] + @e14 [a] "Privacy Policy" +``` + +## Using Refs + +Once you have refs, interact directly: + +```bash +# Click the "Sign In" button +agent-browser click @e6 + +# Fill email input +agent-browser fill @e10 "user@example.com" + +# Fill password +agent-browser fill @e11 "password123" + +# Submit the form +agent-browser click @e12 +``` + +## Ref Lifecycle + +**IMPORTANT**: Refs are invalidated when the page changes! + +```bash +# Get initial snapshot +agent-browser snapshot -i +# @e1 [button] "Next" + +# Click triggers page change +agent-browser click @e1 + +# MUST re-snapshot to get new refs! +agent-browser snapshot -i +# @e1 [h1] "Page 2" ← Different element now! +``` + +## Best Practices + +### 1. 
Always Snapshot Before Interacting + +```bash +# CORRECT +agent-browser open https://example.com +agent-browser snapshot -i # Get refs first +agent-browser click @e1 # Use ref + +# WRONG +agent-browser open https://example.com +agent-browser click @e1 # Ref doesn't exist yet! +``` + +### 2. Re-Snapshot After Navigation + +```bash +agent-browser click @e5 # Navigates to new page +agent-browser snapshot -i # Get new refs +agent-browser click @e1 # Use new refs +``` + +### 3. Re-Snapshot After Dynamic Changes + +```bash +agent-browser click @e1 # Opens dropdown +agent-browser snapshot -i # See dropdown items +agent-browser click @e7 # Select item +``` + +### 4. Snapshot Specific Regions + +For complex pages, snapshot specific areas: + +```bash +# Snapshot just the form +agent-browser snapshot @e9 +``` + +## Ref Notation Details + +``` +@e1 [tag type="value"] "text content" placeholder="hint" +│ │ │ │ │ +│ │ │ │ └─ Additional attributes +│ │ │ └─ Visible text +│ │ └─ Key attributes shown +│ └─ HTML tag name +└─ Unique ref ID +``` + +### Common Patterns + +``` +@e1 [button] "Submit" # Button with text +@e2 [input type="email"] # Email input +@e3 [input type="password"] # Password input +@e4 [a href="/page"] "Link Text" # Anchor link +@e5 [select] # Dropdown +@e6 [textarea] placeholder="Message" # Text area +@e7 [div class="modal"] # Container (when relevant) +@e8 [img alt="Logo"] # Image +@e9 [checkbox] checked # Checked checkbox +@e10 [radio] selected # Selected radio +``` + +## Iframes + +Snapshots automatically detect and inline iframe content. When the main-frame snapshot runs, each `Iframe` node is resolved and its child accessibility tree is included directly beneath it in the output. Refs assigned to elements inside iframes carry frame context, so interactions like `click`, `fill`, and `type` work without manually switching frames. 
+ +```bash +agent-browser snapshot -i +# @e1 [heading] "Checkout" +# @e2 [Iframe] "payment-frame" +# @e3 [input] "Card number" +# @e4 [input] "Expiry" +# @e5 [button] "Pay" +# @e6 [button] "Cancel" + +# Interact with iframe elements directly using their refs +agent-browser fill @e3 "4111111111111111" +agent-browser fill @e4 "12/28" +agent-browser click @e5 +``` + +**Key details:** +- Only one level of iframe nesting is expanded (iframes within iframes are not recursed) +- Cross-origin iframes that block accessibility tree access are silently skipped +- Empty iframes or iframes with no interactive content are omitted from the output +- To scope a snapshot to a single iframe, use `frame @ref` then `snapshot -i` + +## Troubleshooting + +### "Ref not found" Error + +```bash +# Ref may have changed - re-snapshot +agent-browser snapshot -i +``` + +### Element Not Visible in Snapshot + +```bash +# Scroll down to reveal element +agent-browser scroll down 1000 +agent-browser snapshot -i + +# Or wait for dynamic content +agent-browser wait 1000 +agent-browser snapshot -i +``` + +### Too Many Elements + +```bash +# Snapshot specific container +agent-browser snapshot @e5 + +# Or use get text for content-only extraction +agent-browser get text @e5 +``` diff --git a/.agents/skills/agent-browser/references/video-recording.md b/.agents/skills/agent-browser/references/video-recording.md new file mode 100644 index 0000000..e6a9fb4 --- /dev/null +++ b/.agents/skills/agent-browser/references/video-recording.md @@ -0,0 +1,173 @@ +# Video Recording + +Capture browser automation as video for debugging, documentation, or verification. + +**Related**: [commands.md](commands.md) for full command reference, [SKILL.md](../SKILL.md) for quick start. 
+ +## Contents + +- [Basic Recording](#basic-recording) +- [Recording Commands](#recording-commands) +- [Use Cases](#use-cases) +- [Best Practices](#best-practices) +- [Output Format](#output-format) +- [Limitations](#limitations) + +## Basic Recording + +```bash +# Start recording +agent-browser record start ./demo.webm + +# Perform actions +agent-browser open https://example.com +agent-browser snapshot -i +agent-browser click @e1 +agent-browser fill @e2 "test input" + +# Stop and save +agent-browser record stop +``` + +## Recording Commands + +```bash +# Start recording to file +agent-browser record start ./output.webm + +# Stop current recording +agent-browser record stop + +# Restart with new file (stops current + starts new) +agent-browser record restart ./take2.webm +``` + +## Use Cases + +### Debugging Failed Automation + +```bash +#!/bin/bash +# Record automation for debugging + +agent-browser record start ./debug-$(date +%Y%m%d-%H%M%S).webm + +# Run your automation +agent-browser open https://app.example.com +agent-browser snapshot -i +agent-browser click @e1 || { + echo "Click failed - check recording" + agent-browser record stop + exit 1 +} + +agent-browser record stop +``` + +### Documentation Generation + +```bash +#!/bin/bash +# Record workflow for documentation + +agent-browser record start ./docs/how-to-login.webm + +agent-browser open https://app.example.com/login +agent-browser wait 1000 # Pause for visibility + +agent-browser snapshot -i +agent-browser fill @e1 "demo@example.com" +agent-browser wait 500 + +agent-browser fill @e2 "password" +agent-browser wait 500 + +agent-browser click @e3 +agent-browser wait --load networkidle +agent-browser wait 1000 # Show result + +agent-browser record stop +``` + +### CI/CD Test Evidence + +```bash +#!/bin/bash +# Record E2E test runs for CI artifacts + +TEST_NAME="${1:-e2e-test}" +RECORDING_DIR="./test-recordings" +mkdir -p "$RECORDING_DIR" + +agent-browser record start "$RECORDING_DIR/$TEST_NAME-$(date 
+%s).webm" + +# Run test +if run_e2e_test; then + echo "Test passed" +else + echo "Test failed - recording saved" +fi + +agent-browser record stop +``` + +## Best Practices + +### 1. Add Pauses for Clarity + +```bash +# Slow down for human viewing +agent-browser click @e1 +agent-browser wait 500 # Let viewer see result +``` + +### 2. Use Descriptive Filenames + +```bash +# Include context in filename +agent-browser record start ./recordings/login-flow-2024-01-15.webm +agent-browser record start ./recordings/checkout-test-run-42.webm +``` + +### 3. Handle Recording in Error Cases + +```bash +#!/bin/bash +set -e + +cleanup() { + agent-browser record stop 2>/dev/null || true + agent-browser close 2>/dev/null || true +} +trap cleanup EXIT + +agent-browser record start ./automation.webm +# ... automation steps ... +``` + +### 4. Combine with Screenshots + +```bash +# Record video AND capture key frames +agent-browser record start ./flow.webm + +agent-browser open https://example.com +agent-browser screenshot ./screenshots/step1-homepage.png + +agent-browser click @e1 +agent-browser screenshot ./screenshots/step2-after-click.png + +agent-browser record stop +``` + +## Output Format + +- Default format: WebM (VP8/VP9 codec) +- Compatible with all modern browsers and video players +- Compressed but high quality + +## Limitations + +- Recording adds slight overhead to automation +- Large recordings can consume significant disk space +- Some headless environments may have codec limitations diff --git a/.agents/skills/agent-browser/templates/authenticated-session.sh b/.agents/skills/agent-browser/templates/authenticated-session.sh new file mode 100755 index 0000000..b66c928 --- /dev/null +++ b/.agents/skills/agent-browser/templates/authenticated-session.sh @@ -0,0 +1,105 @@ +#!/bin/bash +# Template: Authenticated Session Workflow +# Purpose: Login once, save state, reuse for subsequent runs +# Usage: ./authenticated-session.sh <login-url> [state-file] +# +# RECOMMENDED: Use the auth 
vault instead of this template: +# echo "<password>" | agent-browser auth save myapp --url <login-url> --username <username> --password-stdin +# agent-browser auth login myapp +# The auth vault stores credentials securely and the LLM never sees passwords. +# +# Environment variables: +# APP_USERNAME - Login username/email +# APP_PASSWORD - Login password +# +# Two modes: +# 1. Discovery mode (default): Shows form structure so you can identify refs +# 2. Login mode: Performs actual login after you update the refs +# +# Setup steps: +# 1. Run once to see form structure (discovery mode) +# 2. Update refs in LOGIN FLOW section below +# 3. Set APP_USERNAME and APP_PASSWORD +# 4. Delete the DISCOVERY section + +set -euo pipefail + +LOGIN_URL="${1:?Usage: $0 <login-url> [state-file]}" +STATE_FILE="${2:-./auth-state.json}" + +echo "Authentication workflow: $LOGIN_URL" + +# ================================================================ +# SAVED STATE: Skip login if valid saved state exists +# ================================================================ +if [[ -f "$STATE_FILE" ]]; then + echo "Loading saved state from $STATE_FILE..." + if agent-browser --state "$STATE_FILE" open "$LOGIN_URL" 2>/dev/null; then + agent-browser wait --load networkidle + + CURRENT_URL=$(agent-browser get url) + if [[ "$CURRENT_URL" != *"login"* ]] && [[ "$CURRENT_URL" != *"signin"* ]]; then + echo "Session restored successfully" + agent-browser snapshot -i + exit 0 + fi + echo "Session expired, performing fresh login..." + agent-browser close 2>/dev/null || true + else + echo "Failed to load state, re-authenticating..." + fi + rm -f "$STATE_FILE" +fi + +# ================================================================ +# DISCOVERY MODE: Shows form structure (delete after setup) +# ================================================================ +echo "Opening login page..." 
+agent-browser open "$LOGIN_URL" +agent-browser wait --load networkidle + +echo "" +echo "Login form structure:" +echo "---" +agent-browser snapshot -i +echo "---" +echo "" +echo "Next steps:" +echo " 1. Note the refs: username=@e?, password=@e?, submit=@e?" +echo " 2. Update the LOGIN FLOW section below with your refs" +echo " 3. Set: export APP_USERNAME='...' APP_PASSWORD='...'" +echo " 4. Delete this DISCOVERY MODE section" +echo "" +agent-browser close +exit 0 + +# ================================================================ +# LOGIN FLOW: Uncomment and customize after discovery +# ================================================================ +# : "${APP_USERNAME:?Set APP_USERNAME environment variable}" +# : "${APP_PASSWORD:?Set APP_PASSWORD environment variable}" +# +# agent-browser open "$LOGIN_URL" +# agent-browser wait --load networkidle +# agent-browser snapshot -i +# +# # Fill credentials (update refs to match your form) +# agent-browser fill @e1 "$APP_USERNAME" +# agent-browser fill @e2 "$APP_PASSWORD" +# agent-browser click @e3 +# agent-browser wait --load networkidle +# +# # Verify login succeeded +# FINAL_URL=$(agent-browser get url) +# if [[ "$FINAL_URL" == *"login"* ]] || [[ "$FINAL_URL" == *"signin"* ]]; then +# echo "Login failed - still on login page" +# agent-browser screenshot /tmp/login-failed.png +# agent-browser close +# exit 1 +# fi +# +# # Save state for future runs +# echo "Saving state to $STATE_FILE" +# agent-browser state save "$STATE_FILE" +# echo "Login successful" +# agent-browser snapshot -i diff --git a/.agents/skills/agent-browser/templates/capture-workflow.sh b/.agents/skills/agent-browser/templates/capture-workflow.sh new file mode 100755 index 0000000..3bc93ad --- /dev/null +++ b/.agents/skills/agent-browser/templates/capture-workflow.sh @@ -0,0 +1,69 @@ +#!/bin/bash +# Template: Content Capture Workflow +# Purpose: Extract content from web pages (text, screenshots, PDF) +# Usage: ./capture-workflow.sh <url> [output-dir] +# 
+# Outputs: +# - page-full.png: Full page screenshot +# - page-structure.txt: Page element structure with refs +# - page-text.txt: All text content +# - page.pdf: PDF version +# +# Optional: Load auth state for protected pages + +set -euo pipefail + +TARGET_URL="${1:?Usage: $0 <url> [output-dir]}" +OUTPUT_DIR="${2:-.}" + +echo "Capturing: $TARGET_URL" +mkdir -p "$OUTPUT_DIR" + +# Optional: Load authentication state +# if [[ -f "./auth-state.json" ]]; then +# echo "Loading authentication state..." +# agent-browser state load "./auth-state.json" +# fi + +# Navigate to target +agent-browser open "$TARGET_URL" +agent-browser wait --load networkidle + +# Get metadata +TITLE=$(agent-browser get title) +URL=$(agent-browser get url) +echo "Title: $TITLE" +echo "URL: $URL" + +# Capture full page screenshot +agent-browser screenshot --full "$OUTPUT_DIR/page-full.png" +echo "Saved: $OUTPUT_DIR/page-full.png" + +# Get page structure with refs +agent-browser snapshot -i > "$OUTPUT_DIR/page-structure.txt" +echo "Saved: $OUTPUT_DIR/page-structure.txt" + +# Extract all text content +agent-browser get text body > "$OUTPUT_DIR/page-text.txt" +echo "Saved: $OUTPUT_DIR/page-text.txt" + +# Save as PDF +agent-browser pdf "$OUTPUT_DIR/page.pdf" +echo "Saved: $OUTPUT_DIR/page.pdf" + +# Optional: Extract specific elements using refs from structure +# agent-browser get text @e5 > "$OUTPUT_DIR/main-content.txt" + +# Optional: Handle infinite scroll pages +# for i in {1..5}; do +# agent-browser scroll down 1000 +# agent-browser wait 1000 +# done +# agent-browser screenshot --full "$OUTPUT_DIR/page-scrolled.png" + +# Cleanup +agent-browser close + +echo "" +echo "Capture complete:" +ls -la "$OUTPUT_DIR" diff --git a/.agents/skills/agent-browser/templates/form-automation.sh b/.agents/skills/agent-browser/templates/form-automation.sh new file mode 100755 index 0000000..6784fcd --- /dev/null +++ b/.agents/skills/agent-browser/templates/form-automation.sh @@ -0,0 +1,62 @@ +#!/bin/bash +# Template: Form 
Automation Workflow +# Purpose: Fill and submit web forms with validation +# Usage: ./form-automation.sh <form-url> +# +# This template demonstrates the snapshot-interact-verify pattern: +# 1. Navigate to form +# 2. Snapshot to get element refs +# 3. Fill fields using refs +# 4. Submit and verify result +# +# Customize: Update the refs (@e1, @e2, etc.) based on your form's snapshot output + +set -euo pipefail + +FORM_URL="${1:?Usage: $0 <form-url>}" + +echo "Form automation: $FORM_URL" + +# Step 1: Navigate to form +agent-browser open "$FORM_URL" +agent-browser wait --load networkidle + +# Step 2: Snapshot to discover form elements +echo "" +echo "Form structure:" +agent-browser snapshot -i + +# Step 3: Fill form fields (customize these refs based on snapshot output) +# +# Common field types: +# agent-browser fill @e1 "John Doe" # Text input +# agent-browser fill @e2 "user@example.com" # Email input +# agent-browser fill @e3 "SecureP@ss123" # Password input +# agent-browser select @e4 "Option Value" # Dropdown +# agent-browser check @e5 # Checkbox +# agent-browser click @e6 # Radio button +# agent-browser fill @e7 "Multi-line text" # Textarea +# agent-browser upload @e8 /path/to/file.pdf # File upload +# +# Uncomment and modify: +# agent-browser fill @e1 "Test User" +# agent-browser fill @e2 "test@example.com" +# agent-browser click @e3 # Submit button + +# Step 4: Wait for submission +# agent-browser wait --load networkidle +# agent-browser wait --url "**/success" # Or wait for redirect + +# Step 5: Verify result +echo "" +echo "Result:" +agent-browser get url +agent-browser snapshot -i + +# Optional: Capture evidence +agent-browser screenshot /tmp/form-result.png +echo "Screenshot saved: /tmp/form-result.png" + +# Cleanup +agent-browser close +echo "Done" diff --git a/.claude/skills/agent-browser b/.claude/skills/agent-browser new file mode 120000 index 0000000..e298b7b --- /dev/null +++ b/.claude/skills/agent-browser @@ -0,0 +1 @@ +../../.agents/skills/agent-browser \ No newline at 
end of file diff --git a/.local/ux-audit/README.md b/.local/ux-audit/README.md new file mode 100644 index 0000000..3b94595 --- /dev/null +++ b/.local/ux-audit/README.md @@ -0,0 +1,28 @@ +# UX Audit — Recoupable Marketing Website + +> Performed: 2026-03-30 +> Methodology: Fresh-eyes audit with zero prior context about the company. +> Tool: agent-browser (full-page screenshots at 1440x900 desktop + 375x812 mobile) + +## Files + +| File | What it is | +|------|-----------| +| `audit-findings.md` | The full audit — 25 findings organized by severity (Critical, Major, Design, Content, Mobile) | +| `user-stories.md` | 5 user stories (indie artist, label manager, label CEO, developer, competitor) — what each visitor needs and whether the site delivers | +| `discovery-funnels.md` | Current funnel analysis + 4 proposed funnels + traffic source gaps + metric gaps | +| `improvements-prioritized.md` | 20 improvements ranked P0/P1/P2 by impact × effort + design debt checklist | +| `page-by-page-notes.md` | Raw notes for every page — what works, what doesn't, specific copy/design callouts | +| `screenshots/` | Full-page screenshots (desktop + mobile) of every page | + +## TL;DR + +The site has strong copy but is **invisible as a product.** Zero screenshots, zero demos, zero videos, zero customer logos, zero pricing. It reads like a well-written pitch deck without the visuals. The #1 thing to do: **add a product screenshot to the homepage hero this week.** + +## Top 5 Actions + +1. Add product screenshot/video to hero +2. Add a pricing page +3. Change "Get started" to "Start free" +4. Remove empty Learn pages from nav +5. Add real customer logos diff --git a/.local/ux-audit/audit-findings.md b/.local/ux-audit/audit-findings.md new file mode 100644 index 0000000..b851173 --- /dev/null +++ b/.local/ux-audit/audit-findings.md @@ -0,0 +1,161 @@ +# UX Audit — Recoupable Marketing Website + +> Audited: 2026-03-30. Fresh-eyes perspective — no prior context about the company. 
+> Screenshots in `./screenshots/` + +--- + +## Executive Summary + +The site reads like a **manifesto, not a product page.** It tells me what Recoupable believes (music businesses should be autonomous) but never *shows* me the product, what it looks like, or what happens after I click "Get started." As a first-time visitor, I leave with a vague sense of ambition but no concrete understanding of what I'd actually be using. + +The design is clean but **monotone** — every page uses the same visual rhythm (heading, paragraph, heading, paragraph) with zero imagery, zero product screenshots, zero video, zero animation. It reads like a well-formatted Google Doc, not a product website competing for attention. + +**The biggest problem:** I can't tell if this is a real product or a landing page for something being built. There's no product screenshot, no demo, no video, no pricing, no customer logos (just names in text), and no way to see what I'd be getting before signing up. + +--- + +## Critical Issues (Dealbreakers) + +### 1. No Product Visuals — Anywhere + +Not a single screenshot, GIF, mockup, or video of the actual product exists on any page. The hero section is just text. Every internal page is just text. A first-time visitor has to take a complete leap of faith clicking "Get started" — they have no idea what the interface looks like, what the agent chat experience is, or what content output looks like. + +**What I expected:** A hero screenshot or video showing the agent creating content, a chat interface in action, or a before/after of agent-generated output. + +**Impact:** Massive. This alone likely kills conversion. People don't sign up for products they can't picture themselves using. + +### 2. "Get Started" Goes to... What? + +The primary CTA "Get started" links to `chat.recoupable.com` with zero context about what happens next. Is it free? Do I need a credit card? Will I see a product tour? The CTA gives me no confidence about what I'm clicking into. 
+ +**What I expected:** "Start free" or "Try for free" or "See a demo" — something that tells me the risk level. + +### 3. No Pricing Information + +There's no pricing page. There's no mention of what it costs anywhere. The solutions page mentions "$5-10k/month for agents instead of $15-25k per hire" which implies enterprise pricing, but if I'm an indie artist, I have no idea if I can afford this. The about page says nothing about pricing tiers. + +**Impact:** I don't know if this is for me (budget-wise), so I leave. + +### 4. No Social Proof Beyond Text + +The "Trusted by music companies" section lists four names in plain text with no logos, no links, no context. This reads like someone typed company names, not like actual partnerships. The founder quote is self-referential — the founder quoting himself isn't credible social proof. + +**What I expected:** Real logos. A short customer quote from someone who isn't the founder. A case study link. Something that proves other people use this. + +### 5. Blog Has One Post + +The blog has exactly one article. This makes the entire "Latest" section on the homepage, and the dedicated blog page, feel empty and undermines credibility. An "Insights" section with one post is worse than no section at all. + +--- + +## Major Issues (Significantly Hurt Conversion/Perception) + +### 6. Wall-of-Text Pages + +Platform, Solutions, Developers, Vision, About, Records — every page is exclusively text in the same format: `
heading` + `paragraph
`, repeated. No icons, no illustrations, no diagrams, no code snippets (developers page), no visual hierarchy beyond font weight. Every page looks identical. The content is good, but the presentation is a Google Doc. + +### 7. "Human / Machine" Toggle is Confusing + +There's a fixed bottom bar with "Human / Machine" tabs. I have no idea what this does. Clicking "Machine" gives me... a markdown dump? Why? Who is this for? It's visible on every single page and takes up valuable screen real estate. + +**If this is for LLM crawlers**, it shouldn't be user-facing. If it's a feature demo, it needs explanation. Right now it's just confusing. + +### 8. The Homepage is Too Long + +The homepage has 11 distinct sections: Hero → Stats → Pain → Outcomes → Differentiators → Use Cases → How It Works → Proof → Logos → Blog → Subscribe → Closing. That's at least 4-5 sections too many. By the time someone scrolls to "How it works," they've already read two value prop sections and a differentiator section. Fatigue sets in. + +### 9. "Why Not Just Use ChatGPT?" Section Backfires + +This section raises a doubt the user may not have had. If they weren't comparing to ChatGPT, now they are — and the answers ("Deep artist context", "One system") are abstract enough that a skeptic would think "I could probably do this with ChatGPT and some prompts." This section needs concrete proof, not abstract claims. + +### 10. Navigation Dropdowns Are Empty + +Platform, Solutions, Developers, Learn, Company all have dropdown menus in the nav, but clicking them just navigates to the page. The dropdown behavior (hover) is inconsistent — sometimes it opens a dropdown panel, sometimes it's just a link. On mobile, they collapse to simple links which is fine, but the desktop experience feels half-built. + +### 11. No Clear Funnel for Different Audiences + +The site talks to artists, labels, distributors, catalog owners, AND developers all on the same homepage. 
Each audience has very different needs, budgets, and decision processes. There's no way to self-select into a path. The Solutions page separates them, but the homepage mixes everything together. + +--- + +## Design Issues (Cheap / Forgettable) + +### 12. Zero Color + +The entire site is black text on white background with one accent color (`#345A5D`, a dark teal) that only appears in borders and link text. No gradients, no brand color blocks, no colored sections, no hover effects worth noticing. The dark mode toggle exists but the light mode is so austere it feels unfinished. + +### 13. No Imagery or Media of Any Kind + +Not a single image on any page (except the favicon and logo). No hero image, no product screenshots, no team photos, no artist imagery, no album art, no content examples. For a product that creates visual content (album visualizers, social posts), showing zero visual content is ironic and damaging. + +### 14. Footer Social Links Are Text Abbreviations + +The social media links in the footer are plain text: "𝕏 IG in YT". These look like a placeholder, not a finished design. They should be recognizable icons or at minimum styled consistently. + +### 15. No Animations or Transitions + +Nothing animates on scroll. No fade-ins, no parallax, no entrance effects. Elements just exist statically. The site feels like it was built for speed (which is good) but the result is that nothing draws the eye or creates moments of delight. + +### 16. Inconsistent CTA Styling + +The hero CTA is a rounded black pill button. The closing CTA is a smaller rounded black pill. Internal page CTAs are square-cornered dark buttons. This inconsistency makes the CTAs feel unintentional rather than designed. + +### 17. Cards and Sections Lack Visual Distinction + +The "Content that runs" outcome card has a teal left border. The other three outcome cards have thin gray borders. The differentiator cards have thin gray borders. 
The solution cards have thin gray borders with a hover that turns teal. Everything is variations of "thin border box" — nothing stands out. + +--- + +## Content Issues (Confusing / Unclear) + +### 18. "Agents" is Never Defined + +The word "agents" appears in every section — hero, outcomes, use cases, how it works. But nowhere does the site explain what an "agent" actually is or does in practical terms. An indie artist visiting this site likely thinks "agent" means a music manager/booking agent, not an AI process. This terminology gap could lose the primary audience. + +### 19. Stat Section Raises Questions It Doesn't Answer + +- "22 videos in one session" — What kind of videos? Of what quality? Can I see one? +- "120k+ tracks hit DSPs daily" — This is an industry stat, not about Recoupable. Why is it in the company's stats section? +- "$15-25k per hire, per month" — This framing only resonates with someone who manages a label, not an indie artist. + +### 20. "Deep Artist Context" Means Nothing to a New Visitor + +The concept of "face guide, brand docs, songs, audience data" stored in a context system is the moat — but it's described so abstractly that a visitor doesn't understand what they'd actually DO. Show me: "Upload your music. Add your face. The agent creates content that looks and sounds like you." + +### 21. Company Pages Are Skeleton Thin + +- **Vision:** Five paragraphs of philosophy, no images, no timeline, no roadmap visual. +- **About:** One paragraph, a founder bio with no photo, and a mission statement. Feels like a placeholder. +- **Recoupable Records:** Mentions "Gatsby Grace" and "22 videos" but I can't see Gatsby Grace, hear any music, or see any of the videos. The proof point has no proof. + +### 22. Learn Pages Are Empty + +Use Cases, Playbooks, and Demos are all placeholder pages that say "coming soon" or point to the blog (which has one post). These shouldn't be in the navigation if there's no content. 
+ +--- + +## Mobile-Specific Issues + +### 23. Stats Section Stacks Poorly + +On mobile (375px), the three stats stack vertically which is fine, but they take up a lot of vertical space before the user gets to any actual content. The stats feel less impactful when you can only see one at a time. + +### 24. Human/Machine Bar Covers Content + +The fixed-position Human/Machine toggle bar at the bottom of the viewport overlaps content on mobile. This is especially problematic on shorter screens. + +### 25. No Hamburger Menu + +On mobile, the nav items collapse into a horizontal scrollable row under the header. This works but doesn't scale — if more nav items are added, it'll overflow. A hamburger menu is more scalable. + +--- + +## What's Actually Good + +- **The copy is strong.** The pain points are specific and real. "You got into music. You're stuck in ops." is genuinely good. +- **The messaging hierarchy is correct.** Hero → problem → solution → proof → CTA is the right flow. +- **Performance is excellent.** Pages load in under 200ms. No layout shift. No unnecessary JS. +- **The domain structure is clean.** Good URL patterns, proper metadata, RSS feed. +- **Dark mode exists** and works without flash. +- **The Machine view** is actually innovative (LLM-readable markdown) — just needs to not be a confusing UI element. diff --git a/.local/ux-audit/brutal-critique.md b/.local/ux-audit/brutal-critique.md new file mode 100644 index 0000000..e35668a --- /dev/null +++ b/.local/ux-audit/brutal-critique.md @@ -0,0 +1,87 @@ +# Brutal Critique — Current Site vs Swipe Files + +## The Gap Is Enormous + +Looking at mood-9 ("THE LOGIC LAYER FOR MODERN POP CULTURE") and mood-10 ("SIGN THE ALGORITHM") vs what we have — we're not even in the same league. Here's why: + +### 1. Our headline is TINY compared to the references + +Mood-9 headline fills 70% of the viewport height. Ours fills maybe 30%. 
The references use `clamp(6rem, 15vw, 18rem)` — edge-to-edge type that DEMANDS attention. Our `text-display` maxes out at 5rem. That's a heading, not a statement. + +**Fix:** The headline needs to be MASSIVE. Like, uncomfortable big. `font-size: clamp(4rem, 12vw, 12rem)`. Take up the whole screen. + +### 2. The AgentChat mockup looks like a homework project + +The references don't have fake chat UIs. They have: glitch typography (mood-10), live ingest feeds with timestamps, massive type, and split-panel layouts with real data. Our AgentChat is a gray box with tiny text that pretends to be a product. It looks like a Figma wireframe, not a real product. + +**Fix:** Kill the AgentChat. Replace the hero right-side with either: (a) the live ingest feed pattern from mood-10, or (b) nothing — just let the massive headline breathe. The best hero in our swipe files (mood-9) has NO right-side visual. Just massive type + grid background. + +### 3. The VisionOverlay says "ARTIST PROFILE" in plain text + +The reference (mood-8) has an actual photograph of an artist on a road — B&W, cinematic, emotional. Our version has a dark rectangle that says "ARTIST PROFILE" in gray text. That's the OPPOSITE of what makes the reference powerful. The reference works because there's a REAL HUMAN inside the bounding box. Without the photo, the bounding box is just a rectangle. + +**Fix:** Either add a real photo (or a very good AI-generated one), or remove this section entirely. An empty bounding box is worse than no bounding box. + +### 4. Too many sections — the page has no BREATHING ROOM + +Mood-9 has: hero → 3 modules → footer. That's it. Four sections. Our page has ELEVEN sections. The best websites are RESTRAINED. Every section we add dilutes the ones before it. The page scrolls forever and every section looks the same — dark background, white text, yellow accent. + +**Fix:** Cut the page in half. Keep: Hero, Marquee, Modules, Terminal. 
Kill or radically simplify: Proof stat, VisionOverlay, SystemDiagram, Segments. Merge: Subscribe + CTA into the footer. + +### 5. The copy is still corporate + +"AI-native label infrastructure. You create the music. Agents run strategy, content, fans, revenue." — this reads like a LinkedIn post. Compare to mood-9: "Recoupable is an AI-native record label and agent infrastructure. We algorithmicize A&R, automate global royalty routing, and execute artist growth via immutable smart contracts." — THAT sounds like infrastructure. Or mood-10: "We replace gut feelings with predictive modeling, and manual royalty accounting with smart contracts. We fund music, calculated by machines." — THAT sounds dangerous and real. + +**Fix:** Rewrite the subheader. Less "you create the music" (patronizing), more "we compute hits" (confrontational). + +### 6. The color scheme is ONE-DIMENSIONAL + +Every section: black background + white text + yellow accents. There's no variation. The references use: +- Mood-10: white background + black text + green accents (inverted!) +- Mood-5: steel blue + coral (warm, editorial) +- Mood-3: dark glass-morphism (depth, layers) + +Our site is monotone. Every section looks the same from a distance. + +**Fix:** Introduce a SECOND mode — at least one section should have a white/light background with dark text. The split between dark and light creates rhythm. The mood-10 reference is white with a dark data panel — that contrast is what makes it visually interesting. + +### 7. The typography is not distinctive enough + +We use `.mix` with italic vs bold weights. The references use: +- Mood-9: mixed pixel font + condensed sans (actual different typefaces) +- Mood-10: outlined/stroke letters + filled letters + ASCII art glitch overlay +- Mood-1: hand-drawn neon glow lettering + +Our ".mix" is just italic DM Sans vs bold Plus Jakarta Sans. They look too similar. There's no real tension. 
+ +**Fix:** Import a pixel/retro font (Silkscreen, Press Start 2P, or VT323) and actually MIX it with the sans-serif in the headline. Alternating individual letters between fonts, like the brutalist branch did: R[pixel E]C[pixel O]U. + +### 8. No WHITE SPACE anywhere + +The references — especially Linear (mood-17, 18) — use ENORMOUS amounts of white space. Between the headline and the next section, there's BREATHING ROOM. Our sections are packed tight with `section-spacing` which tops out at 5rem. The references have 10-20rem between sections. + +**Fix:** More space. Let the page breathe. The hero should stand alone with massive padding below it. The modules should have generous space around them. + +## The Real Problem + +We've been adding things. Components, sections, animations, utilities. But the best websites SUBTRACT. They have fewer sections, fewer words, fewer colors, more space, more impact. + +The mood-9 design has: +- ONE massive headline +- ONE subheader paragraph +- ONE CTA button +- ONE grid background +- THREE module cards + +That's it. Five elements. And it's more powerful than our eleven sections combined. + +## What To Do Next + +Stop adding. Start removing. Then make what remains PERFECT. + +1. Kill sections: VisionOverlay, SystemDiagram, Segments, Blog (move blog to its own page only) +2. Hero: make the headline 3x bigger, kill the AgentChat, add a real live ingest feed instead +3. One white/light section for contrast (the modules) +4. Import a pixel font and use it in the headline +5. Double the spacing between sections +6. Rewrite the copy to be confrontational, not corporate diff --git a/.local/ux-audit/discovery-funnels.md b/.local/ux-audit/discovery-funnels.md new file mode 100644 index 0000000..3f0bb92 --- /dev/null +++ b/.local/ux-audit/discovery-funnels.md @@ -0,0 +1,98 @@ +# Discovery Funnels — How Users Find and Navigate the Site + +--- + +## Current Funnel (What Exists) + +``` +[Traffic source] → Homepage → ??? 
→ "Get started" → chat.recoupable.com +``` + +That's it. There is one funnel. Every visitor lands on the homepage, scrolls through text, and either clicks "Get started" (goes to app) or bounces. There's no intermediate step — no demo, no video, no pricing page, no self-selection path. + +### Conversion Killers in This Funnel + +1. **No intermediate engagement.** It's all-or-nothing: sign up or leave. +2. **No lead capture before CTA.** The subscribe form exists but it's buried at the bottom and only captures email for a newsletter — not for product interest. +3. **No retargeting hook.** No quiz, no tool, no calculator, no "see what agents can do for your artist" interactive element. +4. **The CTA is ambiguous.** "Get started" could mean anything. No mention of free trial, no-credit-card, or what happens next. + +--- + +## Proposed Funnels + +### Funnel A: Indie Artist (Low-Touch, Self-Serve) + +``` +Instagram/X ad → Homepage → "See it in action" (demo video) → Pricing page → "Start free" → App + ↳ Subscribe form (if not ready) +``` + +**Missing pieces:** +- [ ] Demo video or interactive product tour +- [ ] Pricing page with a free tier +- [ ] "Start free" CTA (instead of generic "Get started") +- [ ] Content gallery showing real agent output + +### Funnel B: Label/Distributor (High-Touch, Enterprise) + +``` +Referral/conference → Homepage → Solutions (For Labels) → Case Study → "Talk to us" → Meeting + ↳ Download whitepaper (lead capture) +``` + +**Missing pieces:** +- [ ] Case study page (even one anonymized case study) +- [ ] "Talk to us" / "Book a demo" CTA on Solutions page +- [ ] Enterprise-specific landing page +- [ ] Lead capture form (not just newsletter subscribe) +- [ ] ROI calculator ("How much are you spending on marketing staff?") + +### Funnel C: Developer (API-First) + +``` +Google/HN/X → Developers page → Code example → Docs site → API key signup → Build +``` + +**Missing pieces:** +- [ ] Code snippet on the marketing page (curl, npm install, quick 
start) +- [ ] Direct link to get API key +- [ ] "Free for developers" messaging +- [ ] Architecture diagram or system overview + +### Funnel D: Content/SEO (Organic Discovery) + +``` +Google "AI music marketing" → Blog post → Internal link to product → Homepage → CTA +``` + +**Missing pieces:** +- [ ] More than 1 blog post (need 10-20 for SEO traction) +- [ ] In-post CTAs linking to product features mentioned +- [ ] Comparison pages (Recoupable vs CoManager, vs ChatGPT for music) +- [ ] SEO landing pages for key terms + +--- + +## Traffic Source Analysis (Gaps) + +| Source | Status | Gap | +|--------|--------|-----| +| Organic search | 1 blog post. Essentially zero SEO presence. | Need 10+ posts targeting "AI music marketing", "music content automation", etc. | +| Social (X, IG) | Unknown — no tracking visible beyond Plausible | Need UTM-tagged social campaigns with specific landing pages | +| Referral (word of mouth) | Primary source based on strategy docs | Site needs to support "my friend told me about this" visitors with a quick explainer | +| Direct (investors/fundraise) | April podcast → site traffic spike likely | Need an investor-facing page or at least a press/media section | +| Paid ads | None currently | Not recommended until activation funnel is figured out | + +--- + +## Key Metric Gaps + +The site has Plausible analytics but likely isn't tracking: + +- [ ] CTA click rates (which "Get started" button converts?) +- [ ] Scroll depth (do people see the proof section? the subscribe form?) +- [ ] Time on page by section (what do people read vs. skip?) +- [ ] Exit pages (where do people leave?) 
+- [ ] Subscribe form conversion rate +- [ ] Nav dropdown interaction rates diff --git a/.local/ux-audit/finding-our-own-voice.md b/.local/ux-audit/finding-our-own-voice.md new file mode 100644 index 0000000..57c5340 --- /dev/null +++ b/.local/ux-audit/finding-our-own-voice.md @@ -0,0 +1,75 @@ +# Finding Our Own Visual Voice + +## The Problem + +We've been copying reference sites instead of inventing our own aesthetic. "THE LOGIC LAYER FOR MODERN POP CULTURE" is literally from our own swipe file — it's not a real headline, it's a concept we liked. The grid background + yellow box + massive type combo is the mood-9 design verbatim. We have no original visual identity. + +## What's ACTUALLY unique about Recoupable? + +Not what other companies do. What WE do that nobody else does: + +1. **We're a record label AND an infrastructure company.** Nobody else is both. The tension between "culture" and "compute" is our thing. Not Vercel's infra-only. Not CoManager's consumer-only. We're at the intersection. + +2. **We have Gatsby Grace.** A real AI artist with real output. 22 real videos. This is proof nobody else has. + +3. **Sidney is a musician AND a founder.** The product was built out of personal need — not out of a market analysis. + +4. **The agent system is real and running.** Not a concept. Not a pitch deck. Live on Spotify. Live on socials. + +5. **Music is emotional. Infrastructure is rational.** We bridge both worlds. Most infra companies are cold. Most music companies are warm. We're both. 
+ +## Extracting PRINCIPLES from swipe files (not copying) + +| Reference | Surface (what we copied) | Principle (what we should learn) | +|-----------|-------------------------|----------------------------------| +| Mood-9 | Grid bg, yellow box, "logic layer" text | **Confidence** — one clear statement, no hedging | +| Mood-8 | B&W photo + bounding box | **The system sees culture** — AI perception of the real world | +| Mood-3 | Glass cards floating | **Spatial depth** — layers create richness without images | +| Mood-1 | Neon text over illustration | **Brand name as the hero visual** — the name IS the design | +| Mood-5 | Serif headline + geometric shape | **Typography as the ONLY design element** — no graphics needed | +| Mood-17 | Dimmed/bright text | **Hierarchy through opacity** — one clause matters, the rest is context | +| Mood-10 | Live feed + split layout | **The system is alive** — show it working, not describe it | + +## What should Recoupable's OWN visual language be? + +It should express: **the collision of music culture and autonomous infrastructure.** + +Not "infra company that happens to be in music" (that's what we have now). +Not "music company that uses AI" (that's the old Ghibli design). +The COLLISION. The tension. The beautiful friction between art and algorithm. + +### Ideas for an ORIGINAL direction: + +**Direction A: "The Waveform"** +What if the core visual element was a WAVEFORM — an audio waveform that morphs into a data visualization? Not a static image, but a live, generative CSS animation. The brand IS the waveform. It pulses. It reacts. It represents both the music (audio) and the system (data). No other company uses this because no other company is both a record label and an infrastructure platform. + +**Direction B: "The Score"** +What if the site looked like a musical score — staves, notes, but the notes are data points? The headline sits on a staff. The sections are movements. The scrolling IS the composition. 
Musical structure as information architecture. + +**Direction C: "The Studio"** +What if the aesthetic was a recording studio console? Faders, meters, VU displays. Each section is a "channel" on the board. The modules are EQ knobs. The system is the mixer. Dark, warm (not cold), with amber/gold accents instead of electric yellow (amber is the color of VU meters and analog warmth). + +**Direction D: "Night Session"** +What if the brand was about the creative moment — midnight in the studio, laptop open, headphones on? Dark, intimate, warm. Deep navy/black instead of pure black. Soft amber/warm white instead of cool white. The brand is about the 2am session where music meets machine. Not corporate. Not infrastructure. Personal. + +## My instinct + +Direction D feels right for Recoupable. Here's why: + +- Sidney's story is about sitting down at 10pm and having 22 videos by midnight +- The brand voice says "sounds like a smart friend who works in music tech — been in the studio at 2am AND understands analytics" +- The music industry operates at night. Studios, clubs, late sessions. +- "Night Session" is warm but technical. Intimate but powerful. +- The color palette would shift from cold electric yellow to warm amber/gold — differentiating from every other dark-mode tech site +- The type could be more editorial — a serif headline that feels like a magazine, not a terminal + +## Proposed palette: Night Session + +- Background: deep navy-black (#0c0f14) — not pure black, has blue undertone like a night sky +- Foreground: warm cream (#f0ebe3) — not cool white, has warmth like studio light +- Accent: amber gold (#d4a843) — VU meter amber, warm, analog +- Secondary accent: soft ember (#c45d3e) — for emphasis, like the red light on a recording console +- Muted: (#1a1f2a) — navy-tinted dark +- Border: (#2a2f3a) — navy-tinted border + +This is COMPLETELY different from the electric yellow/neon tech look we have now. It's also different from every reference in the swipe file. 
It would be OURS. diff --git a/.local/ux-audit/flywheel-notes.md b/.local/ux-audit/flywheel-notes.md new file mode 100644 index 0000000..68b4346 --- /dev/null +++ b/.local/ux-audit/flywheel-notes.md @@ -0,0 +1,91 @@ +# Flywheel Notes + +## Loop 1 — Full scroll audit + +**What's working:** +- Hero section: dark bg + grid texture + mixed-type headline + AgentChat mockup = strong first impression +- "22" proof stat with yellow glow is bold and clear +- Module cards (INGEST/CREATE/DEPLOY) with hover-inversion are clean infrastructure pattern +- Terminal with refined colors (white text, yellow SUCCESS) looks professional, not hacky +- SystemDiagram (CONNECT → PROCESS → DEPLOY) with yellow arrows and numbers is excellent +- Blog post card with yellow border works on dark theme +- Subscribe form with yellow button matches +- Closing CTA with npm install + mixed-type headline + yellow button is strong +- Footer reads well + +**Issues found:** +1. Hero: AgentChat mockup renders in viewport but full-page screenshots show it as "PRODUCT SCREENSHOT" text — CSS animations may not trigger in headless. Not a real user issue. +2. Sections between hero and blog appear invisible in full-page screenshot due to similar darkness. When scrolled to, they're fine. This is a "zoomed out" visual density issue, not a contrast issue. +3. Too much spacing between sections — large gaps of pure black create an endless-scroll feeling +4. The visualizer placeholders (3 tall thin rectangles in CREATE module) look empty/unfinished +5. The "BUILT FOR" segments section needs visual separation from the diagram above it — both are borderless and blend together +6. No hover states visible on module cards in screenshots (they work when interacted with) +7. Subscribe section has too much vertical padding — push email form closer to the CTA below +8. 
Footer social links still text abbreviations (𝕏 IG in YT) + +**Priority fixes for Loop 2:** +- [ ] Reduce section spacing globally — tighter page flow +- [ ] Add subtle visual content to the CREATE module visualizer boxes (small SVG waveforms or patterns) +- [ ] Add border/separator between diagram and segments sections +- [ ] Merge subscribe + closing CTA into one section (less scrolling) +- [ ] Remove "Learn" from nav (empty sub-pages) + +## Loop 2 — Spacing and density fixes +- Reduced section-spacing globally (7rem max → 5rem max) +- Added visual content to CREATE module visualizer boxes (SVG-like bar chart patterns with yellow bars) +- Added border separator between diagram and segments sections +- Changed "Learn" nav to "Blog" (direct link, no dropdown, empty sub-pages hidden) +- Subscribe + CTA already merged (no change needed) + +## Loops 3-6 — Detail polish +- Hero section: added border-b for visual separation before proof strip +- "22" number: added text glow via drop-shadow for depth +- Module cards: min-h-[320px] for equal heights + rounded-lg corners +- Terminal container: rounded-lg + overflow-hidden for consistent chrome +- Segments: added yellow left border accent (border-l-2 border-[var(--brand)]/30 pl-4) +- NavDropdown: renders as plain Link when items array is empty (Blog link, no dropdown) +- nav.ts: typed with explicit NavItem/NavSection types to handle empty items arrays +- Fixed mobile nav order and labels + +## Loops 7-8 — Full scroll verification +- AgentChat mockup rendering with animated messages (Gatsby Grace, 66% progress bar) +- "22" proof stat with yellow glow is the strongest visual element +- Module cards (INGEST/CREATE/DEPLOY) look professional with bars in CREATE +- Terminal with refined colors (white text, yellow SUCCESS) is polished +- SystemDiagram (CONNECT → PROCESS → DEPLOY) with yellow arrows is clean +- Segments have yellow left border accent — subtle but present +- Blog card with yellow border works on dark theme +- 
Subscribe form with yellow button matches +- Closing CTA with npm install + mixed-type headline + yellow button is strong +- Footer reads well +- Nav simplified: Blog instead of Learn, no empty dropdowns +- Contrast improved with brighter --muted-foreground (#a1a1aa) and --border (#3f3f46) + +## Loop 9 — Remaining observations +- Page feels complete and cohesive +- The dark-native aesthetic is working well across all sections +- The yellow accent is used sparingly enough to feel intentional +- The mixed typography in hero and closing is distinctive +- Product mockup (AgentChat) gives immediate product context +- Terminal animation makes the product feel alive +- Infrastructure coding pattern (INGEST/CREATE/DEPLOY) tells a clear story +- Copy is narrative-driven: one story, not 11 random sections +- No more teal anywhere +- Subscribe + npm install + CTA is a strong closing sequence + +## Loop 10 — Final state +The homepage has gone through 10 iterations: +1. Original: teal text-only Google Doc +2. Added AI-generated images (rejected by user) +3. Added strategy-driven copy +4. Switched to dark-native with yellow accent +5. Added AgentChat mockup, Terminal, SystemDiagram components +6. Fixed contrast, spacing, visualizer bars +7. Polished details: borders, glow, card heights, nav +8. Verified full scroll render +9. Confirmed cohesive design language +10. Final notes written + +The site is ready for real product screenshots when they're available. +The placeholders (AgentChat mockup, visualizer bars, SystemDiagram) serve as +effective stand-ins that look intentional rather than broken. diff --git a/.local/ux-audit/improvements-prioritized.md b/.local/ux-audit/improvements-prioritized.md new file mode 100644 index 0000000..a6e14a0 --- /dev/null +++ b/.local/ux-audit/improvements-prioritized.md @@ -0,0 +1,143 @@ +# Improvements — Prioritized + +> Organized by impact × effort. P0 = do this week. P1 = this month. P2 = next month. 
+ +--- + +## P0 — Critical (Do This Week) + +### 1. Add a Product Screenshot or Video to the Hero + +**Impact:** Highest. The #1 reason visitors bounce is they can't visualize the product. +**Effort:** Low-Medium. Take a screenshot of the chat interface. Record a 30-second Loom of the agent creating content. +**Where:** Homepage hero, right side or below the subheader. +**How:** Even a static PNG mockup of the chat UI with an agent response is better than nothing. + +### 2. Add Pricing Page (Even Minimal) + +**Impact:** High. Visitors with budget questions leave immediately when there's no pricing. +**Effort:** Low. A simple page with 2-3 tiers: +- Free: X credits/month, 1 artist +- Pro: $20/month, more credits, more artists +- Enterprise: "Talk to us" — custom pricing +**Where:** New page at `/pricing`, add to nav. + +### 3. Change CTA from "Get started" to "Start free" + +**Impact:** High. Reduces friction and answers the implicit question "does this cost money?" +**Effort:** Trivial. One line change in `home.ts`. + +### 4. Remove Empty Learn Pages from Navigation + +**Impact:** Medium. Empty pages destroy credibility. +**Effort:** Trivial. Remove Use Cases, Playbooks, Demos from nav until content exists. +**Alternative:** Collapse Learn into just Blog in the nav. + +### 5. Fix Human/Machine Toggle Visibility + +**Impact:** Medium. It's confusing for 99% of visitors. +**Effort:** Low. Hide it by default. Add a small "View as markdown" link in the footer for those who want it. + +--- + +## P1 — Important (This Month) + +### 6. 
Add Real Visuals to Every Page + +- **Homepage:** Product screenshot in hero, example content outputs (show a generated video thumbnail, a social post) +- **Platform:** Diagram showing how data flows (connect → agents → output) +- **Solutions:** Show example output per persona (artist social post, label dashboard view) +- **Developers:** Code snippet (actual curl/npm example), architecture diagram +- **Records:** Show Gatsby Grace content — screenshots of the 22 videos, link to socials + +### 7. Create One Real Case Study + +Even anonymized: "A mid-size label with 15 artists used Recoupable to generate 22 album visualizers in 2 hours. Before: Stephanie spent 10 hours/week on content. After: 2 hours." + +Put it at `/company/case-studies` or on the Solutions page. + +### 8. Add Customer Logos (Real Images) + +Replace the text-only "Trusted by" section with actual logo images. Even small, gray, opacity-reduced logos are standard social proof. + +### 9. Add a "Book a Demo" CTA for Enterprise + +The Solutions page should have a "Talk to us" or "Book a demo" button for labels/distributors/catalog owners. This separates the enterprise funnel from the self-serve funnel. + +### 10. Consolidate Homepage Sections + +Current: 11 sections. Target: 7-8. +- Merge "Pain" into the hero (make it a subheader list) +- Cut "Why not just use ChatGPT?" (raises doubt) +- Merge "Logos" into "Proof" section +- Keep: Hero → Stats → Outcomes → Use Cases → How It Works → Proof + Logos → Blog → Subscribe + Closing + +### 11. Add Page-Level Visual Variety + +- Alternate between full-width and contained sections +- Use the brand color (#345A5D) as a background for one section per page +- Add subtle borders, shadows, or background textures to break up the text walls +- Icons next to feature headings (even simple SVG icons) + +### 12. 
Write 3-5 More Blog Posts + +Target keywords: +- "AI music marketing tools 2026" +- "How to automate music content" +- "Label operations software" +- "AI content creation for artists" +- "Recoupable vs CoManager" (comparison) + +--- + +## P2 — Nice to Have (Next Month) + +### 13. Add Scroll Animations + +Fade-in sections on scroll. Staggered list item entrance. Number counter animation for stats. These are cosmetic but they make the difference between "this feels alive" and "this feels static." + +### 14. Build an Interactive Product Tour + +A "See how it works" section that walks through the 3 steps with animated mockups or an embedded product sandbox. Even fake/static mockups that switch on click. + +### 15. Add a "Who is this for?" Quiz + +A simple 3-question quiz: "Are you an artist, label, or developer?" → routes to the right Solutions section or a personalized CTA. + +### 16. Developer-Specific Improvements + +- npm install snippet with copy button +- curl example for the API +- Architecture diagram +- GitHub link (if public) +- "Built with Recoupable" showcase section + +### 17. Footer Social Icons + +Replace "𝕏 IG in YT" text with real SVG icons. This is a small detail but it's the kind of thing that makes a site feel finished vs. in-progress. + +### 18. Mobile Hamburger Menu + +Replace the horizontal scroll nav with a proper hamburger menu for mobile. More scalable and standard. + +### 19. SEO Comparison Pages + +- "Recoupable vs CoManager" +- "Recoupable vs ChatGPT for music" +- "Recoupable vs hiring a social media manager" + +### 20. Newsletter Value Prop + +"Stay in the loop" is generic. Change to something specific: "Get the weekly rundown: one AI music insight, one agent workflow idea, one industry number. Every Tuesday." 
+ +--- + +## Design Debt (Technical Cleanup) + +- [ ] Tailwind v4 shorthand warnings (103 instances of `text-[var(--foreground)]` → `text-(--foreground)`) +- [ ] CTA button style inconsistency (rounded-full vs rounded-md) +- [ ] Nav dropdown behavior inconsistency on desktop +- [ ] Human/Machine toggle overlapping content on mobile +- [ ] Empty Learn sub-pages in sitemap (bad for SEO) +- [ ] Missing `alt` attributes for any future images +- [ ] No `` on placeholder pages diff --git a/.local/ux-audit/iteration-log.md b/.local/ux-audit/iteration-log.md new file mode 100644 index 0000000..e422651 --- /dev/null +++ b/.local/ux-audit/iteration-log.md @@ -0,0 +1,35 @@ +# Iteration Log — Mind-Blowing Rebuild + +## Iteration 1 — Major Component Integration + +**New components added:** +- `StatusBar.tsx` — System status indicators (SYS.STATUS: ONLINE, version, MCP tools) +- `VisionOverlay.tsx` — Signature B&W bounding box + terminal readout (the WOW element) +- `Marquee.tsx` — Scrolling ticker with infrastructure keywords +- `FigureLabel.tsx` — Linear-style "FIG 0.1" section labels + +**Homepage redesign:** +11 sections, each with figure labels, alternating backgrounds, fade-in animations: +STATUS → HERO → MARQUEE → PROOF → VISION → MODULES → TERMINAL → DIAGRAM → SEGMENTS → BLOG → CTA + +**What's working:** +- StatusBar creates immediate "this is a real system" credibility +- The VisionOverlay with bounding box + terminal readout is the most distinctive element on the page — nothing else in the music industry looks like this +- Marquee adds energy and movement between sections +- Figure labels (FIG 0.1 - 0.6) treat the page like a technical document — sophisticated +- The overall narrative flows clearly: system status → headline → proof → AI vision → infrastructure → live demo → how it works → who it's for → CTA +- Electric yellow accent is used consistently: CTA buttons, tags, numbers, status indicators, bounding box + +**Issues to fix in next iteration:** +1. 
VisionOverlay needs a real photo — the "ARTIST PROFILE" placeholder text is the weakest visual on the page. When a real B&W artist photo is dropped in, this section will be extraordinary. +2. The sections between VisionOverlay and Terminal could use tighter spacing +3. Blog section only has one post — not an issue with design but with content +4. Footer social icons are still text abbreviations +5. Mobile responsiveness hasn't been tested +6. Other pages (platform, solutions, developers, company) haven't been updated to match the new design level + +## Next priorities: +- [ ] Test mobile layout +- [ ] Update other pages to match homepage quality +- [ ] Add more subtle details: noise textures, gradient overlays, micro-interactions +- [ ] Refine the VisionOverlay — add scan-line animations, more terminal data diff --git a/.local/ux-audit/loop1-critique.md b/.local/ux-audit/loop1-critique.md new file mode 100644 index 0000000..ee55825 --- /dev/null +++ b/.local/ux-audit/loop1-critique.md @@ -0,0 +1,51 @@ +# Loop 1 Critique + +## What's Working +- Dark hero with mixed typography is bold and distinctive +- Electric yellow accent is striking — no more generic teal +- Product screenshot in the hero gives immediate product context +- "22 videos in one session" proof strip is bold and undeniable +- Three infrastructure modules (INGEST/CREATE/DEPLOY) tell a clear story +- Terminal animation gives a sense of "this is a real system" +- Copy is dramatically cleaner — one narrative, not 11 random sections +- npm install command makes it feel like real developer infrastructure + +## What Needs Fixing + +### 1. Light mode is the default — should be dark-native +The screenshot shows a light/warm background, not the dark cinematic look we wanted. The hero section is dark but the rest of the page is light. The ENTIRE page should default to dark. The CSS :root should default to dark values. + +### 2. Hero mixed typography needs polish +The "YOUR LABEL. RUN BY AGENTS." 
text mixes italic and colored words but it's not quite the brutalist mixed-font effect. Need the pixel font letters interspersed with the sans-serif — like the brutalist design had (R[pixel E]C[pixel O]U). + +### 3. Modules grid is messy on the screenshot +The three module cards don't align cleanly. The Content Engine card with the image is taller than the others. Need consistent heights and the image should be constrained. + +### 4. Terminal section could be stronger +The split layout with "One system. Always running." on the left is good but the terminal on the right is small. The terminal should be more prominent — it's one of the most compelling visual elements. + +### 5. Segments section is too plain +"Artists / Labels / Developers" section is just text. Needs at least subtle visual differentiation — borders, icons, or the hover-inversion effect from the brutalist modules. + +### 6. Footer still says teal brand description +The footer component hasn't been updated. It still references the old copy. + +### 7. The "Human / Machine" toggle is still showing +Need to hide it or make it much more subtle. + +### 8. Subscribe section styling +The subscribe form is on a light background. Should match the dark theme. The form input/button styling likely still uses old colors. + +### 9. Blog post card +The blog post card's styling may clash with the new dark theme. + +### 10. Other pages are broken +Platform, Solutions, Developers, Company pages still have the old design with teal. They'll look completely different from the homepage now. Need to update at minimum the color system so they don't clash. + +## Priority for Loop 2 +1. Fix CSS so entire page defaults to dark +2. Polish the hero typography +3. Fix module grid alignment +4. Update footer +5. Hide Human/Machine toggle +6. 
Update Header for dark theme diff --git a/.local/ux-audit/loop3-critique.md b/.local/ux-audit/loop3-critique.md new file mode 100644 index 0000000..9773c3b --- /dev/null +++ b/.local/ux-audit/loop3-critique.md @@ -0,0 +1,42 @@ +# Loop 3 Critique + +## Major Improvement +The dark-native design is working. Night-and-day difference from the original teal text site. The yellow accent pops. The hero is bold. The narrative flows: headline → proof → modules → terminal → segments → CTA. + +## Remaining Issues + +### 1. Header logo is using darkmode SVG on dark bg — invisible? +Need to check if the wordmark is visible. The header uses the dark mode SVG when theme is dark — this should be the light/white version for contrast on the dark background. + +### 2. Blog section card styling +The blog post card was designed for light mode. On the dark background, the card border/text colors may not work. Need to verify. + +### 3. Subscribe form styling +The subscribe input and button need to match the dark theme. Currently may have light-mode default styles. + +### 4. Footer colors +Footer text and links may not have enough contrast on dark background. + +### 5. Other pages still broken +Platform, solutions, developers, company pages haven't been updated to the new design language. They'll clash with the homepage when navigated to. + +### 6. The proof strip number +"22" looks great but the text below it could be larger / more readable. + +### 7. Content Engine module image +The content.png has that yellow "MUBI.K" circle on it which looks like gibberish text. Not ideal but acceptable as a placeholder. 
+ +## What's Working Well +- Dark background creates premium feel +- Electric yellow CTA buttons are highly clickable +- Grid background texture is subtle and effective +- Mixed typography in hero is distinctive +- Terminal animation gives "this is real" vibe +- Narrative is clear: one story, not 11 sections +- Footer is clean + +## Next Steps +- Fix header logo contrast +- Update subscribe form for dark theme +- Fix blog card for dark theme +- Update other pages to match (at minimum, they shouldn't clash visually) diff --git a/.local/ux-audit/page-by-page-notes.md b/.local/ux-audit/page-by-page-notes.md new file mode 100644 index 0000000..5a407ff --- /dev/null +++ b/.local/ux-audit/page-by-page-notes.md @@ -0,0 +1,173 @@ +# Page-by-Page Notes + +> Raw notes for each page. What works, what doesn't, what's missing. + +--- + +## Homepage (`/`) + +**First impression (3 seconds):** "Your label. Run by agents." — I think this is a talent agency or management company. The word "agents" is ambiguous. It takes reading the subheader to understand this is AI. The lack of any visual makes it feel like a concept site, not a product. + +**Above the fold:** +- ✅ Headline is bold and memorable +- ✅ Dual CTA (primary + secondary link) is a good pattern +- ❌ No visual — hero is 100% text +- ❌ "agents" could mean human agents to a music person +- ❌ Subheader tries to do too much — "strategy, content, fans, revenue" is four separate things + +**Stats section:** +- ✅ "22 videos in one session" is compelling +- ❌ "120k+ tracks hit DSPs daily" is an industry stat, not about Recoupable — confusing placement +- ❌ "$15-25k per hire" only resonates with label operators, not artists +- ❌ No visual context for these numbers + +**Pain section:** +- ✅ "You got into music. You're stuck in ops." 
is the best line on the site +- ✅ Two-column grid works well +- ❌ Fifth item makes the grid uneven (5 items in 2 cols = orphan) + +**Outcomes section:** +- ✅ "Content that runs" is the strongest outcome — correctly emphasized with accent border +- ❌ "Agents work it. You see the picture." (Catalog) is too vague +- ❌ These descriptions are still abstract — no specifics about what the agent actually produces + +**Differentiators section:** +- ⚠️ "Why not just use ChatGPT?" — risky framing, see audit findings +- ❌ The us-vs-them cards are all text, no visual comparison + +**Use cases section:** +- ✅ Three clear personas +- ❌ No visual differentiation — just three text blocks +- ❌ Artist description buries the lead — "The agent reads your brand docs" should be first + +**How it works:** +- ✅ Three-step numbered flow is clean and effective +- ✅ Numbered circles with black background are visually strong +- ❌ Steps are so high-level they could describe any SaaS product + +**Proof section:** +- ✅ The quote is good — specific, vivid +- ❌ Founder quoting himself weakens social proof +- ❌ No customer testimonial + +**Logos section:** +- ❌ Text-only names look like placeholders +- ❌ No context — "Rostrum Records" means nothing to most visitors + +**Blog section:** +- ❌ One post makes this section feel empty +- ✅ Card design is clean + +**Subscribe section:** +- ✅ Form works, clean design +- ❌ "Music ops. Agent infrastructure." is inside-baseball language + +**Closing section:** +- ✅ Strong closing line +- ❌ Third "Get started" CTA on the page — CTA fatigue + +--- + +## Platform (`/platform`) + +**Overall:** A feature list with no visuals. Reads like product documentation, not a marketing page. + +- ✅ "Deep Artist Context" section explains the moat well +- ✅ "Content Pipeline" section is specific (22 videos, zero editing) +- ✅ "What this is not" section is good differentiation +- ❌ Every section is identical: `

` + `

`. No visual variety. +- ❌ No screenshots of the platform +- ❌ No diagram showing how pieces connect +- ❌ 7 feature sections + 1 "not" section = too much scrolling of identical format +- ❌ Single "Get started" CTA at the bottom — no intermediate CTAs + +**Key question a visitor would ask:** "What does this actually look like when I use it?" + +--- + +## Solutions (`/solutions`) + +**Overall:** Best page on the site conceptually. Per-persona cards with pain → objection → answer is a smart pattern. + +- ✅ Four distinct personas with specific copy +- ✅ Objection handling ("Will it sound like me?") is excellent +- ✅ Answers are specific and concrete +- ❌ All four cards look identical — no icons, no persona imagery +- ❌ No "Book a demo" CTA for enterprise personas +- ❌ Pain text in italic blockquote is subtle — might be skipped +- ❌ The gray background objection box blends into the page + +--- + +## Developers (`/developers`) + +**Overall:** Lists capabilities but doesn't show any code. Developers want to see code before they read marketing copy. + +- ✅ Good section coverage (API, CLI, MCP, Skills, Multi-Model, Sandboxes, Docs) +- ✅ "Not a wrapper around ChatGPT" is the right opening +- ❌ Zero code examples. No `npm install`, no `curl`, no JSON response sample +- ❌ No architecture diagram +- ❌ No mention of rate limits, pricing, or free tier +- ❌ "View docs" as primary CTA sends people away from the site +- ❌ `recoup` CLI mentioned but not linked or shown + +--- + +## Vision (`/company/vision`) + +**Overall:** Five paragraphs of philosophy. Reads like an internal memo, not a public-facing vision page. + +- ✅ "Imagine if a major record label was run by code" is a strong anchor +- ✅ The code-to-music analogy is interesting +- ❌ Pure text wall — no timeline, no roadmap visual, no team photo +- ❌ Feels like reading someone's manifesto, not a company's vision +- ❌ No call to action at the end + +--- + +## About (`/company/about`) + +**Overall:** Feels incomplete. 
A founder section with no photo and a one-line bio. + +- ✅ Founder bio is authentic and personal +- ❌ No founder photo — misses the human connection +- ❌ No team information (is this a 2-person company? 20-person?) +- ❌ No funding/stage info (early-stage companies should be transparent) +- ❌ Mission statement is a run-on sentence +- ❌ No press mentions, awards, or milestones + +--- + +## Recoupable Records (`/company/recoupable-records`) + +**Overall:** The best story on the site — but told entirely in text with no visual proof. + +- ✅ "We don't just build tools for labels. We are a label." — strongest positioning on the site +- ✅ Gatsby Grace proof point (22 videos, A&R couldn't tell) +- ✅ The branded accent card for the proof point looks good +- ❌ I can't see Gatsby Grace. No link to their music, socials, or any of the 22 videos. +- ❌ No images of any kind +- ❌ The "5-10 human artists + unlimited AI artists" is interesting but feels like a fundraising pitch + +--- + +## Blog (`/blog`) + +**Overall:** One post. The page layout is fine — the problem is content volume. + +- ✅ Post card design is clean +- ✅ Tags work +- ❌ One post makes the blog look abandoned +- ❌ Blog description ("Insights on AI-powered music marketing...") promises a lot for one post +- ❌ No sidebar, no categories, no "popular posts" — these are fine for now but will be needed + +--- + +## Learn (`/learn` + subpages) + +**Overall:** Empty placeholder pages. Remove from nav. + +- ❌ Use Cases → "Check back soon" + points to Blog +- ❌ Playbooks → "Check back soon" + points to Blog +- ❌ Demos → "Try the product" link +- ❌ These pages actively hurt credibility by showing there's nothing here diff --git a/.local/ux-audit/rebuild-direction.md b/.local/ux-audit/rebuild-direction.md new file mode 100644 index 0000000..b05d9b8 --- /dev/null +++ b/.local/ux-audit/rebuild-direction.md @@ -0,0 +1,81 @@ +# Rebuild Direction — Blended Design + +> This doc captures the design direction before we build.
The critique comes after. + +## What's Wrong Now (User Feedback) + +- Copy is confusing and random — bunch of random phrases +- No clear narrative — nothing ties together +- Nothing clearly explains what the product does +- Design is overly confusing +- The green (teal #345A5D) is hated +- Too many sections, too much text, not enough showing + +## Design Direction: Blend of All Three + +Take the best from each branch variant: + +### From Brutalist +- Mixed typography (pixel font + sans) for visual tension +- Infrastructure-coded modules (INGEST/CREATE/DEPLOY pattern) +- Terminal component with live log animation +- Marquee ticker for energy +- Grid background texture +- Hover-inversion on cards +- Crosshair cursor aesthetic + +### From Editorial +- Framed viewport with thick borders +- Split-panel layout ("The Culture" vs "The Stack") +- Data tables showing real DSP data +- JSON code block showing agent config +- Magazine-meets-terminal aesthetic + +### From Cinematic +- Full black background — dark mode native +- Massive condensed white headlines +- Neon accent color (NOT teal — use electric yellow #c8ff00 or warm white) +- Full-bleed photography with AI bounding boxes +- Terminal analysis readout over images +- "We don't guess hits. We compute them." — confrontational, clear copy + +## Color Decision + +Kill the teal. The new palette: +- Background: black (#000) or near-black (#0a0a0a) +- Foreground: white (#fff) or warm white (#f5f5f0) +- Accent: electric yellow-green (#c8ff00) — used sparingly for status indicators, labels, CTAs +- Muted: gray (#666, #999) +- No teal. No green. No #345A5D. + +## Copy Direction + +The narrative must be ONE clear story, not 11 disjointed sections: + +1. **What we are** (1 sentence): "AI-native label infrastructure." +2. **What that means** (1 sentence): "You create the music. Agents run strategy, content, fans, revenue." +3. **Proof** (1 number): "22 videos. One session. Zero editing." +4. 
**Three capabilities** (brief): Intel → Content → Deploy +5. **Who it's for** (brief): Artists / Labels / Developers +6. **How to start** (brief): Get an API key or sign up +7. **Closing** (1 line): "Your label. Run by agents." + +That's it. No "Why not just use ChatGPT?" section. No "Hustle by default" philosophy. No differentiator grid. Just: here's what it is, here's proof, here's how to use it. + +## Page Strategy + +- **Homepage**: The cinematic hero + brutalist modules + terminal. Dark. Bold. Clear. +- **Platform**: NOT needed as separate page yet — fold into homepage +- **Solutions**: Simplify to 3 cards on homepage, not a separate page +- **Developers**: Keep as separate page — this audience needs depth +- **Company pages**: Keep but simplify +- **Learn sub-pages**: Remove from nav (empty) + +## What to Keep From Current Site + +- The copy file architecture (lib/copy/*.ts → pages) +- The machine markdown view (useful, just hide the toggle) +- The blog system +- The subscribe form +- The footer structure +- Next.js Image optimization diff --git a/.local/ux-audit/screenshots/about-desktop-full.png b/.local/ux-audit/screenshots/about-desktop-full.png new file mode 100644 index 0000000..55f97d0 Binary files /dev/null and b/.local/ux-audit/screenshots/about-desktop-full.png differ diff --git a/.local/ux-audit/screenshots/blog-desktop-full.png b/.local/ux-audit/screenshots/blog-desktop-full.png new file mode 100644 index 0000000..55f97d0 Binary files /dev/null and b/.local/ux-audit/screenshots/blog-desktop-full.png differ diff --git a/.local/ux-audit/screenshots/developers-desktop-full.png b/.local/ux-audit/screenshots/developers-desktop-full.png new file mode 100644 index 0000000..e1110e8 Binary files /dev/null and b/.local/ux-audit/screenshots/developers-desktop-full.png differ diff --git a/.local/ux-audit/screenshots/home-desktop-full.png b/.local/ux-audit/screenshots/home-desktop-full.png new file mode 100644 index 0000000..39c48c7 Binary files /dev/null and 
b/.local/ux-audit/screenshots/home-desktop-full.png differ diff --git a/.local/ux-audit/screenshots/home-mobile-full.png b/.local/ux-audit/screenshots/home-mobile-full.png new file mode 100644 index 0000000..130d898 Binary files /dev/null and b/.local/ux-audit/screenshots/home-mobile-full.png differ diff --git a/.local/ux-audit/screenshots/platform-desktop-full.png b/.local/ux-audit/screenshots/platform-desktop-full.png new file mode 100644 index 0000000..c66dbd1 Binary files /dev/null and b/.local/ux-audit/screenshots/platform-desktop-full.png differ diff --git a/.local/ux-audit/screenshots/records-desktop-full.png b/.local/ux-audit/screenshots/records-desktop-full.png new file mode 100644 index 0000000..55f97d0 Binary files /dev/null and b/.local/ux-audit/screenshots/records-desktop-full.png differ diff --git a/.local/ux-audit/screenshots/solutions-desktop-full.png b/.local/ux-audit/screenshots/solutions-desktop-full.png new file mode 100644 index 0000000..e1110e8 Binary files /dev/null and b/.local/ux-audit/screenshots/solutions-desktop-full.png differ diff --git a/.local/ux-audit/screenshots/vision-desktop-full.png b/.local/ux-audit/screenshots/vision-desktop-full.png new file mode 100644 index 0000000..55f97d0 Binary files /dev/null and b/.local/ux-audit/screenshots/vision-desktop-full.png differ diff --git a/.local/ux-audit/teardown-notes.md b/.local/ux-audit/teardown-notes.md new file mode 100644 index 0000000..fcc8c5d --- /dev/null +++ b/.local/ux-audit/teardown-notes.md @@ -0,0 +1,43 @@ +# Teardown Rebuild Notes + +## What Just Happened + +Went from 11 mediocre sections to 5 powerful ones. The headline now fills the entire viewport. The word "MODERN" in a yellow box is the exact technique from the swipe file (mood-9). The IngestFeed replaced the fake chat UI with something that looks like a real system running. The copy went from corporate to confrontational. + +## What's Now GOOD + +1. **The headline is MASSIVE.** "THE LOGIC LAYER FOR MODERN POP CULTURE." 
— this is a statement, not a heading. The type is uncomfortably big. It demands attention. The mixed pixel-font for the yellow accented word creates real typographic tension. + +2. **System metadata in corners.** "SYS.STATUS: ONLINE" top-left, version number top-right — directly from mood-9 swipe file. No separate StatusBar component needed. Just text in the right place. + +3. **Single CTA.** "INITIALIZE WORKSPACE >" — system-native language, not "Get started." Yellow on black. One button, one action. + +4. **The IngestFeed.** LIVE INGEST FEED with [REC] blinking red, timestamped agent activity, highlighted values in yellow. This looks like a real system monitoring panel. Way better than the fake chat UI. + +5. **"22" proof + founder quote.** Split layout with the massive number on the left and the live feed on the right creates a compelling "proof + system" duality. + +6. **The page is SHORT.** Five sections. Hero, marquee, modules, proof, CTA. Each one has room to breathe. + +7. **"Initialize your workspace."** — perfect closing CTA headline. npm install. Deploy agent. Subscribe. Done. + +## What Still Needs Work + +1. **The pixel font for "MODERN"** — need to verify Silkscreen actually loaded (Google Fonts import). If it didn't, it falls back to system fonts and the effect is lost. + +2. **The modules section** still feels like it could be more visually distinctive. The borders are thin. The hover effect is good but the default state is a bit flat. Consider: thicker top borders in yellow on each card? Or a different background treatment? + +3. **The proof section** — the "22" and the quote are on the left, IngestFeed on the right. This works but the quote text might be too small. And the "WE RUN OUR OWN LABEL ON THIS SYSTEM." line should hit harder. + +4. **Title tag** still says "Recoupable — Your label. Run by agents." — should update to match the new headline. + +5. **Other pages** (platform, solutions, developers, company) are all still the old design.
They'll feel completely different when navigated to. At minimum, they need the new color system and typography. + +6. **Mobile** — the massive headline on a 390px screen will need careful responsive handling. It should still be big but probably 2 lines instead of 5. + +## Next Iterations + +- Verify pixel font is loading +- Test mobile responsiveness +- Update metadata/title tag +- Consider: should we add the vision overlay (mood-8) back as a separate standalone section? It was the most distinctive visual in the swipe files but needs a real photo to work. +- The Marquee keywords could be updated — "CONTENT ENGINE" is already in the modules, feels redundant. Consider: real stats or status messages instead of capability keywords? diff --git a/.local/ux-audit/user-stories.md b/.local/ux-audit/user-stories.md new file mode 100644 index 0000000..d4733e6 --- /dev/null +++ b/.local/ux-audit/user-stories.md @@ -0,0 +1,135 @@ +# User Stories — Website Visitor Journeys + +> Who comes to this site, what are they trying to do, and does the site help them do it? + +--- + +## Story 1: Indie Artist — "Should I use this?" + +**Who:** 24-year-old independent artist. Releases on DistroKid. Has 2,000 Instagram followers. Spends 8 hours/week on content. Found Recoupable via an Instagram ad or X post. + +**What they want to know:** +1. What does this actually do for me? +2. What does the product look like? +3. How much does it cost? +4. Can I try it before I pay? +5. Will the output actually be good enough to post? + +**What the site tells them:** +1. ❌ Abstract — "agents run strategy, content, fans, revenue" is vague +2. ❌ Nothing — zero screenshots or demos +3. ❌ Nothing — no pricing page +4. ❓ Unclear — "Get started" might be free, might not +5. ❌ No proof — "22 videos" is a claim with no visual evidence + +**Verdict:** This artist bounces. They don't understand what "agents" means in this context, can't see the product, and don't know if they can afford it. They go back to Canva. 
+ +**What would convert them:** +- A 30-second video showing the agent creating a TikTok post from their music +- "Free to start" on the CTA +- A gallery of example content outputs +- A price (even "$20/month" would answer the question) + +--- + +## Story 2: Label Marketing Manager — "Can this help my team?" + +**Who:** 32-year-old marketing manager at a mid-size label (15 artists). Handles social for all of them. Boss asked her to look into AI tools. She has 20 minutes to evaluate this. + +**What they want to know:** +1. Can this make content for multiple artists with different brands? +2. How much work is setup vs. ongoing use? +3. What does my team need to learn? +4. How much does it cost for a team? +5. Who else is using this? + +**What the site tells them:** +1. ✅ "Context files per artist" answers this, but it's buried in the Solutions page objection section +2. ❌ "Connect your data" is the only setup mention — no details +3. ❌ Nothing about learning curve or onboarding +4. ❌ No pricing — mentions $5-10k but that might be enterprise only +5. ❓ "Trusted by" lists names but no case studies or details + +**Verdict:** She bookmarks it and tells her boss "it might be interesting but I can't tell what it actually costs or looks like." Low confidence to advocate internally. + +**What would convert her:** +- A "For Labels" landing page with a specific use case walkthrough +- Team pricing visible +- A case study or before/after showing content created for a real artist +- A demo video or sandbox she can explore without signing up + +--- + +## Story 3: Label CEO — "Is this worth $10k/month?" + +**Who:** 48-year-old CEO of a label doing $5M/year. Has 40 artists. Heard about Recoupable from a friend (Sid). Looking at the site to decide if it's worth a meeting. + +**What they want to know:** +1. Is this a real company or a side project? +2. What's the ROI — what do I get for $5-10k/month? +3. Who else at my level is using this? +4. What does the implementation look like? 
+5. Is there enterprise support? + +**What the site tells them:** +1. ❓ The site is professional but sparse — no team page, no funding info, no company size +2. ❌ ROI not quantified — "agents cost a fraction" is vague +3. ❌ Social proof is weak — text names, no logos, no quotes from peers +4. ❌ No implementation timeline or process described +5. ❌ No mention of enterprise support, SLAs, or dedicated onboarding + +**Verdict:** He'd take a meeting based on the personal relationship with Sid, but the website doesn't close the deal or even advance it. If he were evaluating cold (no personal connection), he'd pass. + +**What would convert him:** +- An "Enterprise" section with a "Talk to us" CTA +- Case study with ROI numbers (even anonymized) +- Enterprise features listed (private tenant, dedicated support, custom builds) +- Logos of companies at his level + +--- + +## Story 4: Developer — "Can I build on this?" + +**Who:** 28-year-old developer building a music analytics dashboard. Wants to add AI-powered content creation. Found Recoupable's API docs. + +**What they want to know:** +1. What can the API do? +2. How do I authenticate? +3. Is there a sandbox/free tier? +4. How reliable is it? +5. What are the latency and rate limits? + +**What the site tells them:** +1. ✅ Developer page lists API, CLI, MCP, Skills — but no code examples +2. ❌ "View docs" links out — the marketing site itself has no technical detail +3. ❌ No mention of free tier or developer pricing +4. ❌ No uptime stats or reliability info +5. ❌ No technical specs + +**Verdict:** They click "View docs" and evaluate based on the docs site, not the marketing site. The marketing site adds minimal value for developers.
+ +**What would convert them:** +- A code snippet on the page (curl example, npm install) +- "Free for developers" or a clear dev tier +- An architecture diagram +- A "Built with Recoupable" showcase + +--- + +## Story 5: Competitor Checking Us Out + +**Who:** Product manager at CoManager, Spincast, or a major-label innovation team. They search "AI music marketing" and find Recoupable. + +**What they're looking for:** +1. What does this company claim to do differently? +2. Do they have real customers? +3. What's their tech stack? +4. Are they a threat? + +**What the site tells them:** +1. ✅ Positioning is clear — "agents that execute, not recommend" +2. ❓ Text-only customer names — hard to verify +3. ❓ Multi-model, MCP, CLI mentioned but no depth +4. ❓ The site feels early-stage, which might make them dismiss it + +**Verdict:** They note the positioning but aren't threatened. The lack of visual proof, thin blog, and empty Learn section signals early stage. diff --git a/AGENTS.md b/AGENTS.md index 7ae4f3f..99c6806 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -46,6 +46,7 @@ apps/ops/ — Internal Next.js app for private marketing workflows content/posts/ — MDX blog posts (one file = one post) content/brand/ — Brand context files (read before creating content) content/seo/ — SEO strategy + keyword targets +content/email/ — Email drip sequence templates for trial onboarding (NOT code — plain text copy) content/STATUS.md — Current state snapshot (read FIRST every session) transcripts/ — Call transcripts (eng, customers, leads) for positioning/copy context swipe/ — Reference material (copy, designs, competitors, complaints, trends) @@ -107,6 +108,16 @@ content/seo/pillars.md — Topic clusters, target keywords content/posts/INDEX.md — Published posts + topic gaps ``` +## Email Drip Sequence + +- **Path:** `content/email/` +- **Purpose:** Plain-text email templates for the free trial onboarding drip campaign. 
These are **copy documents** — not code, not HTML templates, not blog posts about email. +- **Overview:** `content/email/sequence-overview.md` — the full 30-day sequence plan, timing, and principles +- **What they are:** Each file is one email in the trial onboarding sequence, with the template copy, personalization instructions, and design decision notes +- **What they are NOT:** These are not automated email templates (no HTML/MJML/React Email). They are not marketing content about email. They are not part of the website build. +- **When to read:** If working on trial user communication, onboarding emails, or drip campaign automation +- **Automation:** When ready to automate sending, build that logic in `workflows/` and reference these templates as the source of truth for copy + ## Adding a Blog Post 1. Read `content/STATUS.md` + `content/posts/INDEX.md` (check gaps) diff --git a/apps/web/app/company/about/page.tsx b/apps/web/app/company/about/page.tsx index c20b99a..8736497 100644 --- a/apps/web/app/company/about/page.tsx +++ b/apps/web/app/company/about/page.tsx @@ -9,49 +9,82 @@ export const metadata: Metadata = buildPageMetadata({ path: "/company/about", }); -/** - * Company: About — copy from lib/copy/company (single source for human + machine view). - */ export default function AboutPage() { const c = companyAboutCopy; return ( -

-
+
+ {/* Header */} +
← Company -

+

{c.title}

-

+

{c.description}

-
-

{c.body}

-

+ + {/* Body */} +

+ {c.body} +

+ + {/* Founder */} +
+
+
+ FOUNDER PHOTO +
+
+

+ {c.founder.name} +

+

+ {c.founder.role} +

+

+ {c.founder.bio} +

+
+
+
+ + {/* Mission */} +
+
+

+ Mission +

+

+ {c.mission} +

+
+
+ + {/* Contact */} +
+
{c.contactEmail} - {" "} - ·{" "} + {c.supportEmail} -

-

- {c.legal} -

-
+
+

{c.legal}

+
); } diff --git a/apps/web/app/company/recoupable-records/page.tsx b/apps/web/app/company/recoupable-records/page.tsx index c09e0ad..0eb0dc8 100644 --- a/apps/web/app/company/recoupable-records/page.tsx +++ b/apps/web/app/company/recoupable-records/page.tsx @@ -9,34 +9,85 @@ export const metadata: Metadata = buildPageMetadata({ path: "/company/recoupable-records", }); -/** - * Company: Recoupable Records — copy from lib/copy/company (single source for human + machine view). - */ export default function RecoupableRecordsPage() { const c = companyRecoupableRecordsCopy; return ( -
-
+
+ {/* Header */} +
← Company -

+

{c.title}

-

+

{c.subtitle}

-
-

{c.body}

-

+ + {/* Body */} +

+ {c.body} +

+ + {/* Proof — Gatsby Grace */} +
+
+
+ {/* Image placeholder */} +
+ GATSBY GRACE +
+ {/* Text side */} +
+

+ Case Study +

+

+ {c.proof.headline} +

+

+ {c.proof.description} +

+
+
+
+
+ + {/* Content output — 22 videos */} +
+
+
+ 22 VIDEOS — CONTENT OUTPUT +
+
+

+ 22 videos. One session. Zero manual editing. +

+
+
+
+ + {/* Vision */} +
+
+

+ {c.vision} +

+
+
+ + {/* Footer tagline */} +
+

{c.footer}

-
+
); } diff --git a/apps/web/app/designs/a/page.tsx b/apps/web/app/designs/a/page.tsx new file mode 100644 index 0000000..a16c292 --- /dev/null +++ b/apps/web/app/designs/a/page.tsx @@ -0,0 +1,236 @@ +import Link from "next/link"; + +export default function DesignA() { + return ( + <> + + +
+ {/* Nav */} + + + {/* Hero */} +
+ {/* Pill badge */} + + + Artist Intelligence + + + See how it works → + + + + {/* Headline */} +
+

+ Meet Your New AI +

+

+ Record Label +

+
+ + {/* Subheader */} +

+ Spend more time doing what you love. Let agents handle the rest. +

+ + {/* Chat input */} +
+ + +
+ + {/* Product screenshot placeholder */} +
+ PRODUCT SCREENSHOT +
+
+
+ + ); +} diff --git a/apps/web/app/designs/b/page.tsx b/apps/web/app/designs/b/page.tsx new file mode 100644 index 0000000..82424e4 --- /dev/null +++ b/apps/web/app/designs/b/page.tsx @@ -0,0 +1,206 @@ +import Link from "next/link"; + +export default function DesignB() { + return ( + <> + + +
+ {/* Geometric accent — warm coral circle, partially cropped right */} +
+ {/* Subtle inner ring */} +
+ + {/* Nav */} + + + {/* Hero content */} +
+ {/* Category label */} + + AI-NATIVE INFRASTRUCTURE + + + {/* Headline */} +

+ Your label. +
+ Run by agents. +

+ + {/* Subheader */} +

+ The AI-native record label platform. Agent infrastructure + that handles research, content, distribution, and strategy — + so you can focus on the music. +

+ + {/* CTA */} +
+ + START BUILDING + + + READ THE DOCS → + +
+
+ + {/* Bottom rule */} +
+ © 2026 RECOUPABLE + MUSIC × MACHINE INTELLIGENCE +
+
+ + ); +} diff --git a/apps/web/app/designs/c/page.tsx b/apps/web/app/designs/c/page.tsx new file mode 100644 index 0000000..ca47a73 --- /dev/null +++ b/apps/web/app/designs/c/page.tsx @@ -0,0 +1,265 @@ +import Link from "next/link"; + +export default function DesignC() { + return ( + <> + + +
+ {/* ── LEFT SIDE: Culture ── */} +
+ {/* Top: brand + status */} +
+ + RECOUPABLE + +
+ + + SYSTEM ACTIVE + +
+
+ + {/* Center: headline + subheader */} +
+

+ YOUR LABEL. +
+ RUN BY +
+ AGENTS. +

+ +

+ AI-native infrastructure for record labels. + Research, content, distribution, and strategy — + all handled by autonomous agents. +

+ + + + INITIALIZE SYSTEM + +
+ + {/* Bottom: version bar */} +
+ [VERSION 2.4.1] + DATA_INGEST: OK + LOC: GLOBAL +
+
+ + {/* ── RIGHT SIDE: System ── */} +
+ {/* Top: feed header */} +
+ + LIVE INGEST FEED + + + + REC + +
+ + {/* Center: log entries */} +
+
+ ────────────────────────────────────── +
+
+ 14:02:11 + {" "}SCAN:{" "} + Spotify_Viral_50 +
+
+ 14:02:14 + {" "}DETECTED:{" "} + ISRC_USRC12345678 +
+
+ 14:02:15 + {" "}ANALYZE:{" "}Acoustic_Features... +
+
+ 14:02:18 + {" "}SCORE:{" "} + 87.4{" "} + (HIGH_VELOCITY) +
+
+ 14:02:19 + {" "}ACTION:{" "} + Generate_Contract_Draft +
+
+ 14:02:22 + {" "}ENRICH:{" "}Social_Profiles_Linked +
+
+ 14:02:25 + {" "}STATUS:{" "} + PIPELINE_COMPLETE ✓ +
+
+ ────────────────────────────────────── +
+
+ + {/* Bottom: action button */} +
+ + SUBMIT AUDIO FILE [WAV] + +
+
+
+ + ); +} diff --git a/apps/web/app/designs/d/page.tsx b/apps/web/app/designs/d/page.tsx new file mode 100644 index 0000000..d7136f4 --- /dev/null +++ b/apps/web/app/designs/d/page.tsx @@ -0,0 +1,315 @@ +import Link from "next/link"; + +export default function DesignD() { + return ( + <> + + +
+ {/* Ambient glow spots */} +
+
+ + {/* Top bar */} +
+
+ + Recoupable OS + + | + Workspace 1 +
+ 0.9.4_beta +
+ + {/* Card container — positioned absolutely for layered depth */} +
+ {/* Card 1: LEFT / BEHIND — product nav sidebar */} +
+
+ KNOWLEDGE GRAPH +
+ {[ + { label: "Topic Clusters", active: true }, + { label: "AI Agents", active: false }, + { label: "Research Feed", active: false }, + { label: "Content Pipeline", active: false }, + { label: "Distribution", active: false }, + { label: "Analytics", active: false }, + ].map((item) => ( +
+ {item.label} +
+ ))} +
+ + {/* Card 2: CENTER / FRONT — hero message */} +
+ {/* Subtle top accent line */} +
+ +
+ AGENT WORKSPACE +
+ +

+ Connect your artist. +
+ Analyze the data. +
+ Deploy the system. +

+ +

+ Your AI-native record label — research, content, and + distribution agents working autonomously. +

+ + + Initialize Workspace → + +
+ + {/* Card 3: RIGHT / BEHIND — system operations */} +
+
+ SYSTEM OPERATIONS +
+ + {[ + { num: "01", label: "COLLECT", desc: "Ingest streams + social" }, + { num: "02", label: "PROCESS", desc: "Score and rank signals" }, + { num: "03", label: "GENERATE", desc: "Deploy content agents" }, + ].map((step) => ( +
+
+ + {step.num} + + + {step.label} + +
+

+ {step.desc} +

+
+ ))} +
+
+
+ + ); +} diff --git a/apps/web/app/designs/e/page.tsx b/apps/web/app/designs/e/page.tsx new file mode 100644 index 0000000..ac7899c --- /dev/null +++ b/apps/web/app/designs/e/page.tsx @@ -0,0 +1,239 @@ +import Link from "next/link"; + +export default function DesignE() { + return ( + <> + + +
+ {/* Radial amber glow from center */} +
+ + {/* System metadata — top corners */} +
+ + + SYS.ONLINE + +
+
+ + v2.1 + +
+ + {/* Hero content */} +
+ {/* Brand name */} + + Recoupable + + + {/* Headline: serif italic × sans bold */} +

+ + Your + {" "} + + label. + +
+ + Run by + {" "} + + agents. + +

+ + {/* Subheader — Sidney's story feel */} +

+ I built the label I wished existed when I was an artist — + one where agents handle research, content, and distribution + so you never have to choose between creating and running a business. +

+ + {/* CTAs */} +
+ + Start Free + + + Watch Demo + +
+
+ + {/* Bottom ambient line */} +
+
+ + ); +} diff --git a/apps/web/app/developers/page.tsx b/apps/web/app/developers/page.tsx index 84004fc..ac7d867 100644 --- a/apps/web/app/developers/page.tsx +++ b/apps/web/app/developers/page.tsx @@ -1,4 +1,5 @@ import type { Metadata } from "next"; +import Image from "next/image"; import Link from "next/link"; import { developersCopy } from "@/lib/copy/developers"; import { buildPageMetadata } from "@/lib/seo"; @@ -9,60 +10,110 @@ export const metadata: Metadata = buildPageMetadata({ path: "/developers", }); -/** - * Developers page — copy from lib/copy/developers (single source for human + machine view). - */ +const sectionIcons: Record = { + api: "{ }", + cli: ">_", + mcp: "⇄", + skills: "◈", + "multi-model": "◉", + sandboxes: "▢", + docs: "📖", +}; + export default function DevelopersPage() { const c = developersCopy; return ( -
-
-

+
+ {/* Hero */} +
+

+ Developer Platform +

+

{c.title}

-

+

{c.description}

-
- {c.sections.map((section) => ( -
-

- {section.title} -

-

- {section.description} -

- {"linkLabel" in section && section.linkLabel && section.linkHref && ( - - {section.linkLabel} → - - )} -
- ))} -
+ {/* Code snippet */} +
+
+
+ + + + + terminal + +
+
+            
+              ${" "}
+              npm install{" "}
+              -g{" "}
+              recoup
+              {"\n"}
+              ${" "}
+              recoup research{" "}
+              "Drake"{" "}
+              --platform{" "}
+              spotify
+            
+          
+
+
+ + {/* Sections grid */} +
+
+ {c.sections.map((section) => ( +
+
+ + {sectionIcons[section.id] ?? "•"} + +

+ {section.title} +

+
+

+ {section.description} +

+ {"linkLabel" in section && + section.linkLabel && + section.linkHref && ( + + {section.linkLabel} → + + )} +
+ ))} +
+
-
+ {/* CTA */} +
- {c.ctaLabel} + {c.ctaLabel} → -
+
   );
 }
diff --git a/apps/web/app/globals.css b/apps/web/app/globals.css
index 01f9085..9eccdee 100644
--- a/apps/web/app/globals.css
+++ b/apps/web/app/globals.css
@@ -1,41 +1,42 @@
+@import url('https://fonts.googleapis.com/css2?family=Silkscreen&display=swap');
+@import url('https://fonts.googleapis.com/css2?family=Instrument+Serif:ital@0;1&display=swap');
 @import "tailwindcss";
 @plugin "@tailwindcss/typography";
 
-/*
- * Recoupable marketing — global styles
- * Design: .impeccable.md (teal primary #345A5D, tinted neutrals, no pure B/W)
- */
-
 @layer base {
   :root {
-    /* Brand primary — AGENTS.md #345A5D as OKLCH (teal/slate) */
-    --brand: oklch(0.42 0.04 198);
-    --brand-hover: oklch(0.36 0.04 198);
-    --brand-muted: oklch(0.55 0.03 198);
-
-    /* Surfaces — tinted toward brand hue (198), no pure white/black */
-    --background: oklch(0.99 0.004 198);
-    --foreground: oklch(0.22 0.02 198);
-    --muted: oklch(0.97 0.006 198);
-    --muted-foreground: oklch(0.48 0.02 198);
-    --border: oklch(0.91 0.01 198);
-  }
-
-  /*
-   * Dark mode — override the same variables. Edit only this block to change dark theme.
-   * Applied when (set by ThemeProvider + inline script).
-   * Neutral black/gray (chroma 0 = no green/teal tint).
-   */
+    --brand: #d4a843;
+    --brand-hover: #c49a3a;
+    --brand-muted: #a8863a;
+    --background: #0c0f14;
+    --foreground: #f0ebe3;
+    --muted: #141820;
+    --muted-foreground: #8a8680;
+    --border: #252a35;
+    --font-pixel: 'Silkscreen', cursive;
+    --font-serif: 'Instrument Serif', Georgia, serif;
+  }
+
   [data-theme="dark"] {
-    --background: oklch(0.14 0 0);
-    --foreground: oklch(0.97 0 0);
-    --muted: oklch(0.2 0 0);
-    --muted-foreground: oklch(0.62 0 0);
-    --border: oklch(0.28 0 0);
-    /* Brand accent stays teal for contrast on dark */
-    --brand: oklch(0.55 0.05 198);
-    --brand-hover: oklch(0.62 0.05 198);
-    --brand-muted: oklch(0.45 0.04 198);
+    --brand: #d4a843;
+    --brand-hover: #c49a3a;
+    --brand-muted: #a8863a;
+    --background: #0c0f14;
+    --foreground: #f0ebe3;
+    --muted: #141820;
+    --muted-foreground: #8a8680;
+    --border: #252a35;
+  }
+
+  [data-theme="light"] {
+    --background: #f8f5f0;
+    --foreground: #1a1815;
+    --muted: #f0ece5;
+    --muted-foreground: #6b6560;
+    --border: #ddd8d0;
+    --brand: #b8922e;
+    --brand-hover: #a07d25;
+    --brand-muted: #c4a048;
   }
 
   body {
@@ -49,16 +50,22 @@
   }
 }
 
-/* Fluid type scale — clamp for readable hierarchy */
+/* ── Fluid type scale ──────────────────────────────────────── */
 @layer base {
   .text-display {
-    font-size: clamp(2.25rem, 5vw + 1.5rem, 3.5rem);
-    line-height: 1.1;
-    letter-spacing: -0.02em;
+    font-size: clamp(2.5rem, 6vw + 1.5rem, 5rem);
+    line-height: 0.95;
+    letter-spacing: -0.03em;
+  }
+
+  .text-hero {
+    font-size: clamp(3.5rem, 12vw, 10rem);
+    line-height: 0.9;
+    letter-spacing: -0.03em;
   }
 
   .text-lead {
-    font-size: clamp(1.125rem, 1.5vw + 0.75rem, 1.25rem);
+    font-size: clamp(1.125rem, 1.5vw + 0.75rem, 1.375rem);
     line-height: 1.5;
   }
 
@@ -68,23 +75,108 @@
   }
 }
 
-/* Spacing rhythm — fluid section padding */
+/* ── Editorial serif (warm, italic, musical) ──────────────── */
+@layer components {
+  .text-editorial {
+    font-family: var(--font-serif);
+    font-style: italic;
+    font-weight: 400;
+    letter-spacing: -0.02em;
+  }
+}
+
+/* ── Mixed typography (serif italic × sans bold tension) ──── */
+@layer components {
+  .mix {
+    text-transform: uppercase;
+    letter-spacing: -0.03em;
+    line-height: 0.9;
+  }
+
+  .mix .s {
+    font-family: var(--font-serif);
+    font-style: italic;
+    font-weight: 400;
+    text-transform: none;
+  }
+
+  .mix .p {
+    font-family: var(--font-display), system-ui, sans-serif;
+    font-weight: 900;
+    color: var(--foreground);
+  }
+}
+
+/* ── Section spacing ───────────────────────────────────────── */
 @layer utilities {
   .section-spacing {
-    padding-top: clamp(3rem, 8vw, 5rem);
-    padding-bottom: clamp(3rem, 8vw, 5rem);
+    padding-top: clamp(3rem, 6vw, 5rem);
+    padding-bottom: clamp(3rem, 6vw, 5rem);
   }
 
   .section-spacing-tight {
-    padding-top: clamp(2rem, 5vw, 3rem);
-    padding-bottom: clamp(2rem, 5vw, 3rem);
+    padding-top: clamp(1.5rem, 4vw, 2.5rem);
+    padding-bottom: clamp(1.5rem, 4vw, 2.5rem);
+  }
+}
+
+/* ── Grid background texture (warm navy) ──────────────────── */
+@layer utilities {
+  .grid-bg {
+    background-size: 60px 60px;
+    background-image: linear-gradient(to right, rgba(240, 235, 227, 0.03) 1px, transparent 1px),
+      linear-gradient(to bottom, rgba(240, 235, 227, 0.03) 1px, transparent 1px);
+  }
+}
+
+/* ── Marquee animation ─────────────────────────────────────── */
+@keyframes marquee {
+  from {
+    transform: translateX(0);
+  }
+  to {
+    transform: translateX(-50%);
+  }
+}
+
+.animate-marquee {
+  animation: marquee 30s linear infinite;
+}
+
+/* ── Blink animation ───────────────────────────────────────── */
+@keyframes blink {
+  0%, 100% {
+    opacity: 1;
+  }
+  50% {
+    opacity: 0;
   }
 }
 
-/* Prose / typography overrides for blog content */
+.animate-blink {
+  animation: blink 1s step-end infinite;
+}
+
+/* ── Code block styling ────────────────────────────────────── */
+@layer components {
+  .code-block {
+    background: var(--muted);
+    border: 1px solid rgba(212, 168, 67, 0.25);
+    border-radius: 6px;
+    padding: 0.75rem 1rem;
+    font-family: ui-monospace, "SFMono-Regular", "SF Mono", Menlo, Consolas, monospace;
+    font-size: 0.875rem;
+    color: var(--brand);
+    overflow-x: auto;
+  }
+}
+
+/* ── Prose / typography overrides ──────────────────────────── */
 @layer components {
   .prose {
     --tw-prose-links: var(--brand);
+    --tw-prose-body: var(--foreground);
+    --tw-prose-headings: var(--foreground);
   }
 
   .prose a {
@@ -96,3 +188,73 @@
     text-decoration-color: var(--brand);
   }
 }
+
+/* ── Scroll-driven fade-in (progressive enhancement) ──────── */
+@keyframes fade-in-up {
+  from {
+    opacity: 0;
+    transform: translateY(20px);
+  }
+  to {
+    opacity: 1;
+    transform: translateY(0);
+  }
+}
+
+@supports (animation-timeline: view()) {
+  .fade-in-up {
+    animation: fade-in-up linear both;
+    animation-timeline: view();
+    animation-range: entry 0% entry 30%;
+  }
+}
+
+/* ── Brand accent glow (amber) ────────────────────────────── */
+.glow-brand {
+  box-shadow: 0 0 40px rgba(212, 168, 67, 0.1), 0 0 80px rgba(212, 168, 67, 0.05);
+}
+
+/* ── Noise texture (film grain) ───────────────────────────── */
+.noise-overlay::after {
+  content: "";
+  position: absolute;
+  inset: 0;
+  pointer-events: none;
+  z-index: 1;
+  opacity: 0.03;
+  background-image: url("data:image/svg+xml,%3Csvg viewBox='0 0 256 256' xmlns='http://www.w3.org/2000/svg'%3E%3Cfilter id='noise'%3E%3CfeTurbulence type='fractalNoise' baseFrequency='0.9' numOctaves='4' stitchTiles='stitch'/%3E%3C/filter%3E%3Crect width='100%25' height='100%25' filter='url(%23noise)'/%3E%3C/svg%3E");
+  background-repeat: repeat;
+  background-size: 256px 256px;
+}
+
+/* ── Glass-morphism card (warm tint) ─────────────────────── */
+.glass-card {
+  background: rgba(240, 235, 227, 0.03);
+  backdrop-filter: blur(12px);
+  -webkit-backdrop-filter: blur(12px);
+  border: 1px solid rgba(240, 235, 227, 0.06);
+  border-radius: 12px;
+}
+
+/* ── Stagger delays for child elements ────────────────────── */
+.stagger-1 { animation-delay: 0.1s; }
+.stagger-2 { animation-delay: 0.2s; }
+.stagger-3 { animation-delay: 0.3s; }
+.stagger-4 { animation-delay: 0.4s; }
+.stagger-5 { animation-delay: 0.5s; }
+
+/* ── Scan lines (CRT/monitor effect) ──────────────────────── */
+.scan-lines::before {
+  content: "";
+  position: absolute;
+  inset: 0;
+  pointer-events: none;
+  z-index: 2;
+  background: repeating-linear-gradient(
+    0deg,
+    transparent,
+    transparent 2px,
+    rgba(0, 0, 0, 0.03) 2px,
+    rgba(0, 0, 0, 0.03) 4px
+  );
+}
diff --git a/apps/web/app/layout.tsx b/apps/web/app/layout.tsx
index ba2eeb2..3b81f3e 100644
--- a/apps/web/app/layout.tsx
+++ b/apps/web/app/layout.tsx
@@ -71,7 +71,7 @@ export default function RootLayout({
 {/* Theme: set before first paint to avoid flash (must match ThemeContext logic) */}