Skip to content

OpenKnots/agent-browser-automation

Repository files navigation

Agent Browser Automation

Next.js + Kernel for running AI-powered browser automations with natural language.

./assets/home.png

Overview

This shows how to:

  • Create serverless browsers with live view using the Kernel SDK
  • Describe browser tasks in natural language
  • Use an AI agent to execute browser automation code via AI SDK tools in Next.js API routes
  • Display live browser view and automation results in a modern Next.js UI

Tech Stack

  • Framework: Next.js 15 with App Router
  • Styling: Tailwind CSS v4
  • UI Components: shadcn/ui
  • AI: Vercel AI SDK with OpenAI GPT-5.1
  • Browser Automation: Kernel SDK + Kernel AI SDK (@onkernel/ai-sdk)
  • Package Manager: Bun
  • Deployment: Vercel

Getting Started

Prerequisites

  • Node.js 18+
  • Bun (package manager)
  • A Kernel account and API key
  • An OpenAI API key
  • Vercel account (optional, for deployment)

Installation

  1. Clone the repository:

    git clone <your-repo-url>
    cd nextjs-kernel-template
  2. Install dependencies:

    bun install
  3. Set up Kernel:

    Get your Kernel API key from one of these sources:

  4. Configure environment variables:

    Create a .env file:

    touch .env.local

    Add your API keys:

    KERNEL_API_KEY=your_kernel_api_key_here
    OPENAI_API_KEY=your_openai_api_key_here
    
  5. Run the development server:

    bun dev
  6. Open http://localhost:3000 in your browser

How It Works

  1. Create Browser: Click "Create Browser" to provision a serverless Kernel browser with live view capabilities
  2. Describe Your Task: Enter what you want the browser to do in natural language (e.g., "Go to Hacker News and get the top article title")
  3. Watch AI Execute: The AI agent interprets your task and uses Kernel's AI SDK-compatible browser automation tool to execute it in real-time
  4. View Results: See the agent's response, step count, and click "View Steps" to inspect the generated code and execution details

Code Structure

app/
├── api/
│   ├── agent/
│   │   └── route.ts          # AI agent endpoint with browser automation tool
│   ├── create-browser/
│   │   └── route.ts          # Creates a serverless Kernel browser
│   └── delete-browser/
│       └── route.ts          # Closes browser session
├── page.tsx                  # Main UI with live view and controls
├── layout.tsx                # Root layout
└── globals.css               # Global styles

components/
├── Header.tsx                # App header with branding
├── StepsOverlay.tsx          # Modal showing agent execution steps
└── ui/                       # shadcn/ui components
    ├── button.tsx
    ├── card.tsx
    ├── textarea.tsx
    └── ...

lib/
└── utils.ts                  # Utility functions

Key Code Example

Step 1: Create Browser (app/api/create-browser/route.ts)

import { Kernel } from "@onkernel/sdk";

// Initialize Kernel client
const kernel = new Kernel({ apiKey: process.env.KERNEL_API_KEY });

// Create a serverless browser with live view
const browser = await kernel.browsers.create({
  stealth: true,
  headless: false,
});

// Return browser details to client
return {
  sessionId: browser.session_id,
  liveViewUrl: browser.browser_live_view_url,
  cdpWsUrl: browser.cdp_ws_url,
};

Step 2: Run AI Agent (app/api/agent/route.ts)

import { openai } from "@ai-sdk/openai";
import { playwrightExecuteTool } from "@onkernel/ai-sdk";
import { Kernel } from "@onkernel/sdk";
import { Experimental_Agent as Agent, stepCountIs } from "ai";

// Initialize Kernel instance
const kernel = new Kernel({ apiKey: process.env.KERNEL_API_KEY });

// Initialize the AI agent with GPT-5.1 and Kernel's AI SDK-compatible browser automation tool
const agent = new Agent({
  model: openai("gpt-5.1"),
  tools: {
    playwright_execute: playwrightExecuteTool({
      client: kernel,
      sessionId: sessionId,
    }),
  },
  stopWhen: stepCountIs(20),
  system: `You are a browser automation expert with access to a Playwright execution tool...`,
});

// Execute the agent with the user's task
const { text, steps } = await agent.generate({
  prompt: task, // e.g., "Go to news.ycombinator.com and get the first article title"
});

Deployment

Deploy to Vercel

  1. Push to GitHub

  2. Connect to Vercel:

    • Go to vercel.com
    • Import your GitHub repository
    • Add your environment variables (KERNEL_API_KEY and OPENAI_API_KEY)
    • Deploy
  3. Using Vercel Marketplace Integration:

    • Install Kernel from Vercel Marketplace
    • The integration will automatically add the Kernel API key to your project
    • Add your OPENAI_API_KEY manually
    • Deploy your project

Environment Variables

Make sure to add these environment variables in your Vercel project settings:

  • KERNEL_API_KEY - Your Kernel API key
  • OPENAI_API_KEY - Your OpenAI API key

Learn More


Built with Kernel, Vercel AI SDK, and Vercel

About

Control your browser with natural language.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published