webfuse-com/extension-vapi-agent

Vapi Voice Agent

Webfuse Extension Vapi

A voice-powered AI assistant that can see and interact with any web page through Webfuse. Users click a floating orb to start a conversation, and the assistant can click buttons, fill forms, scroll, navigate, and more — all by voice.

Vapi Web SDK + Webfuse Automation API


How It Works

User speaks ──> Vapi (voice + LLM) ──> Tool call ──> Webfuse Automation API ──> Page action
  1. The extension renders a floating orb widget in a Webfuse Session
  2. User clicks the orb to start a voice call with a Vapi assistant
  3. The assistant has access to browser automation tools (click, type, scroll, etc.)
  4. Tools are executed via the Webfuse Automation API, which interacts with the live page
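
The flow above can be sketched as a small dispatcher. This is a minimal illustration, not the extension's actual code: `ToolCall`, `PageAutomation`, `runToolCall`, and the `targetId` argument are hypothetical names, and the fake page stands in for the Webfuse Automation API.

```typescript
// Hypothetical shapes for illustration; the real Vapi tool-call payload and
// the Webfuse Automation API surface may look different.
interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}

// Stand-in for the page-side automation layer (assumed interface).
interface PageAutomation {
  click(targetId: string): string;
  typeText(targetId: string, text: string): string;
}

// Route one tool call from the assistant to a page action and return a
// short result string that can be reported back to the assistant.
function runToolCall(call: ToolCall, page: PageAutomation): string {
  switch (call.name) {
    case "click_element":
      return page.click(String(call.args["targetId"]));
    case "type_text":
      return page.typeText(String(call.args["targetId"]), String(call.args["text"]));
    default:
      return `Unknown tool: ${call.name}`;
  }
}

// A fake page that records actions instead of touching a real DOM.
const actionLog: string[] = [];
const fakePage: PageAutomation = {
  click: (id) => {
    actionLog.push(`click ${id}`);
    return `Clicked ${id}`;
  },
  typeText: (id, text) => {
    actionLog.push(`type ${id} "${text}"`);
    return `Typed into ${id}`;
  },
};

const clickResult = runToolCall(
  { name: "click_element", args: { targetId: "wf-12" } },
  fakePage,
);
```

Returning a plain string from every branch, including the unknown-tool case, means the assistant always receives something it can speak or act on.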

Available Tools

Tool               Description
take_dom_snapshot  Read the page structure (HTML with Webfuse IDs for targeting)
click_element      Click a button, link, or any element
type_text          Type into input fields
press_key          Press keyboard keys (Enter, Escape, Tab, etc.)
navigate_to        Navigate to a URL
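
As a rough sketch, the tools listed above could be described to the assistant as function-style definitions with JSON Schema parameters. The shape below is an assumption modeled on common function-calling formats, and the parameter names (`targetId`, `text`, `key`, `url`) are illustrative; the actual definitions live in src/tools.ts.

```typescript
// Assumed function-calling shape: a name, a description, and JSON Schema
// parameters. Parameter names such as `targetId` are illustrative only.
interface ParamSpec {
  type: "string";
  description: string;
}

interface ToolDefinition {
  type: "function";
  function: {
    name: string;
    description: string;
    parameters: {
      type: "object";
      properties: Record<string, ParamSpec>;
      required: string[];
    };
  };
}

// Build one definition; every listed parameter is treated as required.
function defineTool(
  name: string,
  description: string,
  properties: Record<string, ParamSpec> = {},
): ToolDefinition {
  return {
    type: "function",
    function: {
      name,
      description,
      parameters: { type: "object", properties, required: Object.keys(properties) },
    },
  };
}

// Shorthand for a string parameter.
const str = (description: string): ParamSpec => ({ type: "string", description });

const exampleTools: ToolDefinition[] = [
  defineTool("take_dom_snapshot", "Read the page structure"),
  defineTool("click_element", "Click a button, link, or any element", {
    targetId: str("Webfuse ID of the element to click"),
  }),
  defineTool("type_text", "Type into input fields", {
    targetId: str("Webfuse ID of the input"),
    text: str("Text to type"),
  }),
  defineTool("press_key", "Press keyboard keys", { key: str("Key name, e.g. Enter") }),
  defineTool("navigate_to", "Navigate to a URL", { url: str("Destination URL") }),
];
```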

Quick Start

Prerequisites

1. Clone & Install

git clone https://github.com/webfuse-com/extension-vapi-voice-agent.git
cd extension-vapi-voice-agent
pnpm install

2. Build

pnpm build

This bundles the extension into the dist/ folder using Vite, producing:

dist/
  background.js    # Minimal background script
  content.js       # Automation API relay
  popup.html       # Orb widget UI
  popup.js         # Vapi SDK + tools (bundled)
  manifest.json    # Extension manifest

3. Upload to Webfuse

  1. Go to Webfuse Studio and create a Space
  2. Open Settings (gear icon) > Extensions
  3. Click Upload Extension and select the dist/ folder
  4. The extension is now installed in your Space

4. Configure API Keys

You can set your Vapi credentials in two ways:

Option A: In the manifest (before building)

Edit manifest.json and set the value fields:

{
  "env": [
    {
      "key": "VAPI_PUBLIC_KEY",
      "value": "your-vapi-public-key",
      "description": "Your Vapi public API key from the Vapi Dashboard"
    },
    {
      "key": "VAPI_ASSISTANT_ID",
      "value": "your-assistant-id",
      "description": "The Vapi assistant ID to use for voice conversations"
    }
  ]
}

Then rebuild with pnpm build.
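
To illustrate how entries of this shape might be consumed, here is a minimal sketch that turns the `env` array into a lookup and fails early on a missing value. `EnvEntry` and `readEnv` are hypothetical helpers written for this example; how the extension actually reads its environment is not shown in this README.

```typescript
// Mirrors the `env` entries shown in manifest.json above.
interface EnvEntry {
  key: string;
  value: string;
  description?: string;
}

// Turn the env array into a key/value lookup and fail early when a
// required key is absent or empty. `readEnv` is a hypothetical helper,
// not part of the extension.
function readEnv(entries: EnvEntry[], required: string[]): Map<string, string> {
  const values = new Map<string, string>();
  for (const entry of entries) {
    values.set(entry.key, entry.value.trim());
  }
  const missing = required.filter((key) => !values.get(key));
  if (missing.length > 0) {
    throw new Error(`Missing env values: ${missing.join(", ")}`);
  }
  return values;
}

const sampleEnv: EnvEntry[] = [
  { key: "VAPI_PUBLIC_KEY", value: "your-vapi-public-key" },
  { key: "VAPI_ASSISTANT_ID", value: "" },
];

// Only the public key is required here; the assistant ID is left optional.
const envValues = readEnv(sampleEnv, ["VAPI_PUBLIC_KEY"]);
```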

Option B: In Webfuse Studio (recommended)

After uploading the extension:

  1. Open Settings (gear icon) > Extensions
  2. Click on Vapi Voice Widget
  3. Click Configure next to Environment Variables
  4. Set VAPI_PUBLIC_KEY and VAPI_ASSISTANT_ID

This approach keeps your keys out of the code and lets you change them without rebuilding.


Setting Up Vapi

Create an Assistant

  1. Sign in to the Vapi Dashboard
  2. Create a new Assistant
  3. Copy the Assistant ID — this is your VAPI_ASSISTANT_ID
  4. Copy your Public Key from the dashboard — this is your VAPI_PUBLIC_KEY

By default, the extension starts the call with an inline assistant configuration that uses client-side tools. If you set VAPI_ASSISTANT_ID, it uses your pre-configured assistant instead. The inline configuration includes a system prompt and tools for web automation; see src/tools.ts to customize them.
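
The choice between the two modes can be sketched as follows. This assumes that an empty or whitespace-only VAPI_ASSISTANT_ID counts as "not set"; `InlineAssistant`, `AssistantChoice`, and `chooseAssistant` are hypothetical names, and the inline config is trimmed down to two fields for brevity.

```typescript
// Hypothetical, trimmed-down inline config. A real Vapi assistant config
// has more fields (model, transcriber, tools, and so on).
interface InlineAssistant {
  systemPrompt: string;
  voice: { provider: string; voiceId: string };
}

type AssistantChoice =
  | { kind: "preconfigured"; assistantId: string }
  | { kind: "inline"; config: InlineAssistant };

// Prefer a configured assistant ID; otherwise fall back to the inline config.
function chooseAssistant(
  assistantId: string | undefined,
  inline: InlineAssistant,
): AssistantChoice {
  const id = (assistantId ?? "").trim();
  return id.length > 0
    ? { kind: "preconfigured", assistantId: id }
    : { kind: "inline", config: inline };
}

const inlineDefaults: InlineAssistant = {
  systemPrompt: "You help the user operate the current web page by voice.",
  voice: { provider: "vapi", voiceId: "Elliot" },
};

const withId = chooseAssistant("asst-123", inlineDefaults);
const withoutId = chooseAssistant("   ", inlineDefaults);
```

The tagged union keeps the two cases explicit, so the code that starts the call has to handle both rather than guessing from the value's type.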


Project Structure

vapi-voice-widget/
  src/
    popup.ts        # Orb UI + Vapi SDK + tool call handling
    tools.ts        # Webfuse Automation tool definitions
    content.ts      # Automation API relay (runs in page context)
    background.ts   # Auto-opens the popup on session start
    types/
      webfuse.d.ts  # TypeScript types for Webfuse extension APIs
  popup.html        # Orb widget HTML + CSS (animations, glow states)
  manifest.json     # Extension manifest (permissions, env vars)
  build.ts          # Vite build script (bundles each entry as IIFE)
  package.json
  tsconfig.json
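
For orientation, a per-entry IIFE build could be described with one Vite config object per script, roughly as below. The option names are Vite's library-mode options, but the entry list, the global names, and the overall structure are assumptions based on the tree above, not a copy of build.ts. Copying popup.html and manifest.json into dist/ is not covered.

```typescript
// Plain-object sketch of per-entry Vite library builds. The option names
// (configFile, build.outDir, build.emptyOutDir, build.lib.*) are Vite's;
// the entry list and global names are assumptions based on the tree above.
interface EntryBuildConfig {
  configFile: false;
  build: {
    outDir: string;
    emptyOutDir: boolean;
    lib: {
      entry: string;
      name: string;
      formats: "iife"[];
      fileName: () => string;
    };
  };
}

const ENTRY_NAMES = ["popup", "content", "background"];

function iifeConfig(entry: string): EntryBuildConfig {
  return {
    configFile: false, // do not load a vite.config file
    build: {
      outDir: "dist",
      emptyOutDir: false, // keep bundles written by earlier entries
      lib: {
        entry: `src/${entry}.ts`,
        name: `ext_${entry}`, // hypothetical global name; IIFE output needs one
        formats: ["iife"],
        fileName: () => `${entry}.js`,
      },
    },
  };
}

// A build script could pass each object to Vite's build() in turn.
const entryConfigs = ENTRY_NAMES.map(iifeConfig);
```

One build per entry is a likely reason for a custom script: the IIFE format does not support code splitting, so several entry points cannot share a single IIFE build.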

Architecture

The extension uses three contexts, each with a specific role:

Context     File           Role
Popup       popup.ts       Runs the Vapi SDK (voice + audio), handles tool calls, renders the orb UI. Popup has WebRTC access and respects host_permissions for CSP.
Content     content.ts     Thin relay that executes Webfuse Automation API calls on the page. Reloads on navigation.
Background  background.ts  Auto-opens the popup when the session starts.

Why the popup? In Webfuse, host_permissions do not apply to content scripts (to preserve the proxied page's CSP). The popup is the only context that has both unrestricted network access (for Daily.co WebRTC) and full browser APIs (microphone, audio). See REPORT.md for the full investigation.
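
The split between the popup and the content relay can be pictured as a small request/response protocol. The sketch below is hypothetical: the message shapes, the method names, and the `PageApi` stand-in are assumptions, and the actual messaging mechanism between Webfuse extension contexts is not shown.

```typescript
// Hypothetical relay protocol; the message shapes and method names are
// assumptions, not the extension's actual wire format.
interface RelayRequest {
  id: number;
  method: string;
  args: unknown[];
}

interface RelayResponse {
  id: number;
  ok: boolean;
  value?: unknown;
  error?: string;
}

// Stand-in for the automation methods available in the page context.
type PageApi = Record<string, (...args: unknown[]) => unknown>;

// Popup side: number each request so responses can be matched to it.
let nextRequestId = 1;
function makeRequest(method: string, ...args: unknown[]): RelayRequest {
  return { id: nextRequestId++, method, args };
}

// Content side: look up the requested method, run it against the page API,
// and always answer with the request's id, whether it succeeded or failed.
function handleRelayRequest(req: RelayRequest, api: PageApi): RelayResponse {
  const fn = Object.prototype.hasOwnProperty.call(api, req.method)
    ? api[req.method]
    : undefined;
  if (fn === undefined) {
    return { id: req.id, ok: false, error: `Unknown method: ${req.method}` };
  }
  try {
    return { id: req.id, ok: true, value: fn(...req.args) };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { id: req.id, ok: false, error: message };
  }
}

// Fake page API used only to exercise the relay.
const fakeApi: PageApi = {
  click: (targetId) => `clicked ${String(targetId)}`,
  fail: () => {
    throw new Error("element not found");
  },
};

const okResponse = handleRelayRequest(makeRequest("click", "wf-7"), fakeApi);
const failedResponse = handleRelayRequest(makeRequest("fail"), fakeApi);
```

Keeping the content side this thin fits the note that it reloads on navigation: it holds no state of its own, so the popup can simply resend a request after a page change.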


Customization

Changing the Voice

Edit src/popup.ts and change the voice provider/ID in vapi.start():

voice: { provider: "vapi", voiceId: "Elliot" },

See Vapi Voice Options for available voices.

Adding or Modifying Tools

Edit src/tools.ts to add new automation tools or modify existing ones. Each tool needs:

  1. A handler in handleToolCall() that calls the Webfuse Automation API
  2. A Vapi tool definition in the vapiTools array

See the Webfuse Automation API Reference for all available methods.
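
Because each tool needs both pieces, a small consistency check can catch a definition without a handler, or a handler without a definition. This is a sketch under assumptions: `findToolMismatches` is a hypothetical helper, `scroll_page` is a made-up tool name, and the handler table is a simplification of whatever handleToolCall() does internally.

```typescript
type ToolHandler = (args: Record<string, unknown>) => string;

// Report tools that have a definition but no handler, and handlers that
// have no matching definition.
function findToolMismatches(
  definedNames: string[],
  handlers: Record<string, ToolHandler>,
): { missingHandlers: string[]; missingDefinitions: string[] } {
  const handlerNames = Object.keys(handlers);
  const defined = new Set(definedNames);
  const handled = new Set(handlerNames);
  return {
    missingHandlers: definedNames.filter((n) => !handled.has(n)),
    missingDefinitions: handlerNames.filter((n) => !defined.has(n)),
  };
}

// `scroll_page` is a hypothetical new tool used only for this example.
const definedToolNames = ["click_element", "scroll_page"];
const handlerTable: Record<string, ToolHandler> = {
  click_element: (args) => `Clicked ${String(args["targetId"])}`,
};

// The definition exists but the handler does not yet.
const beforeFix = findToolMismatches(definedToolNames, handlerTable);

// Adding the handler brings the two lists back in sync.
handlerTable["scroll_page"] = (args) => `Scrolled ${String(args["direction"])}`;
const afterFix = findToolMismatches(definedToolNames, handlerTable);
```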

Changing the System Prompt

Edit the systemPrompt export in src/tools.ts to change how the assistant behaves.

