diff --git a/.husky/pre-commit b/.husky/pre-commit index e2fc3f1..80a5efd 100755 --- a/.husky/pre-commit +++ b/.husky/pre-commit @@ -4,7 +4,7 @@ # # Add homebrew to PATH for non-interactive shells -export PATH="/opt/homebrew/bin:$PATH" +export PATH="/Users/s/Library/pnpm:/opt/homebrew/bin:$PATH" echo "πŸ” Running Gitleaks to check for secrets..." diff --git a/BETA-RELEASE.md b/BETA-RELEASE.md new file mode 100644 index 0000000..956169a --- /dev/null +++ b/BETA-RELEASE.md @@ -0,0 +1,82 @@ +# Beta Release Reel + +Five headline features shipping in the Copilot SDK beta. + +--- + +## 1. Skills System + +On-demand instruction playbooks the AI loads at runtime. Keep prompts lean β€” behavior is injected only when relevant. + +| | Path | +| ------- | ------------------------------------------ | +| Docs | `apps/docs/content/docs/skills/index.mdx` | +| | `apps/docs/content/docs/skills/client.mdx` | +| | `apps/docs/content/docs/skills/server.mdx` | +| Example | `examples/skills-demo/` | + +**Highlights:** Three strategies (eager, auto, manual) β€” inline + file + URL sources β€” `defineSkill()` helper β€” collision detection β€” `load_skill` tool auto-registered. + +--- + +## 2. Generative UI + +AI renders structured UI components β€” cards, tables, stat tiles, HTML β€” inline inside the chat. Not just text. + +| | Path | +| ------- | ------------------------------------------ | +| Docs | `apps/docs/content/docs/generative-ui.mdx` | +| Example | `examples/generative-ui-demo/` | + +**Highlights:** Built-in renderers (Card, Table, Stat, Html) β€” `useGenerativeUI()` hook β€” `toolRenderers` map for custom components β€” iframe sandbox for AI-generated HTML. + +--- + +## 3. Fallback Chain & Routing + +Chain multiple LLM providers with automatic failover. One runtime, multiple models, zero downtime. 
+ +| | Path | +| ------- | ----------------------------------------------- | +| Docs | `apps/docs/content/docs/providers/fallback.mdx` | +| Example | `examples/fallback-demo/` | + +**Highlights:** Priority + round-robin routing β€” per-model retry with exponential backoff β€” `onFallback` / `onRetry` callbacks β€” pluggable `RoutingStore` for serverless (Redis, KV). + +--- + +## 4. File Attachments + +Drag-and-drop file and media attachments in chat. Upload, preview, and forward to the server runtime. + +| | Path | +| ------- | ---------------------------------------------------------------------------------- | +| Docs | `apps/docs/content/docs/chat/attachments.mdx` | +| Example | Used in `examples/skills-demo/` and `examples/ollama-demo/` (no dedicated example) | + +**Highlights:** `AttachmentStrip` thumbnail preview β€” drop-zone overlay β€” `useAttachments()` for headless access β€” automatic server forwarding β€” error handling built-in. + +--- + +## 5. Knowledge Base + +Connect external knowledge sources so the AI can search and cite real data in responses. + +| | Path | +| ------- | ------------------------------- | +| Docs | Not yet documented | +| Example | `examples/knowledge-base-demo/` | + +**Highlights:** RAG-powered retrieval β€” plug into any vector store β€” cite sources in responses β€” works with the agentic loop for multi-step research. 
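The Knowledge Base feature is described as pluggable into any vector store. As a rough, store-agnostic illustration of the retrieval step only — not the SDK's actual API; all names here are hypothetical — a top-k cosine-similarity search over pre-embedded documents looks like:

```typescript
// Illustrative top-k retrieval sketch (hypothetical types, not the SDK API).
type Doc = { id: string; embedding: number[]; text: string };

// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0,
    na = 0,
    nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Return the k documents most similar to the query embedding.
function topK(query: number[], docs: Doc[], k: number): Doc[] {
  return [...docs]
    .sort((x, y) => cosine(query, y.embedding) - cosine(query, x.embedding))
    .slice(0, k);
}
```

A real vector store replaces the linear scan with an index lookup; the ranking contract is the same.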
+ +--- + +## Status + +| Feature | Docs | Example | README | Ready | +| ---------------- | :--: | :-----: | :----: | :---------------------: | +| Skills System | Yes | Yes | No | Refine | +| Generative UI | Yes | Yes | No | Refine | +| Fallback Chain | Yes | Yes | Yes | Refine | +| File Attachments | Yes | Partial | No | Needs dedicated example | +| Knowledge Base | No | Minimal | No | Needs docs + example | diff --git a/apps/docs/alpha-docs/BRANCHING.md b/apps/docs/alpha-docs/BRANCHING.md deleted file mode 100644 index efd266a..0000000 --- a/apps/docs/alpha-docs/BRANCHING.md +++ /dev/null @@ -1,450 +0,0 @@ -# Conversation Branching - -> Branch `feat/branching` β€” implements the same UX pattern as ChatGPT, Claude.ai, and Gemini: -> editing a user message creates a parallel conversation path, preserving the original, -> with `← N/M β†’` navigation between variants. - ---- - -## Table of Contents - -1. [Live Demo](#live-demo) -2. [What Was Built](#what-was-built) -3. [Breaking Changes](#breaking-changes) -4. [New APIs](#new-apis) -5. [Database / Persistence Changes](#database--persistence-changes) -6. [User Adoption](#user-adoption) -7. [Framework-Agnostic Usage](#framework-agnostic-usage) -8. [How It Works Internally](#how-it-works-internally) - ---- - -## Live Demo - -A full working demo is in the **experimental** examples project. 
- -**Location:** `examples/experimental/` -**Route:** `/branching` - -```bash -cd examples/experimental -pnpm dev -# β†’ http://localhost:3000/branching -``` - -### What the demo shows - -Two-panel layout inside a single `CopilotProvider`: - -``` -β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” -β”‚ ← Back Conversation Branching Demo [feat/branching] β”‚ -β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€ -β”‚ Branch Tree β”‚ CopilotChat β”‚ -β”‚ β”‚ β”‚ -β”‚ Branch Tree β”‚ [user: Hello] ← 1/2 β†’ β”‚ -β”‚ 4 total Β· 3 visible β”‚ [assistant: Hi there] β”‚ -β”‚ branched ✦ β”‚ β”‚ -β”‚ β”‚ [user: Tell me more] ✏ β”‚ -β”‚ ● U Hello β”‚ [assistant: Sure…] β”‚ -β”‚ β”œβ”€β”€ ● A Hi there Γ—2 β”‚ β”‚ -β”‚ └── Β· A Hey β”‚ ────────────────────────────── β”‚ -β”‚ └── ● U Tell me… β”‚ [input field] β”‚ -β”‚ └── ● A Sure… β”‚ β”‚ -β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ -``` - -- **Left panel** (`BranchTreePanel`) β€” reads `getAllMessages()` live. Green dot = on active path, grey = inactive branch. `Γ—N` badge = sibling count. Click any node to call `switchBranch()`. -- **Right panel** β€” standard `CopilotChat`. Edit ✏ button appears on hover over user messages. `← N/M β†’` navigator appears below user messages when variants exist. 
- -### Demo source files - -| File | Purpose | -| ---------------------------------------------------------------- | ------------------------------------------ | -| `examples/experimental/app/branching/page.tsx` | Page: `CopilotProvider` + two-panel layout | -| `examples/experimental/components/branching/BranchTreePanel.tsx` | Live tree visualization component | -| `examples/experimental/app/api/chat/branching/route.ts` | Anthropic API route (haiku, short replies) | - -### Key code pattern in the demo - -```tsx -// page.tsx β€” both panels share one CopilotProvider - - {/* reads getAllMessages(), calls switchBranch() */} - {/* edit button + BranchNavigator built-in */} - - -// BranchTreePanel.tsx β€” the core hook usage -const { messages, getAllMessages, getBranchInfo, switchBranch, hasBranches } = useCopilot(); -const allMessages = getAllMessages(); // all branches -const visibleIds = new Set(messages.map(m => m.id)); // active path -``` - ---- - -## What Was Built - -### Core Data Layer - -| File | What Changed | -| -------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `src/chat/branching/MessageTree.ts` | **New.** Pure TypeScript tree utility. Bidirectional flat-map: `parentId` + `childrenIds[]` + `activeChildMap`. No React dependency. | -| `src/chat/branching/index.ts` | **New.** Barrel export. | -| `src/chat/types/message.ts` | Added `parentId?: string \| null` and `childrenIds?: string[]` to `UIMessage`. | -| `src/core/types/message.ts` | Added `parent_id?: string \| null` and `children_ids?: string[]` to `Message` (persistence layer). | -| `src/chat/interfaces/ChatState.ts` | Added 5 optional branching methods: `setCurrentLeaf`, `getAllMessages`, `getBranchInfo`, `switchBranch`, `hasBranches`. 
| -| `src/react/internal/ReactChatState.ts` | Replaced `_messages: T[]` array with `MessageTree`. `messages` getter = visible path only. | -| `src/chat/classes/AbstractChat.ts` | `regenerate()` rewritten to be branch-aware (creates sibling instead of destroying). `sendMessage()` extended with `options.editMessageId`. `onMessagesChange` callback now passes all branches via `_allMessages()`. | -| `src/chat/ChatWithTools.ts` | `sendMessage()` passes through `options.editMessageId`. | - -### React Layer - -| File | What Changed | -| ------------------------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------- | -| `src/react/internal/ReactChat.ts` | Added `switchBranch`, `getBranchInfo`, `getAllMessages`, `hasBranches` pass-throughs. | -| `src/react/internal/ReactChatWithTools.ts` | Same pass-throughs. | -| `src/react/internal/useChat.ts` | Added `switchBranch`, `getBranchInfo`, `editMessage`, `hasBranches` to `UseChatReturn`. | -| `src/react/context/CopilotContext.tsx` | Added branching methods to `ChatActions`. | -| `src/react/provider/CopilotProvider.tsx` | Wired branching methods into context. `onMessagesChange` effect uses `getAllMessages()`. Added `getAllMessages` to `CopilotContextValue`. | -| `src/react/index.ts` | Re-exports `MessageTree`, `BranchInfo`. | -| `src/chat/index.ts` | Re-exports `MessageTree`, `BranchInfo`. | - -### UI Layer - -| File | What Changed | -| ----------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------ | -| `src/ui/components/ui/branch-navigator.tsx` | **New.** `← N/M β†’` purely presentational component. | -| `src/ui/components/composed/chat/types.ts` | Added `getBranchInfo`, `onSwitchBranch`, `onEditMessage` to `ChatProps`. 
| -| `src/ui/components/composed/chat/default-message.tsx` | User messages: pencil edit button on hover, inline textarea edit, `BranchNavigator` shown when siblings exist. | -| `src/ui/components/composed/chat/chat.tsx` | Passes branch props through to each message. | -| `src/ui/components/composed/connected-chat.tsx` | Pulls `switchBranch`, `getBranchInfo`, `editMessage` from `useCopilot()` and passes to ``. | -| `src/ui/hooks/useInternalThreadManager.ts` | Save path uses `getAllMessages()`. Load paths restore `parentId`/`childrenIds`. `convertToCore` includes `parent_id`/`children_ids`. | -| `src/ui/index.ts` | Exports `BranchNavigator`, `BranchNavigatorProps`. | - ---- - -## Breaking Changes - -**None.** - -All new fields and methods are optional. Every existing usage continues to work without modification: - -| Scenario | Behavior | -| --------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------- | -| Messages with no `parentId` | `getVisibleMessages()` falls back to insertion order (legacy linear) | -| `regenerate()` called without arguments | Finds last assistant on visible path β€” identical to before | -| `sendMessage()` with no third argument | Identical to before | -| `useChat()` / `useCopilot()` consumers | All branching fields available but optional β€” no existing destructuring breaks | -| `onMessagesChange` callback consumers | Now receives all branches instead of visible path only β€” **payload size may increase** if branches exist, but shape is identical (`Message[]`) | -| DB rows with no `parent_id` column | Auto-migrated via `fromFlatArray()` on load β€” no manual migration script needed for existing data | - -> **Note on `onMessagesChange` payload:** If a user has branched the conversation, the callback now receives all messages across all branches (not just the active path). The shape is the same `Message[]` type. 
If your persistence layer deduplicates by message ID, no change is needed. If it blindly appends, you may want to upsert by ID instead. - ---- - -## New APIs - -### `useCopilot()` / `CopilotProvider` - -```typescript -const { - switchBranch, // (messageId: string) => void - getBranchInfo, // (messageId: string) => BranchInfo | null - editMessage, // (messageId: string, newContent: string) => Promise - hasBranches, // boolean β€” true if any fork exists - getAllMessages, // () => UIMessage[] β€” all branches, not just visible path -} = useCopilot(); -``` - -### `useChat()` - -```typescript -const { - switchBranch, // (messageId: string) => void - getBranchInfo, // (messageId: string) => BranchInfo | null - editMessage, // (messageId: string, newContent: string) => Promise - hasBranches, // boolean -} = useChat({ ... }); -``` - -### `` props - -```typescript - BranchInfo | null} - onSwitchBranch={(messageId) => void} - onEditMessage={(messageId, newContent) => void} -/> -``` - -### `BranchInfo` type - -```typescript -interface BranchInfo { - siblingIndex: number; // 0-based β€” which variant this is - totalSiblings: number; // how many variants exist at this fork - siblingIds: string[]; // ordered oldest-first - hasPrevious: boolean; - hasNext: boolean; -} -``` - -### `BranchNavigator` component (UI primitives) - -```tsx -import { BranchNavigator } from "@yourgpt/copilot-sdk-ui"; - - switchBranch(info.siblingIds[info.siblingIndex - 1])} - onNext={() => switchBranch(info.siblingIds[info.siblingIndex + 1])} -/>; -``` - -### `MessageTree` (framework-agnostic) - -```typescript -import { MessageTree, type BranchInfo } from "@yourgpt/copilot-sdk"; - -const tree = new MessageTree(messages); -tree.getVisibleMessages(); // active path only -tree.getAllMessages(); // all branches -tree.getBranchInfo(messageId); // BranchInfo | null -tree.switchBranch(messageId); -tree.hasBranches; // boolean -``` - ---- - -## Database / Persistence Changes - -### New columns needed - -Two new 
optional columns on your messages table: - -```sql -ALTER TABLE messages - ADD COLUMN parent_id TEXT REFERENCES messages(id), - ADD COLUMN children_ids JSONB DEFAULT '[]'; -``` - -| Column | Type | Nullable | Description | -| -------------- | ----------------------- | -------- | ------------------------------------------------------------- | -| `parent_id` | `TEXT` / `VARCHAR` | YES | ID of parent message. `NULL` = root. Missing = legacy linear. | -| `children_ids` | `JSON` array of strings | YES | Ordered child IDs for O(1) sibling lookup. | - -> **These columns are optional.** Existing rows without them are auto-migrated to a linear tree on load via `fromFlatArray()`. No data loss. No required migration for existing rows. - -### What gets saved now - -When `onMessagesChange` fires (or the thread manager auto-saves), the payload contains **all messages across all branches**, not just the visible path. Each message carries: - -```json -{ - "id": "msg-abc", - "role": "assistant", - "content": "...", - "parent_id": "msg-xyz", - "children_ids": [] -} -``` - -### What gets loaded - -When a thread is loaded (auto-restore or `switchThread`), the SDK maps: - -``` -DB row.parent_id β†’ UIMessage.parentId -DB row.children_ids β†’ UIMessage.childrenIds -``` - -The `MessageTree` is rebuilt from these fields. The last child at each fork becomes the active path (matches what was active when saved). - -### localStorage (built-in persistence) - -No changes needed. The SDK's `localStorageAdapter` serializes the full `Thread` object including messages. The new fields are automatically included when present. - -### Server persistence (`serverAdapter`) - -Your API endpoints that receive `PUT /threads/:id` payloads will now see `parent_id` and `children_ids` on each message object. Store them as-is. If your schema doesn't have these columns yet, the fields are simply ignored β€” no error. 
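The legacy auto-migration can be sketched as follows — an illustrative stand-in for the behavior attributed to `fromFlatArray()` (chain rows in insertion order so a pre-branching array becomes a linear tree), not the SDK's actual implementation:

```typescript
// Sketch: turn a legacy flat message array into linear parent/children links.
// Rows that already carry links are left as-is; field names match the schema above.
interface Row {
  id: string;
  parent_id?: string | null;
  children_ids?: string[];
}

function linkLinear(rows: Row[]): Row[] {
  return rows.map((row, i) => ({
    ...row,
    // First row becomes the root (null parent); each later row points at its predecessor.
    parent_id: row.parent_id ?? (i === 0 ? null : rows[i - 1].id),
    // Each row's only child is its successor, if any.
    children_ids:
      row.children_ids ?? (i + 1 < rows.length ? [rows[i + 1].id] : []),
  }));
}
```

After this pass the data is a degenerate tree (one branch), so all tree-based operations work unchanged on legacy threads.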
- -### Upsert strategy (recommended) - -Since branched conversations can have multiple messages with the same `parent_id`, always **upsert by message ID** rather than replacing the array: - -```typescript -// βœ… Safe for branching -await db.messages.upsert({ id: msg.id, ...msg }); - -// ⚠️ Loses inactive branches -await db.threads.update({ messages: visibleMessages }); -``` - ---- - -## User Adoption - -### Zero-config (CopilotChat users) - -If you use ``, branching is **already active**. No code changes needed. - -- Edit button appears on hover over any user message -- `← 1/2 β†’` navigator appears below user messages when variants exist -- Regenerate creates a branch instead of overwriting - -### Manual wiring (`` users) - -Wire the three props from `useCopilot()`: - -```tsx -function MyChat() { - const { switchBranch, getBranchInfo, editMessage } = useCopilot(); - - return ( - - ); -} -``` - -### Custom message renderers - -If you render messages manually, use `getBranchInfo` + `BranchNavigator`: - -```tsx -function MyMessage({ message }) { - const { switchBranch, getBranchInfo } = useCopilot(); - const info = message.role === "user" ? getBranchInfo(message.id) : null; - - return ( -
-    <div>
-      <div>{message.content}</div>
-      {info && (
-        <BranchNavigator
-          info={info}
-          onPrevious={() =>
-            switchBranch(info.siblingIds[info.siblingIndex - 1])
-          }
-          onNext={() => switchBranch(info.siblingIds[info.siblingIndex + 1])}
-        />
-      )}
-    </div>
- ); -} -``` - -### Programmatic branching - -```typescript -// Edit a message (creates new branch from same parent) -await editMessage("msg-abc", "Updated question text"); - -// Navigate between variants -switchBranch("msg-xyz"); - -// Check if branches exist -if (hasBranches) { - const info = getBranchInfo("msg-abc"); - // info.totalSiblings, info.siblingIndex, etc. -} - -// Persist all branches (not just visible path) -const allMessages = getAllMessages(); -await saveToServer(allMessages); -``` - ---- - -## Framework-Agnostic Usage - -All branching primitives are exported from the core package (no React required): - -```typescript -import { MessageTree, type BranchInfo } from "@yourgpt/copilot-sdk"; - -// Build a tree from saved messages -const tree = new MessageTree(savedMessages); - -// Get what to send to the AI (active path only) -const apiMessages = tree.getVisibleMessages(); - -// Get everything to persist -const allMessages = tree.getAllMessages(); - -// Navigate -tree.switchBranch(messageId); -const info = tree.getBranchInfo(messageId); // BranchInfo | null - -// Migrate legacy flat arrays -const linked = MessageTree.fromFlatArray(legacyMessages); -``` - ---- - -## How It Works Internally - -### Data structure - -Each message carries two optional fields: - -``` -parentId: string | null | undefined - null = root message (first in conversation) - undefined = legacy linear message (pre-branching) - string = ID of parent message - -childrenIds: string[] - Ordered list of direct child IDs (oldest-first) -``` - -The `MessageTree` maintains three maps: - -| Map | Key | Value | Purpose | -| ---------------- | ------------------------ | --------------- | --------------------------------- | -| `nodeMap` | messageId | Message | O(1) message lookup | -| `childrenOf` | parentId (or `__root__`) | `string[]` | All children at a fork | -| `activeChildMap` | parentId | active child ID | Which branch is currently visible | - -### Regenerate flow - -``` -Before: user β†’ 
assistant-A - ↑ currentLeaf - -1. setCurrentLeaf(user.id) β†’ rewind to user -2. processRequest() β†’ AI generates assistant-B -3. addMessage(assistant-B) β†’ becomes active child of user - -After: user β†’ assistant-A (inactive, navigable via ←) - β†˜ assistant-B (active) -``` - -### Edit flow - -``` -Before: user-A β†’ assistant-A - -1. sendMessage("new text", { editMessageId: "user-A" }) -2. newParentId = user-A.parentId (= null, root) -3. setCurrentLeaf(null) β†’ rewind to before user-A -4. create user-B with parentId=null -5. processRequest() β†’ AI generates assistant-B - -After: user-A β†’ assistant-A (inactive) - user-B β†’ assistant-B (active) -``` - -### Visible path vs all messages - -``` -getAllMessages() β†’ every message across every branch (for persistence) -getVisibleMessages() β†’ root β†’ currentLeaf along activeChildMap (for UI + API) -``` - -The API always receives `getVisibleMessages()`. Inactive branches are never sent to the model. diff --git a/apps/docs/alpha-docs/CHAT-PRIMITIVES.md b/apps/docs/alpha-docs/CHAT-PRIMITIVES.md deleted file mode 100644 index d63ac02..0000000 --- a/apps/docs/alpha-docs/CHAT-PRIMITIVES.md +++ /dev/null @@ -1,219 +0,0 @@ -# Chat Primitives - -> `release/alpha` β€” ships two complementary APIs for headless chat customization: the `ChatPrimitives` namespace (low-level building blocks) and compound components on `CopilotChat.*` (MessageActions, MessageList, DefaultMessage, etc.). Both are non-breaking additive exports. - ---- - -## Table of Contents - -1. [What Was Built](#what-was-built) -2. [Breaking Changes](#breaking-changes) -3. [ChatPrimitives Namespace](#chatprimitives-namespace) -4. [CopilotChat Compound Components](#copilotchat-compound-components) -5. [Usage Examples](#usage-examples) -6. [How It Works Internally](#how-it-works-internally) -7. 
[Relation to `messageView`](#relation-to-messageview) - ---- - -## What Was Built - -Two exports that let you compose custom chat UIs at any level of abstraction while the SDK handles all state, streaming, and context internally. - -**`ChatPrimitives`** β€” a named export of individual low-level components. Useful when you import under an alias and want to pick specific pieces. - -**`CopilotChat.*` compound extensions** β€” the same primitives accessible directly on the `CopilotChat` component for inline composition without extra imports. - ---- - -## Breaking Changes - -**None.** Both are purely additive. Existing `` usage is untouched. - ---- - -## ChatPrimitives Namespace - -```tsx -import { ChatPrimitives as Chat } from "@yourgpt/copilot-sdk/ui"; -``` - -### All Primitives - -| Primitive | Description | -| --------------------- | --------------------------------------------------------- | -| `Chat.MessageList` | Render-prop message list β€” reads `messages` from context | -| `Chat.DefaultMessage` | Full SDK message bubble β€” use as fallback in custom lists | -| `Chat.Header` | Chat header bar | -| `Chat.Welcome` | Welcome screen shown when there are no messages | -| `Chat.Input` | Composer / input box | -| `Chat.ScrollAnchor` | Auto-scroll anchor, place at end of message list | -| `Chat.Message` | Low-level message row wrapper | -| `Chat.MessageAvatar` | Avatar with fallback initials | -| `Chat.MessageContent` | Content bubble β€” renders markdown, supports streaming | -| `Chat.MessageActions` | Action bar layout primitive (wraps action buttons) | -| `Chat.MessageAction` | Single action icon button with tooltip | -| `Chat.Loader` | Streaming / thinking indicator | - -### `Chat.MessageList` props - -```ts -interface MessageListProps { - children?: (message: ChatMessage, index: number) => React.ReactNode; - className?: string; -} -``` - -When `children` is provided, called once per message β€” return your custom component or fall back to `Chat.DefaultMessage`. 
When omitted, renders all messages with `DefaultMessage`. - ---- - -## CopilotChat Compound Components - -The `ChatPrimitives` are also mounted on the `CopilotChat` export: - -```tsx -import { CopilotChat } from "@yourgpt/copilot-sdk/ui"; - -CopilotChat.MessageActions; // compound action registrar (see MESSAGE-ACTIONS.md) -CopilotChat.CopyAction; // built-in copy button -CopilotChat.EditAction; // built-in inline edit button -CopilotChat.FeedbackAction; // built-in thumbs up/down -CopilotChat.Action; // custom action button -``` - -These are the action-registration compound components β€” see [MESSAGE-ACTIONS.md](./MESSAGE-ACTIONS.md) for full docs. - ---- - -## Usage Examples - -### Custom message type with fallback - -```tsx -import { ChatPrimitives as Chat } from "@yourgpt/copilot-sdk/ui"; - - - - {(message) => - message.metadata?.type === "plan" ? ( - - ) : ( - - ) - } - -; -``` - ---- - -### Fully custom layout β€” compose from scratch - -```tsx -import { ChatPrimitives as Chat } from "@yourgpt/copilot-sdk/ui"; - - -
- - - -
- - {(message) => ( - - - - - )} - - - -
- - -
-
; -``` - ---- - -### Mix primitives with message actions - -```tsx -import { ChatPrimitives as Chat } from "@yourgpt/copilot-sdk/ui"; - - - {/* Register floating action buttons */} - - - log(msg.id, type)} /> - - - {/* Custom message list */} - - {(message) => - message.metadata?.type === "approval" ? ( - - ) : ( - - ) - } - -; -``` - ---- - -### Per-message action buttons (using primitives directly) - -```tsx -import { ChatPrimitives as Chat } from "@yourgpt/copilot-sdk/ui"; - - - {(message) => ( - - -
- - - } - tooltip="Copy" - onClick={() => navigator.clipboard.writeText(message.content ?? "")} - /> - -
-
- )} -
; -``` - ---- - -## How It Works Internally - -**State access:** `Chat.MessageList` reads `messages` and `registeredTools` from `CopilotChatInternalContext` β€” the same context `chat.tsx` already provides. No extra wiring needed. - -**`messages` + `registeredTools` in context:** Added to `CopilotChatInternalContext` so primitives can access them without prop drilling. `connected-chat.tsx` was unchanged β€” values flow through the existing context setup in `chat.tsx`. - -**Files created/modified:** - -- `message-list.tsx` _(new)_ β€” `Chat.MessageList` component -- `chat.tsx` β€” added `messages` + `registeredTools` to `CopilotChatInternalContext`; extended `Chat` compound object with `MessageActions`, `CopyAction`, `EditAction`, `FeedbackAction`, `Action` -- `ui/index.ts` β€” added `ChatPrimitives` export -- `chat/index.ts` β€” added `MessageList`, all action compound types - ---- - -## Relation to `messageView` - -`messageView` prop (see [CUSTOM-MESSAGE-VIEW.md](./CUSTOM-MESSAGE-VIEW.md)) and `Chat.MessageList` solve the same use case β€” custom message rendering β€” at different abstraction levels: - -| | `messageView` | `Chat.MessageList` | -| ----------- | -------------------------------------------------------- | --------------------------------------------- | -| Style | Prop on `` | Child component inside `` | -| Access | `messages[]` + pre-rendered `messageElements[]` | `messages[]` via render-prop | -| When to use | Quick overrides, inject extra UI around existing renders | Full layout control, building from primitives | - -Both are non-breaking and can coexist. `messageView` remains the simpler option for most cases. diff --git a/apps/docs/alpha-docs/CONTEXT-MANAGEMENT.md b/apps/docs/alpha-docs/CONTEXT-MANAGEMENT.md deleted file mode 100644 index 78bb8a4..0000000 --- a/apps/docs/alpha-docs/CONTEXT-MANAGEMENT.md +++ /dev/null @@ -1,733 +0,0 @@ -# Context Management - -Advanced context window management for the YourGPT Copilot SDK. 
These features give you full control over what the AI sees, how long conversations stay alive, and how tokens are tracked and budgeted. - ---- - -## Table of Contents - -1. [Dual-Layer Message Store](#1-dual-layer-message-store) -2. [Message History & Compaction](#2-message-history--compaction) - - [Compaction Strategies](#compaction-strategies) - - [Config Reference](#config-reference) -3. [Token Counting](#3-token-counting) -4. [Session Persistence](#4-session-persistence) -5. [useContextStats](#5-usecontextstats) -6. [AgentLoop API](#6-agentloop-api) -7. [Tools β€” useTool / useTools / ToolDefinition](#7-tools--usetool--usetools--tooldefinition) - - [Deferred Tools](#deferred-tools) - - [Hidden Tools](#hidden-tools) - - [Fallback Tool Renderer](#fallback-tool-renderer) -8. [Message Grouping](#8-message-grouping) -9. [Server: compactSession](#9-server-compactsession) - ---- - -## 1. Dual-Layer Message Store - -Every conversation maintains two parallel views of the message history. - -| Layer | Type | Purpose | -| --------------------- | ------------------ | --------------------------------------------------------------------------------- | -| **Display layer** | `DisplayMessage[]` | Full immutable history. Rendered in the UI. Never shrinks. | -| **LLM context layer** | `LLMMessage[]` | Compacted/pruned form sent to the model on each request. Rebuilt on every render. 
| - -### Types - -```typescript -// Display layer β€” extends UIMessage for full backward-compat -interface DisplayMessage extends UIMessage { - timestamp: number; // Unix ms -} - -// Injected into displayMessages when compaction fires -interface CompactionMarker extends DisplayMessage { - role: "system"; - type: "compaction-marker"; - content: string; // Human-readable summary - summarizedMessageIds: string[]; - tokensSaved: number; -} - -// LLM context layer β€” what the model actually sees -interface LLMMessage { - role: "system" | "user" | "assistant" | "tool"; - content: string; - tool_calls?: ToolCall[]; - tool_call_id?: string; -} - -// Replaces a full tool result when old enough to prune -interface CompactedToolResult { - type: "compacted-tool-result"; - toolName: string; - toolCallId: string; - args: Record; - executedAt: number; - status: "success" | "error"; - originalSize: number; - summary: string; - extract?: string; // First 200 chars if no LLM summary -} -``` - -### Conversion helpers - -```typescript -import { - toDisplayMessage, - toLLMMessage, - toLLMMessages, - keepToolPairsAtomic, -} from "@yourgpt/copilot-sdk-react"; -``` - -`keepToolPairsAtomic` ensures that when you slice a window, an `assistant` message with `tool_calls` is never separated from its corresponding tool-result messages. - ---- - -## 2. 
Message History & Compaction - -### useMessageHistory - -```typescript -import { useMessageHistory } from "@yourgpt/copilot-sdk-react"; - -function MyChat() { - const { - displayMessages, // Full UI history - llmMessages, // Compacted LLM context - tokenUsage, // Live token estimate - isCompacting, // true while auto-compaction runs - compactionState, // Metadata & rolling summary - compactSession, // Manual trigger - addToWorkingMemory, - clearWorkingMemory, - resetSession, - } = useMessageHistory({ - strategy: "summary-buffer", - maxContextTokens: 128000, - compactionThreshold: 0.75, - compactionUrl: "/api/compact", - persistSession: true, - }); -} -``` - -#### Return type - -```typescript -interface UseMessageHistoryReturn { - displayMessages: DisplayMessage[]; - llmMessages: LLMMessage[]; - tokenUsage: TokenUsage; - isCompacting: boolean; - compactionState: SessionCompactionState; - compactSession: (instructions?: string) => Promise; - addToWorkingMemory: (fact: string) => void; - clearWorkingMemory: () => void; - resetSession: () => void; -} -``` - -### Compaction Strategies - -Four strategies are available via the `strategy` config field. - -#### `"none"` (default) - -No compaction. Zero-config, 100% backward-compatible. All messages sent verbatim. - -```typescript -useMessageHistory({ strategy: "none" }); -``` - -#### `"sliding-window"` - -Keeps only the most recent N tokens of history. Oldest messages are dropped when the token budget is exceeded. - -```typescript -useMessageHistory({ - strategy: "sliding-window", - maxContextTokens: 128000, - reserveForResponse: 4096, - recentBuffer: 10, // Always keep at least 10 recent messages - toolResultMaxChars: 10000, // Truncate large tool results -}); -``` - -#### `"selective-prune"` - -Removes tool-result messages that are older than `recentBuffer`, keeping the conversation skeleton (user/assistant turns) intact. Lighter than sliding-window β€” no token counting required. 
- -```typescript -useMessageHistory({ - strategy: "selective-prune", - recentBuffer: 10, -}); -``` - -#### `"summary-buffer"` - -Summarizes old messages into a rolling summary when usage exceeds `compactionThreshold`. The summary is injected into the LLM context as a system message. Requires a `/api/compact` endpoint (or custom `summarizer`). - -```typescript -useMessageHistory({ - strategy: "summary-buffer", - compactionThreshold: 0.75, // Compact at 75% of maxContextTokens - compactionUrl: "/api/compact", - recentBuffer: 10, - onCompaction: (event) => { - console.log( - `Compacted ${event.messagesSummarized} messages, saved ~${event.tokensSaved} tokens`, - ); - }, -}); -``` - -Custom summarizer (skip the HTTP round-trip): - -```typescript -useMessageHistory({ - strategy: "summary-buffer", - summarizer: async (messages) => { - const res = await myLLM.summarize(messages); - return res.text; - }, -}); -``` - -### Config Reference - -```typescript -interface MessageHistoryConfig { - strategy?: "none" | "sliding-window" | "summary-buffer" | "selective-prune"; - maxContextTokens?: number; // default: 128000 - reserveForResponse?: number; // default: 4096 - compactionThreshold?: number; // default: 0.75 - recentBuffer?: number; // default: 10 - toolResultMaxChars?: number; // default: 10000 (0 = no cap) - compactionUrl?: string; // required for summary-buffer - persistSession?: boolean; // default: false - storageKey?: string; // default: "copilot-session" - onCompaction?: (event: CompactionEvent) => void; - onTokenUsage?: (usage: TokenUsage) => void; -} -``` - -#### Per-call options - -```typescript -interface UseMessageHistoryOptions extends MessageHistoryConfig { - skipCompaction?: boolean; - tokenEstimation?: "fast" | "accurate" | "off"; // default: "fast" - summarizer?: (messages: LLMMessage[]) => Promise; -} -``` - -### Provider-level config - -Set defaults once in `` instead of each `useMessageHistory` call: - -```tsx - - - -``` - -### Working Memory - -Pin facts 
that survive all future compactions: - -```typescript -const { addToWorkingMemory, clearWorkingMemory } = useMessageHistory({ ... }); - -// Survives compaction -addToWorkingMemory("User is on the Pro plan. Account ID: acct_123"); - -// Remove all pinned facts -clearWorkingMemory(); -``` - -### Compaction event & token usage types - -```typescript -interface CompactionEvent { - type: "auto" | "manual"; - compactionCount: number; - messagesSummarized: number; - tokensSaved: number; - timestamp: number; -} - -interface TokenUsage { - current: number; // Estimated tokens in LLM context - max: number; // maxContextTokens - percentage: number; // current / max (0–1) - isApproaching: boolean; // percentage >= compactionThreshold -} - -interface SessionCompactionState { - rollingSummary: string | null; - lastCompactionAt: number | null; - compactionCount: number; - totalTokensSaved: number; - workingMemory: string[]; - displayMessageCount: number; - llmMessageCount: number; -} -``` - ---- - -## 3. Token Counting - -Two-tier estimation β€” pick the right trade-off between speed and accuracy. - -### Tier 1: Fast (zero dependencies) - -Uses a `chars / 3.5` heuristic. ~85–90% accurate for English. Always available, no bundle cost. - -```typescript -import { - estimateTokensFast, - estimateMessageTokens, - estimateMessagesTokens, -} from "@yourgpt/copilot-sdk-react"; - -const tokens = estimateTokensFast("Hello world"); // fast, synchronous -const msgTokens = estimateMessagesTokens(llmMessages); -``` - -### Tier 2: Accurate (lazy-loaded) - -Uses `gpt-tokenizer` with the `o200k_base` encoding. Lazy-loaded only when called β€” no upfront bundle cost. Falls back to Tier 1 if `gpt-tokenizer` is not installed. 
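That fallback path can be sketched as follows. This is a hypothetical illustration, not the SDK's source; the only details taken from this doc are the `encode` export of `gpt-tokenizer` and the Tier 1 `chars / 3.5` heuristic:

```typescript
// Tier 1: the chars / 3.5 heuristic described above
function estimateTokensFastSketch(text: string): number {
  return Math.ceil(text.length / 3.5);
}

// Tier 2: lazy-load gpt-tokenizer on first call; fall back to Tier 1 if absent
async function countTokensAccurateSketch(text: string): Promise<number> {
  const moduleName = "gpt-tokenizer";
  try {
    const { encode } = await import(moduleName); // deferred until first use
    return encode(text).length;
  } catch {
    return estimateTokensFastSketch(text); // package not installed: Tier 1
  }
}
```

Either branch returns a plain number, so callers never need to know which tier answered.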
- -```typescript -import { - countTokensAccurate, - countMessagesTokensAccurate, -} from "@yourgpt/copilot-sdk-react"; - -// Only loads gpt-tokenizer on first call -const tokens = await countTokensAccurate("Hello world"); -const msgTokens = await countMessagesTokensAccurate(llmMessages); -``` - -### Dispatcher - -```typescript -import { estimateTokens } from "@yourgpt/copilot-sdk-react"; -import type { TokenEstimationMode } from "@yourgpt/copilot-sdk-react"; - -// mode: "fast" | "accurate" | "off" -const tokens = estimateTokens(llmMessages, "fast"); -``` - -Set via `tokenEstimation` in `useMessageHistory`: - -```typescript -useMessageHistory({ tokenEstimation: "accurate" }); -``` - ---- - -## 4. Session Persistence - -Survive page reloads with zero extra code. - -```typescript -useMessageHistory({ - persistSession: true, - storageKey: "my-app-chat", // default: "copilot-session" -}); -``` - -| What is persisted | Where | -| ---------------------------------- | ---------------------------------------------------------- | -| `compactionState` (small metadata) | `localStorage` β€” sync, available immediately on cold start | -| `displayMessages` (can be large) | `IndexedDB` β€” async, avoids localStorage quota issues | - -Both are keyed by `storageKey`. Multiple chat instances can coexist with different keys. - -Clear everything (including storage) with: - -```typescript -const { resetSession } = useMessageHistory({ persistSession: true }); -await resetSession(); -``` - ---- - -## 5. useContextStats - -Live snapshot of context window usage. Updates reactively on every message send. 
-
-```typescript
-import { useContextStats } from "@yourgpt/copilot-sdk-react";
-
-function ContextMonitor() {
-  const {
-    contextUsage, // Full breakdown by bucket (richest field)
-    totalTokens, // Convenience: total estimated tokens
-    usagePercent, // Convenience: window fill 0–1
-    contextChars, // Characters contributed by AI context injections
-    toolCount, // Number of currently registered tools
-    messageCount, // Visible (non-system) messages
-    lastResponseUsage, // Token usage from last assistant message
-  } = useContextStats();
-
-  // Breakdown by bucket
-  const historyTokens = contextUsage?.breakdown.history.tokens;
-  const systemPercent = contextUsage?.breakdown.systemPrompt.percent;
-
-  return (
-    <div>
-      <div>{Math.round(usagePercent * 100)}% of context used</div>
-      <div>
-        {totalTokens} tokens / {toolCount} tools
-      </div>
-      {lastResponseUsage && (
-        <div>Last turn: {lastResponseUsage.total_tokens} tokens</div>
-      )}
-    </div>
- ); -} -``` - -### Return type - -```typescript -interface ContextStats { - contextUsage: ContextUsage | null; // null until first message - totalTokens: number; - usagePercent: number; // 0 until first message - contextChars: number; - toolCount: number; - messageCount: number; - lastResponseUsage: MessageTokenUsage | null; -} - -interface MessageTokenUsage { - prompt_tokens: number; - completion_tokens: number; - total_tokens: number; -} -``` - ---- - -## 6. AgentLoop API - -`AbstractAgentLoop` is the framework-agnostic core that manages the tool execution loop, approvals, and cancellation. - -```typescript -import { AbstractAgentLoop } from "@yourgpt/copilot-sdk"; - -const loop = new AbstractAgentLoop( - { - maxIterations: 20, - tools: [myTool], - }, - { - onToolExecutionsChange: (executions) => setExecutions(executions), - onToolApprovalRequired: (execution) => showApprovalModal(execution), - }, -); - -// Register/unregister tools at runtime -loop.registerTool(weatherTool); -loop.unregisterTool("old_tool"); - -// Execute tool calls returned by the LLM -const results = await loop.executeToolCalls(toolCallsFromLLM); - -// Cancel in-flight execution -loop.cancel(); -``` - -### Config - -```typescript -interface AgentLoopConfig { - maxIterations?: number; // default: 20 - maxExecutionHistory?: number; // default: 100 - tools?: ToolDefinition[]; -} -``` - -Tools use reference counting so React StrictMode double-invocations don't leave orphaned registrations. - ---- - -## 7. Tools β€” useTool / useTools / ToolDefinition - -### useTool - -Register a single client-side tool from a React component. Accepts both Zod schemas and JSON Schema. 
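Because `inputSchema` accepts either form, the SDK has to distinguish them at registration time. One plausible check is duck-typing on Zod's `safeParse` method; this is an assumption for illustration, and the real implementation may differ:

```typescript
// Duck-type check: Zod schemas carry a safeParse method, plain JSON Schema objects do not
function isZodSchema(schema: unknown): boolean {
  return (
    typeof schema === "object" &&
    schema !== null &&
    typeof (schema as { safeParse?: unknown }).safeParse === "function"
  );
}

const jsonSchema = { type: "object", properties: { path: { type: "string" } } };
console.log(isZodSchema(jsonSchema)); // false: plain JSON Schema passes through as-is
```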
- -```typescript -import { useTool } from "@yourgpt/copilot-sdk-react"; -import { z } from "zod"; - -function MyComponent() { - useTool({ - name: "navigate_to_page", - description: "Navigate to a page in the app", - inputSchema: z.object({ - path: z.string().describe("Route path to navigate to"), - }), - handler: async ({ path }) => { - router.push(path); - return { success: true }; - }, - // Optional UI rendering - render: ({ args, result }) => , - }); -} -``` - -### useTools (ToolSet pattern) - -Register multiple tools at once using the Vercel AI SDK `ToolSet` pattern: - -```typescript -import { useTools, tool } from "@yourgpt/copilot-sdk-react"; - -function MyApp() { - useTools({ - get_weather: tool({ - description: "Get weather for a location", - inputSchema: { - type: "object", - properties: { location: { type: "string" } }, - required: ["location"], - }, - handler: async ({ location }) => fetchWeather(location), - }), - open_modal: tool({ - description: "Open a UI modal", - inputSchema: z.object({ id: z.string() }), - handler: async ({ id }) => { - openModal(id); - return { success: true }; - }, - }), - }); -} -``` - -### UseToolConfig reference - -```typescript -interface UseToolConfig { - name: string; - description: string; - inputSchema: ZodSchema | JSONSchema; // Both accepted - handler: ( - params: TParams, - context?: ToolContext, - ) => Promise | ToolResponse; - - // UI - render?: (props: ToolRenderProps) => React.ReactNode; - title?: string | ((args: TParams) => string); - executingTitle?: string | ((args: TParams) => string); - completedTitle?: string | ((args: TParams) => string); - - // Behaviour - available?: boolean; // default: true - needsApproval?: boolean; - approvalMessage?: string | ((params: TParams) => string); - hidden?: boolean; // default: false β€” see Hidden Tools - aiResponseMode?: AIResponseMode; - aiContext?: string | ((result, args) => string); - resultConfig?: ToolResultConfig; - - // Loading strategy - deferLoading?: boolean; 
// see Deferred Tools
-  profiles?: string[];
-  searchKeywords?: string[];
-  group?: string;
-  category?: string;
-}
-```
-
-### Deferred Tools
-
-Large tool registries can bloat the LLM request payload. Mark tools with `deferLoading: true` to keep them out of the default request — they are auto-detected and injected only when the user's query semantically matches the tool.
-
-```typescript
-useTool({
-  name: "run_sql_query",
-  description: "Execute a SQL query against the database",
-  deferLoading: true, // Not sent on every request
-  searchKeywords: ["sql", "query", "database", "table"],
-  inputSchema: z.object({ query: z.string() }),
-  handler: async ({ query }) => db.execute(query),
-});
-```
-
-Auto-detection uses `description` + `searchKeywords` to score relevance against the current message. No configuration required.
-
-### Hidden Tools
-
-Register tools that execute silently — they run when called by the AI but are never shown in the tool execution UI.
-
-```typescript
-useTool({
-  name: "log_analytics_event",
-  description: "Log a UI analytics event",
-  hidden: true, // Never rendered in chat UI
-  inputSchema: z.object({ event: z.string(), data: z.record(z.unknown()) }),
-  handler: async ({ event, data }) => {
-    analytics.track(event, data);
-    return {};
-  },
-});
-```
-
-### Fallback Tool Renderer
-
-The `<CopilotChat>` component resolves a renderer for each tool execution using this priority chain:
-
-1. **`toolRenderers[toolName]`** — per-tool renderer map passed to `<CopilotChat>`
-2. **`tool.render`** — render function attached to the `ToolDefinition` via `useTool`
-3. **`mcpToolRenderer`** — catch-all for tools with `source: "mcp"`
-4. **`fallbackToolRenderer`** — catch-all for any tool not matched above
-5. **Built-in default** — generic tool execution card
-
-```tsx
-<CopilotChat
-  // Per-tool renderer map
-  toolRenderers={{
-    get_weather: ({ args, result }) => <WeatherCard data={result} />,
-  }}
-  // MCP catch-all
-  mcpToolRenderer={({ toolName, args, result }) => <McpToolCard name={toolName} result={result} />}
-  // Universal catch-all
-  fallbackToolRenderer={({ toolName, args, result }) => (
-    <pre>{JSON.stringify(result, null, 2)}</pre>
- )} -/> -``` - ---- - -## 8. Message Grouping - -`groupConsecutiveMessages` groups consecutive messages of the same role into visual clusters. Useful for building custom chat UIs where adjacent user or assistant messages should appear as one block. - -Available from the message-utils module: - -```typescript -import { - toLLMMessages, - toLLMMessage, - keepToolPairsAtomic, -} from "@yourgpt/copilot-sdk-react"; -``` - -Core invariant: **tool-call pairs are always atomic.** An assistant message with `tool_calls` is never separated from its corresponding tool-result messages during any windowing or pruning operation. - ---- - -## 9. Server: compactSession - -The `compactSession` utility powers the `/api/compact` endpoint for `summary-buffer` compaction. It calls Claude (defaults to `claude-haiku-4-5`) to produce a structured summary that preserves: - -- User goals and requests -- Technical decisions and chosen approaches -- Tool call outcomes (name, key args, result status) -- Errors and resolutions -- Pending tasks and current work state - -```typescript -// app/api/compact/route.ts -import { compactSession } from "@yourgpt/copilot-sdk/server"; - -export async function POST(req: Request) { - const { messages, existingSummary, workingMemory } = await req.json(); - - const { summary } = await compactSession({ - messages, - existingSummary, // Passed in subsequent compactions for rolling summaries - workingMemory, // User-pinned facts (addToWorkingMemory) - model: "claude-haiku-4-5", // default - maxSummaryTokens: 1024, // default - apiKey: process.env.ANTHROPIC_API_KEY, - }); - - return Response.json({ summary }); -} -``` - -### CompactSessionOptions - -```typescript -interface CompactSessionOptions { - messages: Array<{ role: string; content?: string | null }>; - existingSummary?: string | null; - workingMemory?: string[]; - model?: string; // default: "claude-haiku-4-5" - maxSummaryTokens?: number; // default: 1024 - apiKey?: string; // fallback: 
process.env.ANTHROPIC_API_KEY
-  apiBaseUrl?: string; // default: "https://api.anthropic.com"
-  fetchImpl?: typeof fetch;
-}
-```
-
----
-
-## Quick-start: Full Setup
-
-```tsx
-// app/layout.tsx
-import { CopilotProvider } from "@yourgpt/copilot-sdk-react";
-
-export default function RootLayout({ children }) {
-  return (
-    <CopilotProvider
-      runtimeUrl="/api/chat"
-      messageHistory={{
-        strategy: "summary-buffer",
-        persistSession: true,
-        onCompaction: (e) => console.log("Compacted:", e),
-      }}
-    >
-      {children}
-    </CopilotProvider>
-  );
-}
-```
-
-```tsx
-// components/ChatPanel.tsx
-import { useMessageHistory, useContextStats } from "@yourgpt/copilot-sdk-react";
-
-export function ChatPanel() {
-  const { tokenUsage, isCompacting, compactSession } = useMessageHistory();
-  const { usagePercent, toolCount } = useContextStats();
-
-  return (
-    <div>
-      <div>
-        {Math.round(usagePercent * 100)}% context used · {toolCount} tools
-      </div>
-      {tokenUsage.isApproaching && (
-        <button onClick={() => compactSession()}>Compact now</button>
-      )}
-      {isCompacting && <span>Summarizing history…</span>}
-    </div>
- ); -} -``` diff --git a/apps/docs/alpha-docs/CUSTOM-MESSAGE-VIEW.md b/apps/docs/alpha-docs/CUSTOM-MESSAGE-VIEW.md deleted file mode 100644 index 824804d..0000000 --- a/apps/docs/alpha-docs/CUSTOM-MESSAGE-VIEW.md +++ /dev/null @@ -1,173 +0,0 @@ -# Custom Message View - -> `release/alpha` β€” adds a `messageView` prop to `CopilotChat` / `Chat` that gives full control over how the message list is rendered. Inject custom UI, conditionally replace messages based on `metadata.type`, or build entirely custom layouts β€” without touching roles or message history. - ---- - -## Table of Contents - -1. [What Was Built](#what-was-built) -2. [Breaking Changes](#breaking-changes) -3. [New API](#new-api) -4. [Usage Examples](#usage-examples) -5. [How It Works Internally](#how-it-works-internally) -6. [Roadmap β€” Chat.\* Primitives](#roadmap--chat-primitives) - ---- - -## What Was Built - -A `messageView` prop on `` / `` that intercepts message list rendering. - -You receive: - -- **`messageElements`** β€” pre-rendered default SDK elements (one per message, may include `null` for filtered messages) -- **`messages`** β€” raw `ChatMessage[]` for conditional logic - -This closes the use case from **issue #74** (custom message types with dedicated renderers) without touching the `role` union or message history format. - ---- - -## Breaking Changes - -**None.** Fully additive. Existing `renderMessage`, `toolRenderers`, and all other props are unchanged. - ---- - -## New API - -### `messageView` prop - -Added to `ChatProps` (and flows through to `CopilotChat` via `...chatProps`). - -```ts -messageView?: { - children?: (props: { - /** Raw messages array */ - messages: ChatMessage[]; - /** Pre-rendered default SDK elements, one per message */ - messageElements: React.ReactNode[]; - }) => React.ReactNode; -}; -``` - ---- - -## Usage Examples - -### Inject custom UI below messages - -```tsx - ( - <> - {messageElements} -
-        <div>Powered by YourGPT</div>
- - ), - }} -/> -``` - -### Custom message types via `metadata.type` - -Inject a custom message into the chat (e.g. from a tool handler or agent state), then render it with your own component: - -```tsx - ( - <> - {messages.map((message, i) => { - if (message.metadata?.type === "plan") { - return ; - } - if (message.metadata?.type === "approval") { - return ; - } - return messageElements[i]; - })} - - ), - }} -/> -``` - -### Combine with agent state - -```tsx -function Chat() { - const agentState = useMyAgentState(); - - return ( - ( -
-      <div>
-        {messageElements}
-        {agentState?.steps && <AgentProgress steps={agentState.steps} />}
-      </div>
- ), - }} - /> - ); -} -``` - ---- - -## How It Works Internally - -**Files changed:** `types.ts`, `chat.tsx` (2 files, ~30 lines total) - -In `chat.tsx`, the `messages.map(...)` loop is wrapped in an IIFE that collects rendered elements into a `messageElements` array first, then either: - -- Passes them to `messageView.children({ messages, messageElements })` if provided -- Or renders them directly (existing behaviour) - -```tsx -{ - (() => { - const messageElements = messages.map((message, index) => { - // ...existing render logic unchanged... - }); - - return messageView?.children - ? messageView.children({ messages, messageElements }) - : messageElements; - })(); -} -``` - -The loading placeholder and scroll anchor remain outside this block and are unaffected. - -`connected-chat.tsx` required no changes β€” `messageView` flows through automatically via `...chatProps`. - ---- - -## `Chat.*` Primitives β€” Now Shipped - -The headless primitive API described here as a roadmap item has shipped in this same alpha. You can use it today: - -```tsx -import { ChatPrimitives as Chat } from "@yourgpt/copilot-sdk/ui"; - - - - {(message) => - message.metadata?.type === "plan" ? ( - - ) : ( - - ) - } - -; -``` - -`messageView` remains the simpler option for quick overrides. `Chat.MessageList` is the lower-level primitive when you need full layout control. Both work β€” no migration needed between them. - -β†’ Full primitives docs: [CHAT-PRIMITIVES.md](./CHAT-PRIMITIVES.md) diff --git a/apps/docs/alpha-docs/MESSAGE-ACTIONS.md b/apps/docs/alpha-docs/MESSAGE-ACTIONS.md deleted file mode 100644 index 7759157..0000000 --- a/apps/docs/alpha-docs/MESSAGE-ACTIONS.md +++ /dev/null @@ -1,247 +0,0 @@ -# Message Actions - -> `release/alpha` β€” adds a compound component API for registering floating action buttons on chat messages. Declarative, role-based, fully composable β€” same pattern as shadcn/Radix. - ---- - -## Table of Contents - -1. [What Was Built](#what-was-built) -2. 
[Breaking Changes](#breaking-changes)
-3. [New APIs](#new-apis)
-4. [Usage Examples](#usage-examples)
-5. [How It Works Internally](#how-it-works-internally)
-6. [Also Shipped — ChatPrimitives Namespace](#also-shipped--chatprimitives-namespace)
-
----
-
-## What Was Built
-
-A compound component API for adding floating action buttons to chat messages — copy, edit, feedback, or fully custom actions — declared as children of `<CopilotChat>`.
-
-Actions appear on hover, floating below the message bubble. Role-based — configure `assistant` and `user` separately.
-
----
-
-## Breaking Changes
-
-**None.** If no `<CopilotChat.MessageActions>` children are declared, nothing changes. Existing chat UI looks and behaves identically.
-
----
-
-## New APIs
-
-### Compound components
-
-```
-CopilotChat.MessageActions — registers actions for a role
-CopilotChat.CopyAction     — built-in copy to clipboard (with check feedback)
-CopilotChat.EditAction     — built-in edit (user messages, wired to inline edit)
-CopilotChat.FeedbackAction — built-in thumbs up/down
-CopilotChat.Action         — fully custom action
-```
-
-### Props
-
-```tsx
-// MessageActions
-role: "user" | "assistant"
-
-// CopyAction
-tooltip?: string
-className?: string
-
-// EditAction
-tooltip?: string
-className?: string
-
-// FeedbackAction
-onFeedback?: (message: ChatMessage, type: "helpful" | "not-helpful") => void
-tooltip?: string
-className?: string
-
-// Action
-id?: string
-icon: ReactNode
-tooltip: string
-onClick: (props: { message: ChatMessage }) => void
-hidden?: boolean | ((props: { message: ChatMessage }) => boolean)
-className?: string
-```
-
----
-
-## Usage Examples
-
-### Zero config — no actions (default)
-
-```tsx
-<CopilotChat />
-// No action buttons shown — clean slate
-```
-
----
-
-### Copy on assistant, Edit on user
-
-```tsx
-<CopilotChat>
-  <CopilotChat.MessageActions role="assistant">
-    <CopilotChat.CopyAction />
-  </CopilotChat.MessageActions>
-  <CopilotChat.MessageActions role="user">
-    <CopilotChat.EditAction />
-  </CopilotChat.MessageActions>
-</CopilotChat>
-```
-
----
-
-### Copy + Feedback on assistant
-
-```tsx
-<CopilotChat>
-  <CopilotChat.MessageActions role="assistant">
-    <CopilotChat.CopyAction />
-    <CopilotChat.FeedbackAction
-      onFeedback={(message, type) => {
-        sendFeedback({ messageId: message.id, type });
-      }}
-    />
-  </CopilotChat.MessageActions>
-</CopilotChat>
-```
-
----
-
-### Custom action
-
-```tsx
-<CopilotChat>
-  <CopilotChat.MessageActions role="assistant">
-    <CopilotChat.Action
-      icon={
-        <ShareIcon size={14} />
-      }
tooltip="Share" - onClick={({ message }) => share(message.content)} - /> - - -``` - ---- - -### Conditional action (hide based on message) - -```tsx - - - - } - tooltip="Report" - hidden={({ message }) => !message.content} - onClick={({ message }) => report(message.id)} - /> - - -``` - ---- - -### Disable all actions for a role - -```tsx - - - {/* empty β€” no actions for assistant */} - - -``` - ---- - -### Full setup β€” both roles - -```tsx - - - - log(msg.id, type)} /> - } - tooltip="Save" - onClick={({ message }) => save(message)} - /> - - - - - } - tooltip="Delete" - onClick={({ message }) => deleteMessage(message.id)} - /> - - -``` - ---- - -## How It Works Internally - -**Files created/modified:** - -- `message-actions-context.tsx` _(new)_ β€” React context storing registered actions per role -- `message-actions-compound.tsx` _(new)_ β€” compound components (`MessageActions`, `CopyAction`, `EditAction`, `FeedbackAction`, `Action`) -- `chat.tsx` β€” wrapped with `MessageActionsProvider`, compound components added to `Chat.*` namespace -- `default-message.tsx` β€” `FloatingActions` helper reads from context, renders on `group-hover/message` - -**Flow:** - -1. `` scans its children's props via `React.Children.forEach`, builds a `RegisteredAction[]` -2. `useLayoutEffect` registers them into `MessageActionsContext` -3. `DefaultMessage` renders `` for each message -4. `FloatingActions` calls `ctx.getActions(role)` β€” if empty, renders nothing - -**Copy action** has local state (`copiedId`) β€” switches icon to βœ“ for 1.5s then reverts. - -**Edit action** routes to the existing `startEdit()` function already in `DefaultMessage` β€” no duplication. - ---- - -## Also Shipped β€” `ChatPrimitives` Namespace - -A `ChatPrimitives` export was also added for headless composition: - -```tsx -import { ChatPrimitives as Chat } from "@yourgpt/copilot-sdk-ui"; - - - - {(message) => - message.metadata?.type === "plan" ? 
(
-        <PlanCard message={message} />
-      ) : (
-        <Chat.DefaultMessage message={message} />
-      )
-    }
-  </Chat.MessageList>
-</Chat>;
-```
-
-| Primitive             | Description                                  |
-| --------------------- | -------------------------------------------- |
-| `Chat.MessageList`    | Render-prop message list, reads from context |
-| `Chat.DefaultMessage` | Full SDK message bubble, use as fallback     |
-| `Chat.Header`         | Chat header bar                              |
-| `Chat.Welcome`        | Welcome screen (no messages)                 |
-| `Chat.Input`          | Composer / input box                         |
-| `Chat.ScrollAnchor`   | Auto-scroll anchor                           |
-| `Chat.Message`        | Low-level row wrapper                        |
-| `Chat.MessageAvatar`  | Avatar with fallback                         |
-| `Chat.MessageContent` | Content bubble, supports markdown            |
-| `Chat.MessageActions` | Action bar layout primitive                  |
-| `Chat.MessageAction`  | Single action button with tooltip            |
-| `Chat.Loader`         | Streaming indicator                          |
diff --git a/apps/docs/alpha-docs/SKILLS.md b/apps/docs/alpha-docs/SKILLS.md
deleted file mode 100644
index d2e02aa..0000000
--- a/apps/docs/alpha-docs/SKILLS.md
+++ /dev/null
@@ -1,518 +0,0 @@
-# Skills System
-
-Skills are instruction playbooks the AI loads on demand. They shape the model's **behavior** — separate from Tools, which perform actions.
-
-A skill is a Markdown file (or inline string) containing instructions. Skills can be:
-
-- **eager** — always injected into the system prompt
-- **auto** — listed in a catalog; the AI calls `load_skill` to retrieve them when relevant
-- **manual** — available via `load_skill` but not advertised in the catalog
-
----
-
-## Table of Contents
-
-1. [Concepts](#1-concepts)
-2. [Client-side: SkillProvider + useSkill](#2-client-side-skillprovider--useskill)
-3. [Server-side: loadSkills](#3-server-side-loadskills)
-4. [Skill File Format](#4-skill-file-format)
-5. [defineSkill helper](#5-defineskill-helper)
-6. [useSkillStatus](#6-useskillstatus)
-7. [Source precedence & collision detection](#7-source-precedence--collision-detection)
-8. [Type Reference](#8-type-reference)
-9. [Full Example](#9-full-example)
-
----
-
-## 1.
Concepts - -| Strategy | Behavior | -| -------- | --------------------------------------------------------------------------------------------------- | -| `eager` | Content prepended to system prompt on every request. Always active. | -| `auto` | Listed in the skill catalog appended to the system prompt. AI calls `load_skill({ name })` to load. | -| `manual` | Accessible via `load_skill` but not advertised β€” for internal/conditional skills. | - -The `load_skill` tool is automatically registered when a `` is present (client) or when `loadSkills()` builds the tools object (server). No manual wiring required. - ---- - -## 2. Client-side: SkillProvider + useSkill - -### SkillProvider - -Wrap your app (inside ``) to enable client-side skills: - -```tsx -import { SkillProvider, defineSkill } from "@yourgpt/copilot-sdk-react"; - -const brandVoice = defineSkill({ - name: "brand-voice", - description: "Ensures responses match our brand tone and terminology", - strategy: "eager", - source: { - type: "inline", - content: - "Always respond in a friendly, concise tone. Use 'we' not 'I'. Avoid jargon.", - }, -}); - -const codeReview = defineSkill({ - name: "code-review", - description: "Performs structured code reviews with actionable feedback", - strategy: "auto", // AI loads this on demand - source: { - type: "inline", - content: "When reviewing code: 1) Check for bugs first...", - }, -}); - -export default function App() { - return ( - - - - - - ); -} -``` - -> **Note:** `` only supports `inline` source skills client-side. For `file` or `url` sources, use `loadSkills()` on the server. - -### useSkill - -Register a skill from deep inside the component tree β€” it activates on mount and cleans up on unmount. 
- -```tsx -import { useSkill } from "@yourgpt/copilot-sdk-react"; - -function CheckoutPage() { - useSkill({ - name: "checkout-flow", - description: "Guides the user through the checkout process step by step", - strategy: "auto", - source: { - type: "inline", - content: ` -## Checkout Assistant - -When the user asks about checkout: -1. Confirm their cart items -2. Check for applicable promo codes -3. Walk through shipping options -4. Confirm payment method before submitting - `, - }, - }); - - return ; -} -``` - -The skill is automatically unregistered when `CheckoutPage` unmounts. - -**Dev warning:** If an inline skill exceeds 2000 characters in development, a console warning is shown. Large inline skills are sent on every request β€” consider using a server-side file skill instead. - ---- - -## 3. Server-side: loadSkills - -For `file` and `url` sources, or when you want server-controlled skill loading: - -```typescript -// app/api/chat/route.ts -import path from "path"; -import { loadSkills } from "@yourgpt/copilot-sdk/server"; - -export async function POST(req: Request) { - const { messages, __skills } = await req.json(); - - const { skills, buildSystemPrompt, tools, diagnostics } = await loadSkills({ - // Source 1: .md files from a local directory (highest precedence) - dir: path.join(process.cwd(), "skills"), - - // Source 2: Remote .md URLs - remoteUrls: ["https://cdn.myapp.com/skills/support-policy.md"], - - // Source 3: Inline skills forwarded from client (lowest precedence) - clientSkills: __skills ?? 
[], - }); - - // Log any name collisions - if (diagnostics.length) { - console.warn("Skill collisions:", diagnostics); - } - - const systemPrompt = buildSystemPrompt( - "You are a helpful assistant for Acme Corp.", - ); - - // Pass tools.load_skill to your AI provider - return streamText({ - model: anthropic("claude-sonnet-4-6"), - system: systemPrompt, - messages, - tools: { - ...tools, // includes load_skill - ...myOtherTools, - }, - }); -} -``` - -### loadSkills options - -```typescript -interface LoadSkillsOptions { - dir?: string; // Path to /skills directory (Node.js only) - remoteUrls?: string[]; // Remote .md URLs to fetch - clientSkills?: ClientInlineSkill[]; // Forwarded from useSkill() hooks -} -``` - -### loadSkills result - -```typescript -interface LoadSkillsResult { - skills: ResolvedSkill[]; - diagnostics: SkillDiagnostic[]; - - // Build system prompt: prepends eager content, appends auto catalog - buildSystemPrompt(basePrompt?: string): string; - - // Ready-to-use load_skill tool definition - tools: { - load_skill: { - description: string; - parameters: { ... }; - execute: (args: { name: string }) => Promise; - }; - }; -} -``` - -### Forwarding client skills to the server - -`` automatically syncs inline skills to `CopilotProvider`, which includes them in every API request as `__skills`. Read them in your route handler: - -```typescript -const { messages, __skills } = await req.json(); - -const { buildSystemPrompt, tools } = await loadSkills({ - dir: path.join(process.cwd(), "skills"), - clientSkills: __skills ?? [], // Inline skills from useSkill() hooks -}); -``` - ---- - -## 4. Skill File Format - -Skill files are Markdown with an optional YAML frontmatter block. - -```markdown ---- -name: code-review -description: Performs structured code reviews with actionable feedback -strategy: auto -version: 1.2.0 ---- - -## Code Review Instructions - -When asked to review code, follow this structure: - -1. 
**Correctness** β€” Check for logic errors and edge cases -2. **Security** β€” Flag injection risks, exposed secrets, insecure defaults -3. **Performance** β€” Note O(nΒ²) loops, unnecessary re-renders, missing indexes -4. **Style** β€” Suggest naming and structure improvements (non-blocking) - -Always include a summary section with an overall assessment. -``` - -### Frontmatter fields - -| Field | Required | Description | -| ------------- | ----------- | ------------------------------------------------------------------------------------- | -| `name` | Recommended | Skill name. Derived from filename if omitted (e.g. `code-review.md` β†’ `code-review`). | -| `description` | Recommended | One-line description shown in the AI's skill catalog. | -| `strategy` | No | `eager`, `auto`, or `manual`. Default: `auto`. | -| `version` | No | Informational version string. | - -### Directory layout - -``` -skills/ -β”œβ”€β”€ brand-voice.md # Flat .md file -β”œβ”€β”€ code-review.md -└── sql-expert/ - └── SKILL.md # Folder-based skill (use for multi-file skills) -``` - -For folder-based skills, place the main skill file at `/SKILL.md`. The folder name is used as the skill name unless overridden by frontmatter. - ---- - -## 5. defineSkill helper - -Type-safe factory for creating skill definitions. An identity function with TypeScript inference β€” same pattern as `useTool`. - -```typescript -import { defineSkill } from "@yourgpt/copilot-sdk-react"; -// or from server: -import { defineSkill } from "@yourgpt/copilot-sdk/server"; - -const mySkill = defineSkill({ - name: "api-docs-helper", - description: "Helps users understand and use the Acme API", - strategy: "auto", - version: "2.0.0", - source: { - type: "inline", - content: "When explaining API endpoints, always include example requests...", - }, -}); - -// Reuse in multiple providers - -``` - ---- - -## 6. 
useSkillStatus - -Observe the live skill registry state from any component inside ``: - -```tsx -import { useSkillStatus } from "@yourgpt/copilot-sdk-react"; - -function DebugPanel() { - const { skills, count, has } = useSkillStatus(); - - return ( -
-    <div>
-      <div>{count} skill(s) active</div>
-      {has("code-review") && <span>Code Review</span>}
-      <ul>
-        {skills.map((s) => (
-          <li key={s.name}>
-            {s.name} ({s.strategy ?? "auto"})
-          </li>
-        ))}
-      </ul>
-    </div>
- ); -} -``` - -### Return type - -```typescript -interface UseSkillStatusReturn { - skills: ResolvedSkill[]; // All currently registered skills - count: number; // Number of registered skills - has: (name: string) => boolean; // Check if a named skill is active -} -``` - ---- - -## 7. Source Precedence & Collision Detection - -When the same skill name appears in multiple sources, the higher-precedence source wins and a diagnostic is recorded. - -``` -server-dir > remote-url > client-inline -``` - -```typescript -const { diagnostics } = await loadSkills({ ... }); - -// diagnostics: SkillDiagnostic[] -// [{ -// type: "collision", -// name: "code-review", -// winner: "server-dir", -// loser: "client-inline", -// }] -``` - -This lets you safely override client-provided skills with authoritative server versions β€” for example, preventing users from injecting their own `brand-voice` skill that conflicts with your official one. - ---- - -## 8. Type Reference - -```typescript -type SkillStrategy = "eager" | "auto" | "manual"; - -type SkillSource = - | { type: "inline"; content: string } - | { type: "url"; url: string } - | { type: "file"; path: string }; - -interface SkillDefinition { - name: string; - description: string; - source: SkillSource; - strategy?: SkillStrategy; // default: "auto" - version?: string; -} - -interface ResolvedSkill extends SkillDefinition { - content: string; // Fully resolved content string -} - -interface ClientInlineSkill { - name: string; - description: string; - content: string; - strategy?: SkillStrategy; -} - -interface SkillDiagnostic { - type: "collision"; - name: string; - winner: "server-dir" | "remote-url" | "client-inline"; - loser: "server-dir" | "remote-url" | "client-inline"; -} - -interface LoadSkillResult { - name: string; - description: string; - strategy: SkillStrategy; - content: string; - source: "server-dir" | "remote-url" | "client-inline"; -} - -interface LoadSkillError { - error: string; -} -``` - ---- - -## 9. 
-Full Example
-
-### Project structure
-
-```
-skills/
-├── brand-voice.md   # eager — always active
-└── sql-expert.md    # auto — loaded on demand
-```
-
-```markdown
----
-name: brand-voice
-description: Acme Corp tone and style guide
-strategy: eager
----
-
-Always respond in a friendly, professional tone.
-Refer to the product as "Acme" (not "the platform").
-Use metric units. Avoid passive voice.
-```
-
-```markdown
----
-name: sql-expert
-description: Writes and explains SQL queries for our PostgreSQL schema
-strategy: auto
----
-
-## SQL Expert
-
-Our database uses PostgreSQL 15. Key tables:
-
-- users(id, email, plan, created_at)
-- orders(id, user_id, total, status, created_at)
-- products(id, name, price, stock)
-
-When writing queries:
-
-1. Always use parameterized queries ($1, $2...)
-2. Add LIMIT clauses to SELECT queries
-3. Explain the query in plain English after writing it
-```
-
-### API route
-
-```typescript
-// app/api/chat/route.ts
-import path from "path";
-import { loadSkills } from "@yourgpt/copilot-sdk/server";
-import { streamText } from "ai";
-import { anthropic } from "@ai-sdk/anthropic";
-
-export async function POST(req: Request) {
-  const { messages, __skills } = await req.json();
-
-  const { buildSystemPrompt, tools } = await loadSkills({
-    dir: path.join(process.cwd(), "skills"),
-    clientSkills: __skills ?? [],
-  });
-
-  return streamText({
-    model: anthropic("claude-sonnet-4-6"),
-    system: buildSystemPrompt("You are a helpful assistant for Acme Corp."),
-    messages,
-    tools,
-  }).toDataStreamResponse();
-}
-```
-
-### React app
-
-```tsx
-// app/layout.tsx
-import { CopilotProvider } from "@yourgpt/copilot-sdk-react";
-import { SkillProvider, defineSkill } from "@yourgpt/copilot-sdk-react";
-
-// Extra client-only skill (e.g. page-specific context)
-const checkoutSkill = defineSkill({
-  name: "checkout-helper",
-  description: "Helps with the checkout flow",
-  strategy: "auto",
-  source: { type: "inline", content: "When helping with checkout..." },
-});
-
-export default function Layout({ children }) {
-  return (
-    <CopilotProvider>
-      <SkillProvider skills={[checkoutSkill]}>
-        {children}
-      </SkillProvider>
-    </CopilotProvider>
-  );
-}
-```
-
-```tsx
-// app/dashboard/page.tsx — add a page-scoped skill
-import { useSkill, useSkillStatus } from "@yourgpt/copilot-sdk-react";
-
-export default function DashboardPage() {
-  useSkill({
-    name: "dashboard-context",
-    description: "Knows about the current dashboard state",
-    strategy: "eager",
-    source: {
-      type: "inline",
-      content:
-        "The user is viewing the analytics dashboard. Current date range: last 30 days.",
-    },
-  });
-
-  const { count } = useSkillStatus();
-
-  return (
-    <div>
-      <span>{count} skills active</span>
-    </div>
- ); -} -``` diff --git a/apps/docs/alpha-docs/STORAGE-ADAPTER.md b/apps/docs/alpha-docs/STORAGE-ADAPTER.md deleted file mode 100644 index d17b9e5..0000000 --- a/apps/docs/alpha-docs/STORAGE-ADAPTER.md +++ /dev/null @@ -1,166 +0,0 @@ -# Storage Adapter (Alpha) - -> **Status**: Alpha β€” API may change. Available since `@yourgpt/llm-sdk@1.5.0-alpha`. - -## Quick Start - -```ts -import { createRuntime } from "@yourgpt/llm-sdk"; -import { createAnthropic } from "@yourgpt/llm-sdk/anthropic"; -import { createYourGPT } from "@yourgpt/llm-sdk/yourgpt"; - -// 1. Create adapter (server-side only) -const yourgpt = createYourGPT({ - apiKey: process.env.YOURGPT_API_KEY, - widgetUid: process.env.YOURGPT_WIDGET_UID, - // endpoint defaults to https://api.yourgpt.ai - // Override for dev: endpoint: 'http://localhost:3000' -}); - -// 2. Plug into runtime -const runtime = createRuntime({ - provider: createAnthropic({ apiKey: process.env.ANTHROPIC_API_KEY }), - model: "claude-haiku-4-5", - storage: yourgpt, // ← enables automatic persistence -}); - -// 3. Endpoints are one-liners -app.post("/api/copilot/chat", async (req, res) => { - const result = await runtime.chat(req.body); - res.json(result); // includes threadId -}); - -app.post("/api/copilot/stream", async (req, res) => { - await runtime.stream(req.body).pipeToResponse(res); -}); - -// 4. 
Optional: file upload -app.post("/api/copilot/upload", async (req, res) => { - const result = await yourgpt.uploadFile(req.body); - res.json(result); -}); -``` - -## What Happens Automatically - -| Event | Without storage | With storage | -| --------------------------- | ----------------------- | ---------------------------------------------- | -| First message (no threadId) | Uses local thread ID | Creates session via API, returns real threadId | -| User sends message | Just forwarded to LLM | Saved to session, then forwarded | -| LLM responds | Just returned to client | Saved to session, then returned | -| Tool calls + results | Not persisted | Saved as tool messages | -| File attachment | Base64 in payload | Uploaded to storage, URL in payload | -| Session creation fails | N/A | Fallback local ID, chat continues | - -## Configuration - -### `createYourGPT(config)` - -| Option | Required | Default | Description | -| ----------- | -------- | ------------------------ | -------------------------- | -| `apiKey` | Yes | β€” | YourGPT API key | -| `widgetUid` | Yes | β€” | Widget UID (project scope) | -| `endpoint` | No | `https://api.yourgpt.ai` | API base URL | - -### `createRuntime({ storage })` - -The `storage` option accepts any `StorageAdapter`. The runtime calls: - -- `storage.createSession()` β€” when request has no threadId -- `storage.saveMessages()` β€” before + after LLM call -- `storage.uploadFile()` β€” not called by runtime (used via upload endpoint) - -### Environment Variables (Server) - -```env -# Required -YOURGPT_API_KEY=apk-your-key-here -YOURGPT_WIDGET_UID=your-widget-uid-here - -# Optional (defaults to production) -YOURGPT_API_ENDPOINT=https://api.yourgpt.ai - -# LLM provider -ANTHROPIC_API_KEY=sk-ant-... -``` - -## Client Setup - -No special client configuration needed for sessions. The client SDK automatically: - -1. Reads `threadId` from server response -2. Uses it for subsequent requests -3. 
-   Uses it as the local thread ID (single ID system)
-
-### File uploads (client)
-
-The `upload` prop handles all upload modes — string, object, or function:
-
-```tsx
-// Simple — just a URL:
-<CopilotChat upload="/api/copilot/upload" />
-
-// With auth headers:
-<CopilotChat upload={{
-  url: "/api/copilot/upload",
-  headers: () => ({ Authorization: `Bearer ${token}` }),
-}} />
-
-// Full custom:
-<CopilotChat upload={async (file) => {
-  const url = await myS3Upload(file);
-  return { type: 'image', url, mimeType: file.type, filename: file.name };
-}} />
-```
-
-## Custom StorageAdapter
-
-Implement the interface for any backend:
-
-```ts
-import type { StorageAdapter } from "@yourgpt/llm-sdk";
-
-const myStorage: StorageAdapter = {
-  async createSession(data) {
-    // Your DB call
-    return { id: "session-123" };
-  },
-  async saveMessages(sessionId, messages) {
-    // Your DB call
-  },
-  // Optional:
-  async uploadFile(file) {
-    // Your storage call
-    return { url: "https://..." };
-  },
-};
-
-const runtime = createRuntime({ provider, model, storage: myStorage });
-```
-
-## Error Handling
-
-- `createSession` failure → Fallback local ID, storage skipped, chat works
-- `saveMessages` failure → Logged, chat continues (fire-and-forget)
-- `uploadFile` failure → Error returned to client (4xx/5xx)
-- All errors are logged with `[Runtime]` prefix
-
-### `onError` callback
-
-```ts
-const yourgpt = createYourGPT({
-  apiKey,
-  widgetUid,
-  onError: (error, operation, params) => {
-    // operation: "createSession" | "saveMessages" | "uploadFile"
-    // params: { sessionId, messageCount, roles, filename, mimeType, ... }
-    logger.error(`[YourGPT:${operation}]`, error.message, params);
-  },
-});
-```
-
-## Alpha Notes
-
-- The `endpoint` option in `createYourGPT` will become internal in GA (defaults to production API)
-- `getSessions()` and `getMessages()` on StorageAdapter are reserved for future thread sync
-- File upload uses pre-signed URLs via `/copilot-sdk/getSignedUrl` — contract may change
diff --git a/apps/docs/alpha-docs/message-history-compaction.md b/apps/docs/alpha-docs/message-history-compaction.md
deleted file mode 100644
index bd38e99..0000000
--- a/apps/docs/alpha-docs/message-history-compaction.md
+++ /dev/null
@@ -1,72 +0,0 @@
-# Message History & Compaction
-
-Automatic context window management. Keeps long conversations within token limits without losing important history.
-
-## Strategies
-
-| Strategy          | What it does                                             |
-| ----------------- | -------------------------------------------------------- |
-| `none` (default)  | No compaction — current behavior, zero breaking changes  |
-| `sliding-window`  | Drop oldest messages when over token budget              |
-| `selective-prune` | Drop tool results from old turns, keep summaries         |
-| `summary-buffer`  | Summarize old turns into a rolling summary (recommended) |
-
-## Usage
-
-```tsx
-<CopilotProvider
-  messageHistory={{
-    strategy: "summary-buffer",
-    onCompaction: (e) => console.log("Compacted", e),
-    onTokenUsage: (u) => console.log(`${u.percentage * 100}% full`),
-  }}
->
-```
-
-## How It Works
-
-**Architecture**: `MessageHistoryBridge` (mounted inside `CopilotProvider`) wires `useMessageHistory` into `AbstractChat.buildRequest()` via `setRequestMessageTransform`.
- -``` -User sends message - β†’ AbstractChat.buildRequest() calls requestMessageTransform(allMessages) - β†’ Transform splits: historyMessages (before last user msg) + currentTurn (from last user msg) - β†’ buildSummaryBufferContext() compacts historyMessages only - β†’ currentTurn always kept verbatim (no broken tool call/result pairs) - β†’ Compacted history + currentTurn sent to API - β†’ In-memory store unchanged (full history kept for display) -``` - -**Auto-compaction**: When `tokenUsage.isApproaching = true` (threshold crossed), `runCompaction` summarizes old messages and updates `compactionState.rollingSummary`. The transform picks up the new summary automatically on next request. - -**UI indicators**: When compaction triggers, a system message (`type: "compaction-marker"`) is added to chat: - -- Loading: `"Compacting conversation…"` (while summarizing) -- Done: `"Conversation compacted β€” context window refreshed"` (permanent divider) - -## Token Counting - -Token usage is computed from the **full display history** (`toLLMMessages(displayMessages)`), not the already-pruned output. This ensures the threshold reflects actual accumulation. - -```tsx -// Access token usage directly -const { tokenUsage, compactionState } = useMessageHistory(); -// tokenUsage.current, .max, .percentage, .isApproaching -// compactionState.compactionCount, .rollingSummary, .totalTokensSaved -``` - -## Manual Compaction - -```tsx -const { compactSession } = useMessageHistory(); - -// Trigger manually with optional instructions -await compactSession("Focus on user preferences and key decisions"); -``` diff --git a/apps/docs/alpha-docs/skills-system.md b/apps/docs/alpha-docs/skills-system.md deleted file mode 100644 index 033e918..0000000 --- a/apps/docs/alpha-docs/skills-system.md +++ /dev/null @@ -1,63 +0,0 @@ -# Skills System - -On-demand instruction sets the AI can load at runtime β€” keeps the system prompt lean. 
-
-## Two Strategies
-
-| Strategy | Behavior                                                |
-| -------- | ------------------------------------------------------- |
-| `eager`  | Content injected into AI context immediately on mount   |
-| `auto`   | Listed in catalog; AI calls `load_skill(name)` to fetch |
-
-## API
-
-```tsx
-import { defineSkill, SkillProvider, useSkill } from "@yourgpt/copilot-sdk/react";
-
-// 1. Define a skill
-const diagnosticSkill = defineSkill({
-  name: "diagnostic",
-  description: "Troubleshoot chatbot issues: errors, limits, integrations",
-  strategy: "eager", // always in context
-  source: { type: "inline", content: "..." },
-});
-
-const trainingSkill = defineSkill({
-  name: "training",
-  description: "Manage knowledge base: add FAQs, URLs, files",
-  strategy: "auto", // AI loads on demand
-  source: { type: "inline", content: "..." },
-});
-
-// 2. Provide at app level
-<SkillProvider skills={[diagnosticSkill, trainingSkill]}>
-  {children}
-</SkillProvider>
-
-// 3. Register per-route (auto skills only active on that route)
-function TrainingLayout() {
-  useSkill(trainingSkill); // registers on mount, unregisters on unmount
-  return <Outlet />;
-}
-```
-
-## How It Works
-
-- **Eager**: `SkillProvider` renders an `EagerSkillInjector` which calls `useAIContext` with the skill content. Appears in the AI context as `__skill_eager__:<name>`.
-- **Auto**: A `load_skill` tool is registered. The catalog context lists available auto skills. AI calls `load_skill({ name })` → receives full content in tool result.
-- **Ref counting**: Multiple `useSkill` calls for the same skill are safe — the registry tracks ref counts and only unregisters when count hits 0.
-
-## Runtime Behavior
-
-```
-User navigates to /training
-  → useSkill(trainingSkill) mounts
-  → Catalog updates: "Available skills:\n- training: Manage knowledge base..."
- β†’ AI can now call load_skill({ name: "training" }) - -User navigates away - β†’ useSkill cleanup fires - β†’ training removed from catalog -``` diff --git a/apps/docs/content/docs/advanced/compaction.mdx b/apps/docs/content/docs/advanced/compaction.mdx index e548058..45a05c0 100644 --- a/apps/docs/content/docs/advanced/compaction.mdx +++ b/apps/docs/content/docs/advanced/compaction.mdx @@ -25,6 +25,42 @@ Every conversation maintains two parallel views: When compaction fires, a `CompactionMarker` is injected into the display layer so users can see where summarization happened β€” but the full history is never deleted. +### Types + +```typescript +interface DisplayMessage extends UIMessage { + timestamp: number; +} + +interface CompactionMarker extends DisplayMessage { + role: "system"; + type: "compaction-marker"; + content: string; + summarizedMessageIds: string[]; + tokensSaved: number; +} + +interface LLMMessage { + role: "system" | "user" | "assistant" | "tool"; + content: string; + tool_calls?: ToolCall[]; + tool_call_id?: string; +} +``` + +### Conversion Helpers + +```typescript +import { + toDisplayMessage, + toLLMMessage, + toLLMMessages, + keepToolPairsAtomic, +} from "@yourgpt/copilot-sdk/react"; +``` + +`keepToolPairsAtomic` ensures that when you slice a window, an `assistant` message with `tool_calls` is never separated from its corresponding tool-result messages. + --- ## useMessageHistory @@ -202,3 +238,15 @@ export async function POST(req: Request) { The summary preserves: user goals, technical decisions, tool call outcomes, errors and resolutions, pending tasks. + +--- + +## Message Grouping + +`groupConsecutiveMessages` groups consecutive messages of the same role into visual clusters β€” useful for custom chat UIs where adjacent user or assistant messages should appear as one block. 
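The grouping behavior described above can be pictured with a small standalone sketch. This is an illustration of the concept only, not the SDK's implementation; the role and message shapes are simplified assumptions:

```typescript
type Role = 'user' | 'assistant' | 'system';
interface Msg { id: string; role: Role; content: string }

// Cluster consecutive messages that share a role, mirroring what a
// groupConsecutiveMessages-style helper does for chat UIs.
function groupByRole(messages: Msg[]): { role: Role; messages: Msg[] }[] {
  const groups: { role: Role; messages: Msg[] }[] = [];
  for (const m of messages) {
    const last = groups[groups.length - 1];
    if (last && last.role === m.role) {
      last.messages.push(m); // extend the current cluster
    } else {
      groups.push({ role: m.role, messages: [m] }); // start a new cluster
    }
  }
  return groups;
}
```

Each resulting group can then be rendered as one visual block.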
+ +```typescript +import { groupConsecutiveMessages } from "@yourgpt/copilot-sdk/react"; +``` + +Tool-call pairs are always kept atomic: an assistant message with `tool_calls` is never separated from its corresponding tool-result messages during windowing or pruning. diff --git a/apps/docs/content/docs/headless/index.mdx b/apps/docs/content/docs/headless/index.mdx new file mode 100644 index 0000000..3f0e9dc --- /dev/null +++ b/apps/docs/content/docs/headless/index.mdx @@ -0,0 +1,192 @@ +--- +title: Headless Copilot +description: Build fully custom chat UIs using raw SDK primitives β€” no built-in components required +icon: Code +--- + +import { Callout } from 'fumadocs-ui/components/callout'; + +The Copilot SDK ships two layers: + +| Layer | What it is | When to use | +|---|---|---| +| **UI layer** | ``, ``, built-in components | Get up and running fast | +| **Headless layer** | Raw hooks, stream events, per-message state | Build your own UI from scratch | + +The headless layer gives you full control β€” your own message bubbles, your own tool indicators, your own thinking step visualiser, your own artifact previews β€” without forking or overriding SDK internals. + +--- + +## Philosophy + +The headless API follows a **primitives, not patterns** approach. Rather than shipping opinionated hooks like `useThinkingSteps()` that bake in a specific data shape, the SDK exposes two low-level primitives that let you compose anything: + +- **[`useCopilotEvent`](/docs/headless/use-copilot-event)** β€” subscribe to every raw stream chunk as it arrives +- **[`useMessageMeta`](/docs/headless/use-message-meta)** β€” a reactive per-message key-value store you shape yourself + +With just these two, you can build thinking step trackers, artifact stores, tool progress badges, plan approval flows, clarifying question UIs β€” entirely in your own code, with your own types. 
+
+---
+
+## Architecture
+
+```
+CopilotProvider
+├── sends messages → runtime API
+├── streams chunks → fires onStreamChunk for each
+│     message:delta, thinking:delta, tool:status,
+│     action:start/end, loop:iteration, loop:complete …
+│
+├── useCopilotEvent('thinking:delta', handler)
+│     └── your handler runs for each thinking chunk
+│
+└── useMessageMeta(messageId)
+      └── reactive store — write anything, read anywhere
+```
+
+---
+
+## Getting started
+
+Install the SDK if you haven't already:
+
+```bash
+npm install @yourgpt/copilot-sdk
+```
+
+Wrap your app with `CopilotProvider` as normal — the headless hooks work inside any component under the provider:
+
+```tsx
+import { CopilotProvider } from '@yourgpt/copilot-sdk/react'
+
+export default function App() {
+  return (
+    <CopilotProvider>
+      <MyChat />
+    </CopilotProvider>
+  )
+}
+```
+
+Then use `useCopilotEvent` and `useMessageMeta` anywhere inside to build whatever you need.
+
+---
+
+## Full example — custom streaming chat
+
+A complete headless chat UI using only SDK primitives:
+
+```tsx
+import { useRef, useState } from 'react'
+import {
+  useCopilot,
+  useCopilotEvent,
+  useMessageMeta,
+} from '@yourgpt/copilot-sdk/react'
+
+// ── Message component ─────────────────────────────────────────────
+interface MyMeta {
+  thinkingText?: string
+  toolsRunning?: string[]
+}
+
+function Message({ message }) {
+  // Read custom metadata we wrote during streaming
+  const { meta } = useMessageMeta<MyMeta>(message.id)
+
+  return (
+    <div>
+      {/* Thinking indicator */}
+      {meta.thinkingText && (
+        <div className="thinking">{meta.thinkingText}</div>
+      )}
+
+      {/* Active tool badges */}
+      {meta.toolsRunning?.map(name => (
+        <span key={name}>⚙ {name}</span>
+      ))}
+
+      {/* Message content */}
+      <div>{message.content}</div>
+    </div>
+  )
+}
+
+// ── Chat component ────────────────────────────────────────────────
+function MyChat() {
+  const { messages, sendMessage, status } = useCopilot()
+  const [input, setInput] = useState('')
+
+  // Track which message is currently streaming
+  const activeMessageId = useRef<string | null>(null)
+
+  // Capture message start
+  useCopilotEvent('message:start', (e) => {
+    activeMessageId.current = e.id
+  })
+
+  // Build thinking text per message
+  useCopilotEvent('thinking:delta', (e) => {
+    // accumulate via the messageMeta store (see the callout below for the per-message pattern)
+  })
+
+  // Track tool execution
+  useCopilotEvent('action:start', (e) => {
+    if (!e.messageId) return
+    // write to the message's meta store via the same messageMeta pattern
+  })
+
+  return (
+    <div>
+      {messages.map(m => <Message key={m.id} message={m} />)}
+      <input value={input} onChange={e => setInput(e.target.value)} />
+      <button onClick={() => sendMessage(input)}>Send</button>
+    </div>
+ ) +} +``` + + +For writing metadata from event handlers that fire before a component mounts, +use the `messageMeta` store directly from `useCopilot()`: + +```tsx +const { messageMeta } = useCopilot() +useCopilotEvent('thinking:delta', (e) => { + messageMeta.updateMeta(e.messageId!, prev => ({ + ...prev, + thinkingText: (prev.thinkingText ?? '') + e.content + })) +}) +``` + + +--- + +## Available stream events + +| Event | When it fires | Key fields | +|---|---|---| +| `message:start` | New assistant message begins | `id` | +| `message:delta` | Text token arrives | `content`, `messageId` | +| `message:end` | Message turn complete | `messageId` | +| `thinking:delta` | Thinking/reasoning token | `content`, `messageId` | +| `action:start` | Server tool begins | `id`, `name`, `messageId` | +| `action:args` | Tool args streamed | `id`, `args`, `messageId` | +| `action:end` | Server tool completes | `id`, `name`, `result`, `messageId` | +| `tool:status` | Client tool status change | `id`, `name`, `status`, `messageId` | +| `tool:result` | Client tool result | `id`, `name`, `result`, `messageId` | +| `source:add` | Knowledge base source cited | `source`, `messageId` | +| `loop:iteration` | Agent loop step | `iteration`, `maxIterations`, `messageId` | +| `loop:complete` | Agent loop finished | `iterations`, `maxIterationsReached`, `messageId` | +| `*` | Every event | (all fields) | + +--- + +## Next steps + +- [`useCopilotEvent`](/docs/headless/use-copilot-event) β€” full API reference and recipes +- [`useMessageMeta`](/docs/headless/use-message-meta) β€” full API reference and recipes +- [Custom Message View](/docs/customizations/custom-message-view) β€” intercept rendering inside `` +- [Chat Primitives](/docs/customizations/chat-primitives) β€” lower-level layout components diff --git a/apps/docs/content/docs/headless/meta.json b/apps/docs/content/docs/headless/meta.json new file mode 100644 index 0000000..2271489 --- /dev/null +++ 
b/apps/docs/content/docs/headless/meta.json
@@ -0,0 +1,5 @@
+{
+  "title": "Headless",
+  "icon": "Code",
+  "pages": ["index", "use-copilot-event", "use-message-meta"]
+}
diff --git a/apps/docs/content/docs/headless/use-copilot-event.mdx b/apps/docs/content/docs/headless/use-copilot-event.mdx
new file mode 100644
index 0000000..1c802b8
--- /dev/null
+++ b/apps/docs/content/docs/headless/use-copilot-event.mdx
@@ -0,0 +1,176 @@
+---
+title: useCopilotEvent
+description: Subscribe to raw stream chunks as they arrive — build any custom real-time UI
+---
+
+import { Callout } from 'fumadocs-ui/components/callout';
+
+`useCopilotEvent` subscribes to the raw stream chunks flowing through the SDK pipeline. Every token, tool call, thinking delta, and loop iteration fires an event — your handler decides what to do with it.
+
+```ts
+import { useCopilotEvent } from '@yourgpt/copilot-sdk/react'
+```
+
+---
+
+## Signature
+
+```ts
+function useCopilotEvent<T extends StreamChunkType | '*'>(
+  eventType: T,
+  handler: (chunk: ChunkOfType<T>) => void
+): void
+```
+
+### Parameters
+
+| Parameter | Type | Description |
+|---|---|---|
+| `eventType` | `StreamChunkType \| '*'` | The event type to listen for, or `'*'` for all |
+| `handler` | `(chunk) => void` | Called for each matching chunk. Handler identity is stable via ref — no re-subscription on re-render |
+
+<Callout>
+The handler is always called with the **latest** version via a ref — you don't need to wrap it in `useCallback`. The hook only resubscribes when `eventType` changes.
+</Callout>
+
+---
+
+## Event types
+
+```ts
+type StreamChunkType =
+  | 'message:start'   // assistant message begins — { id }
+  | 'message:delta'   // text token — { content }
+  | 'message:end'     // message turn complete
+  | 'thinking:delta'  // reasoning token — { content }
+  | 'action:start'    // server tool starts — { id, name, hidden? }
+  | 'action:args'     // server tool args streamed — { id, args }
+  | 'action:end'      // server tool finishes — { id, name, result?, error? }
+  | 'tool:status'     // client tool status — { id, name, status }
+  | 'tool:result'     // client tool result — { id, name, result }
+  | 'source:add'      // knowledge source cited — { source }
+  | 'loop:iteration'  // agent loop step — { iteration, maxIterations }
+  | 'loop:complete'   // agent loop done — { iterations, maxIterationsReached? }
+  | '*'               // every event
+```
+
+All chunks also include `messageId?: string` — the ID of the assistant message being streamed.
+
+---
+
+## Examples
+
+### Thinking text accumulator
+
+```tsx
+function ThinkingDisplay({ messageId }: { messageId: string }) {
+  const [thinking, setThinking] = useState('')
+
+  useCopilotEvent('thinking:delta', (e) => {
+    if (e.messageId !== messageId) return
+    setThinking(prev => prev + e.content)
+  })
+
+  if (!thinking) return null
+  return <pre>{thinking}</pre>
+}
+```
+
+---
+
+### Tool execution badge
+
+Show a live "Searching…" indicator while a tool runs:
+
+```tsx
+function ToolBadge() {
+  const [activeTool, setActiveTool] = useState<string | null>(null)
+
+  useCopilotEvent('action:start', (e) => setActiveTool(e.name))
+  useCopilotEvent('action:end', (e) => setActiveTool(null))
+
+  if (!activeTool) return null
+  return (
+    <span className="badge">{activeTool.replace(/_/g, ' ')}…</span>
+  )
+}
+```
+
+---
+
+### Agent loop progress bar
+
+```tsx
+function LoopProgress() {
+  const [progress, setProgress] = useState(0)
+
+  useCopilotEvent('loop:iteration', (e) => {
+    setProgress(e.iteration / e.maxIterations)
+  })
+
+  useCopilotEvent('loop:complete', () => setProgress(0))
+
+  if (!progress) return null
+  return <progress value={progress} max={1} />
+}
+```
+
+---
+
+### Artifact tracking
+
+Parse `create_artifact` tool results and store them per message:
+
+```tsx
+function useArtifactTracker() {
+  const { messageMeta } = useCopilot()
+
+  useCopilotEvent('action:end', (e) => {
+    if (e.name !== 'create_artifact' || !e.result || !e.messageId) return
+    messageMeta.updateMeta(e.messageId, prev => ({
+      ...prev,
+      artifacts: [...((prev.artifacts as unknown[]) ?? []), e.result]
+    }))
+  })
+}
+```
+
+---
+
+### Catch-all debug logger
+
+```tsx
+useCopilotEvent('*', (e) => {
+  console.log(`[stream] ${e.type}`, e)
+})
+```
+
+---
+
+### Writing to message meta from event handlers
+
+For writing metadata while streaming (before the message component mounts), use the `messageMeta` store directly from `useCopilot()`:
+
+```tsx
+const { messageMeta } = useCopilot()
+
+useCopilotEvent('thinking:delta', (e) => {
+  if (!e.messageId) return
+  messageMeta.updateMeta(e.messageId, prev => ({
+    ...prev,
+    thinking: ((prev.thinking as string) ?? '') + e.content
+  }))
+})
+```
+
+Then read it in your message component with [`useMessageMeta`](/docs/headless/use-message-meta).
+
+---
+
+## Notes
+
+- Handlers run **synchronously** during the streaming loop — keep them fast. Defer expensive work with `setTimeout` or `startTransition` if needed.
+- The hook is a no-op outside of a `CopilotProvider`.
+- Multiple components can subscribe to the same event type independently.
diff --git a/apps/docs/content/docs/headless/use-message-meta.mdx b/apps/docs/content/docs/headless/use-message-meta.mdx
new file mode 100644
index 0000000..3a5fee0
--- /dev/null
+++ b/apps/docs/content/docs/headless/use-message-meta.mdx
@@ -0,0 +1,230 @@
+---
+title: useMessageMeta
+description: Reactive per-message custom metadata — attach any data to a message ID and react to changes
+---
+
+import { Callout } from 'fumadocs-ui/components/callout';
+
+`useMessageMeta` is a reactive per-message key-value store. Attach any shape of data to a message ID — any component reading that message ID will re-render when the data changes.
+
+```ts
+import { useMessageMeta } from '@yourgpt/copilot-sdk/react'
+```
+
+It's the storage layer for the headless system. Pair it with [`useCopilotEvent`](/docs/headless/use-copilot-event) to write data during streaming, then read it in your message components.
+
+---
+
+## Signature
+
+```ts
+function useMessageMeta<T extends Record<string, unknown>>(
+  messageId: string | undefined
+): UseMessageMetaReturn<T>
+```
+
+### Parameters
+
+| Parameter | Type | Description |
+|---|---|---|
+| `messageId` | `string \| undefined` | The message to attach metadata to. Pass `undefined` for a no-op instance (safe for conditional calls) |
+
+### Returns
+
+```ts
+interface UseMessageMetaReturn<T> {
+  /** Current metadata. Empty object if nothing set yet. */
+  meta: T
+
+  /** Replace metadata entirely */
+  setMeta: (meta: T) => void
+
+  /** Update metadata with an updater function */
+  updateMeta: (updater: (prev: T) => T) => void
+}
+```
+
+---
+
+## Examples
+
+### Thinking steps
+
+Define your own shape — the SDK doesn't dictate it:
+
+```tsx
+interface MyMeta {
+  thinking?: string
+  isThinking?: boolean
+}
+
+// Writer — inside useCopilotEvent handler
+const { messageMeta } = useCopilot()
+
+useCopilotEvent('thinking:delta', (e) => {
+  if (!e.messageId) return
+  messageMeta.updateMeta(e.messageId, prev => ({
+    ...prev,
+    thinking: ((prev.thinking as string) ??
'') + e.content, + isThinking: true, + })) +}) + +useCopilotEvent('message:end', (e) => { + messageMeta.updateMeta(e.messageId!, prev => ({ + ...prev, + isThinking: false, + })) +}) + +// Reader β€” in your message component +function AssistantMessage({ message }) { + const { meta } = useMessageMeta(message.id) + + return ( +
+    <div>
+      {meta.isThinking && <ThinkingIndicator />}
+      <div>{message.content}</div>
+    </div>
+ ) +} +``` + +--- + +### Artifact storage + +```tsx +interface MyMeta { + artifacts?: Array<{ type: string; title: string; content: unknown }> +} + +// Writer +useCopilotEvent('action:end', (e) => { + if (e.name !== 'create_artifact' || !e.messageId) return + messageMeta.updateMeta(e.messageId, prev => ({ + ...prev, + artifacts: [...((prev.artifacts as unknown[]) ?? []), e.result] + })) +}) + +// Reader +function Message({ message }) { + const { meta } = useMessageMeta(message.id) + + return ( +
+    <div>
+      <div>{message.content}</div>
+      {meta.artifacts?.map((a, i) => (
+        <ArtifactCard key={i} {...a} />
+      ))}
+    </div>
+  )
+}
+```
+
+---
+
+### Plan approval state
+
+```tsx
+interface MyMeta {
+  planStatus?: 'pending' | 'approved' | 'rejected'
+  plan?: { summary: string; steps: Step[] }
+}
+
+// Writer — called from your tool render function
+const { updateMeta } = useMessageMeta(messageId)
+updateMeta(prev => ({ ...prev, planStatus: 'pending', plan: planData }))
+
+// Reader
+function Message({ message }) {
+  const { meta, updateMeta } = useMessageMeta<MyMeta>(message.id)
+
+  if (meta.planStatus === 'pending') {
+    return (
+      <PlanApprovalCard
+        plan={meta.plan}
+        onApprove={() => updateMeta(p => ({ ...p, planStatus: 'approved' }))}
+        onReject={() => updateMeta(p => ({ ...p, planStatus: 'rejected' }))}
+      />
+    )
+  }
+
+  return <div>{message.content}</div>
+}
+```
+
+---
+
+### Tool progress per message
+
+```tsx
+interface MyMeta {
+  activeTools?: Record<string, 'running' | 'done' | 'error'>
+}
+
+useCopilotEvent('action:start', (e) => {
+  if (!e.messageId) return
+  messageMeta.updateMeta(e.messageId, prev => ({
+    ...prev,
+    activeTools: { ...((prev.activeTools as object) ?? {}), [e.name]: 'running' }
+  }))
+})
+
+useCopilotEvent('action:end', (e) => {
+  if (!e.messageId) return
+  messageMeta.updateMeta(e.messageId, prev => ({
+    ...prev,
+    activeTools: {
+      ...((prev.activeTools as object) ?? {}),
+      [e.name]: e.error ? 'error' : 'done'
+    }
+  }))
+})
+
+// Reader
+function Message({ message }) {
+  const { meta } = useMessageMeta<MyMeta>(message.id)
+  const running = Object.entries(meta.activeTools ?? {}).filter(([, v]) => v === 'running')
+
+  return (
+    <div>
+      {running.map(([name]) => <span key={name}>⚙ {name}</span>)}
+      <div>{message.content}</div>
+    </div>
+ ) +} +``` + +--- + +## Using `messageMeta` directly + +For writing from event handlers (where a hook can't be called), access the store directly from `useCopilot()`: + +```tsx +const { messageMeta } = useCopilot() + +// Read +const meta = messageMeta.getMeta(messageId) + +// Write +messageMeta.setMeta(messageId, { myKey: 'value' }) + +// Update +messageMeta.updateMeta(messageId, prev => ({ ...prev, count: (prev.count as number ?? 0) + 1 })) +``` + + +`messageMeta` is the same store instance that `useMessageMeta` reads from β€” writing via `messageMeta.updateMeta()` will cause all `useMessageMeta(messageId)` consumers for that ID to re-render. + + +--- + +## Notes + +- `meta` is always an object β€” never `null` or `undefined`. It starts as `{}`. +- Metadata is **in-memory only** β€” it resets when the provider unmounts. For persistence, sync to your own storage inside event handlers. +- Passing `undefined` as `messageId` returns a no-op instance β€” safe to call unconditionally. +- Multiple components can read the same `messageId` β€” all will re-render on any write. 
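The store surface described above can be sketched in plain TypeScript. This is an illustration only and not the SDK's implementation: the real store also notifies React subscribers so `useMessageMeta(messageId)` consumers re-render on writes.

```typescript
type Meta = Record<string, unknown>;

// Bare-bones per-message meta store with the same getMeta / setMeta /
// updateMeta surface as the SDK's messageMeta object (minus React
// subscription and change notification).
function createMetaStore() {
  const store = new Map<string, Meta>();
  return {
    getMeta(id: string): Meta {
      return store.get(id) ?? {}; // always an object, never undefined
    },
    setMeta(id: string, meta: Meta): void {
      store.set(id, meta);
    },
    updateMeta(id: string, updater: (prev: Meta) => Meta): void {
      store.set(id, updater(store.get(id) ?? {}));
    },
  };
}
```

Syncing to persistent storage, as suggested in the notes, would amount to mirroring each `set` into your own backend.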
diff --git a/apps/docs/content/docs/meta.json b/apps/docs/content/docs/meta.json index fe80033..e340fe4 100644 --- a/apps/docs/content/docs/meta.json +++ b/apps/docs/content/docs/meta.json @@ -14,6 +14,7 @@ "skills", "generative-ui", "customizations", + "headless", "advanced", "---LLM SDK---", "llm-sdk", diff --git a/apps/docs/content/docs/providers/meta.json b/apps/docs/content/docs/providers/meta.json index 4dc98f8..2dd320e 100644 --- a/apps/docs/content/docs/providers/meta.json +++ b/apps/docs/content/docs/providers/meta.json @@ -8,6 +8,7 @@ "xai", "openrouter", "fireworks", + "togetherai", "ollama", "custom-provider", "fallback" diff --git a/apps/docs/content/docs/providers/togetherai.mdx b/apps/docs/content/docs/providers/togetherai.mdx new file mode 100644 index 0000000..320484e --- /dev/null +++ b/apps/docs/content/docs/providers/togetherai.mdx @@ -0,0 +1,171 @@ +--- +title: Together AI +description: Cost-effective open-source model inference β€” Llama, DeepSeek, Qwen, Gemma and more +--- + +import { Callout } from 'fumadocs-ui/components/callout'; + +[Together AI](https://together.ai) is a high-performance inference platform for open-source models. It offers fast, scalable serving for Llama, DeepSeek, Qwen, Gemma, Mistral and many others through an OpenAI-compatible API. + +--- + +## Setup + +### 1. Install packages + +```bash +npm install @yourgpt/copilot-sdk @yourgpt/llm-sdk openai +``` + + +Together AI uses an OpenAI-compatible API, so the `openai` package is the only peer dependency needed. + + +### 2. Get API key + +Sign up and get your API key at [api.together.xyz/settings/api-keys](https://api.together.xyz/settings/api-keys). + +### 3. Add environment variable + +```bash title=".env.local" +TOGETHER_API_KEY=your-key-here +``` + +### 4. 
Streaming API route + +```ts title="app/api/chat/route.ts" +import { streamText } from '@yourgpt/llm-sdk'; +import { togetherai } from '@yourgpt/llm-sdk/togetherai'; + +export async function POST(req: Request) { + const { messages } = await req.json(); + + const result = await streamText({ + model: togetherai('meta-llama/Llama-3.3-70B-Instruct-Turbo'), + system: 'You are a helpful assistant.', + messages, + }); + + return result.toTextStreamResponse(); +} +``` + +### 5. Generate text + +```ts +import { generateText } from '@yourgpt/llm-sdk'; +import { togetherai } from '@yourgpt/llm-sdk/togetherai'; + +const result = await generateText({ + model: togetherai('deepseek-ai/DeepSeek-V3'), + prompt: 'Explain quantum entanglement simply.', +}); + +console.log(result.text); +``` + +--- + +## Available Models + +```ts +// DeepSeek +togetherai('deepseek-ai/DeepSeek-V3') // 128K ctx, tools +togetherai('deepseek-ai/DeepSeek-V3.1') // 128K ctx, tools +togetherai('deepseek-ai/DeepSeek-R1') // reasoning model + +// Llama +togetherai('meta-llama/Llama-3.3-70B-Instruct-Turbo') // 131K ctx, fast +togetherai('meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo') // 130K ctx +togetherai('meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo') +togetherai('meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo') + +// Qwen +togetherai('Qwen/Qwen3.5-397B-A17B') // 262K ctx +togetherai('Qwen/Qwen3.5-9B') + +// Gemma +togetherai('google/gemma-4-31B-it') + +// Kimi +togetherai('moonshotai/Kimi-K2.5') // 262K ctx + +// GLM +togetherai('zai-org/GLM-5.1') // 202K ctx +``` + +Any model ID listed on [together.ai/models](https://api.together.xyz/models) works. + +--- + +## Configuration + +```ts +import { togetherai } from '@yourgpt/llm-sdk/togetherai'; + +// Explicit API key +const model = togetherai('meta-llama/Llama-3.3-70B-Instruct-Turbo', { + apiKey: 'your-key', +}); + +// Custom base URL (e.g. 
self-hosted or proxy) +const model = togetherai('meta-llama/Llama-3.3-70B-Instruct-Turbo', { + baseURL: 'https://my-proxy.example.com/v1', +}); +``` + +--- + +## Tool Calling + +Many Together AI models support tool calling: + +```ts +import { generateText, tool } from '@yourgpt/llm-sdk'; +import { togetherai } from '@yourgpt/llm-sdk/togetherai'; +import { z } from 'zod'; + +const result = await generateText({ + model: togetherai('meta-llama/Llama-3.3-70B-Instruct-Turbo'), + prompt: 'What is the weather in Miami?', + tools: { + getWeather: tool({ + description: 'Get weather for a city', + parameters: z.object({ city: z.string() }), + execute: async ({ city }) => ({ temperature: 82, condition: 'sunny' }), + }), + }, + maxSteps: 5, +}); +``` + + +`deepseek-ai/DeepSeek-R1` is a reasoning model and does not support tool calling. Use `DeepSeek-V3` or a Llama model for tool use. + + +--- + +## With Copilot UI + +```tsx title="app/providers.tsx" +'use client'; + +import { CopilotProvider } from '@yourgpt/copilot-sdk/react'; + +export function Providers({ children }: { children: React.ReactNode }) { + return ( + + {children} + + ); +} +``` + +--- + +## Next Steps + +- [Fireworks](/docs/providers/fireworks) - Another fast open-source model platform +- [OpenRouter](/docs/providers/openrouter) - Access 500+ models with one API key +- [Fallback Chain](/docs/providers/fallback) - Automatic failover between providers +- [generateText()](/docs/llm-sdk) - Full LLM SDK reference diff --git a/apps/docs/content/docs/skills.mdx b/apps/docs/content/docs/skills.mdx deleted file mode 100644 index 1f7b48d..0000000 --- a/apps/docs/content/docs/skills.mdx +++ /dev/null @@ -1,174 +0,0 @@ ---- -title: Skills -description: Instruction playbooks the AI loads on demand β€” keeps the system prompt lean -icon: AiBook ---- - -import { Callout } from 'fumadocs-ui/components/callout'; -import { Tab, Tabs } from 'fumadocs-ui/components/tabs'; - -Skills are Markdown instruction sets the AI loads at 
runtime. Instead of putting everything in the system prompt, skills are injected only when relevant. - -| Strategy | Behaviour | -|----------|-----------| -| `eager` | Always injected into every request | -| `auto` | Listed in a catalog β€” AI calls `load_skill` to fetch when relevant | -| `manual` | Available via `load_skill` but not advertised | - ---- - -## Adding Skills - - - - -Use `` and `useSkill` inside your React app. Supports `inline` content only. - -```tsx -import { SkillProvider, defineSkill, useSkill } from "@yourgpt/copilot-sdk/react"; - -const brandVoice = defineSkill({ - name: "brand-voice", - description: "Ensures responses match our brand tone", - strategy: "eager", - source: { type: "inline", content: "Always respond in a friendly, concise tone." }, -}); - -// App-level β€” always active -export default function App() { - return ( - - - - - - ); -} - -// Page-level β€” active only while this component is mounted -function CheckoutPage() { - useSkill({ - name: "checkout-flow", - description: "Guides the user through checkout", - strategy: "auto", - source: { type: "inline", content: "1. Confirm cart 2. Check promo codes 3. Shipping..." }, - }); - - return ; -} -``` - - -`useSkill` auto-unregisters when the component unmounts β€” great for route-scoped skills. - - - - - -Use `loadSkills()` in your API route. Supports `inline`, `file` (local `.md` files), and `url` (remote `.md`) sources. 
- -``` -skills/ -β”œβ”€β”€ brand-voice.md # eager β€” always active -└── sql-expert.md # auto β€” loaded on demand -``` - -```typescript -// app/api/chat/route.ts -import path from "path"; -import { loadSkills } from "@yourgpt/copilot-sdk/server"; - -export async function POST(req: Request) { - const { messages } = await req.json(); - - const { buildSystemPrompt, tools } = await loadSkills({ - dir: path.join(process.cwd(), "skills"), // local .md files - remoteUrls: ["https://cdn.myapp.com/skills/policy.md"], // remote URLs - }); - - return runtime.stream({ - system: buildSystemPrompt("You are a helpful assistant."), - messages, - tools, - }); -} -``` - - - - -`` automatically forwards client skills to the server via `__skills`. Pass them to `loadSkills` to merge with server-side skills. - -```typescript -export async function POST(req: Request) { - const { messages, __skills } = await req.json(); - - const { buildSystemPrompt, tools } = await loadSkills({ - dir: path.join(process.cwd(), "skills"), // server files take precedence - clientSkills: __skills ?? [], // client inline skills merged in - }); - - return runtime.stream({ system: buildSystemPrompt("..."), messages, tools }); -} -``` - -**Source precedence** (highest β†’ lowest): -``` -server-dir > remote-url > client-inline -``` - - - - ---- - -## defineSkill - -Type-safe helper for creating reusable skill definitions: - -```typescript -const mySkill = defineSkill({ - name: "api-docs-helper", - description: "Helps users understand the Acme API", - strategy: "auto", - source: { type: "inline", content: "When explaining endpoints, include example requests..." 
}, -}); - -// Reuse anywhere - -``` - ---- - -## useSkillStatus - -Observe the live skill registry from any component: - -```tsx -const { skills, count, has } = useSkillStatus(); - -// count β€” number of active skills -// has("name") β€” check if a skill is active -// skills β€” full list of ResolvedSkill[] -``` - ---- - -## Type Reference - -```typescript -type SkillStrategy = "eager" | "auto" | "manual"; - -type SkillSource = - | { type: "inline"; content: string } - | { type: "file"; path: string } // server only - | { type: "url"; url: string }; // server only - -interface SkillDefinition { - name: string; - description: string; - source: SkillSource; - strategy?: SkillStrategy; // default: "auto" - version?: string; -} -``` diff --git a/apps/docs/content/docs/skills/client.mdx b/apps/docs/content/docs/skills/client.mdx new file mode 100644 index 0000000..392a358 --- /dev/null +++ b/apps/docs/content/docs/skills/client.mdx @@ -0,0 +1,172 @@ +--- +title: Client-side Skills +description: Register skills from React components using SkillProvider and useSkill +--- + +import { Callout } from 'fumadocs-ui/components/callout'; + + +**Beta** β€” This feature is in **alpha**. APIs may change before stable release. + + +Register skills from your React app using ``, `useSkill`, and `defineSkill`. + + +Client-side skills only support `inline` source. For `file` or `url` sources, use [server-side skills](/docs/skills/server). + + +--- + +## SkillProvider + +Wrap your app inside `` to enable client-side skills: + +```tsx +import { SkillProvider, defineSkill } from "@yourgpt/copilot-sdk/react"; + +const brandVoice = defineSkill({ + name: "brand-voice", + description: "Ensures responses match our brand tone and terminology", + strategy: "eager", + source: { + type: "inline", + content: "Always respond in a friendly, concise tone. Use 'we' not 'I'. 
Avoid jargon.", + }, +}); + +const codeReview = defineSkill({ + name: "code-review", + description: "Performs structured code reviews with actionable feedback", + strategy: "auto", + source: { + type: "inline", + content: "When reviewing code: 1) Check for bugs first...", + }, +}); + +export default function App() { + return ( + + + + + + ); +} +``` + +--- + +## useSkill + +Register a skill from deep inside the component tree. It activates on mount and cleans up on unmount β€” useful for page-scoped skills. + +```tsx +import { useSkill } from "@yourgpt/copilot-sdk/react"; + +function CheckoutPage() { + useSkill({ + name: "checkout-flow", + description: "Guides the user through the checkout process step by step", + strategy: "auto", + source: { + type: "inline", + content: ` +## Checkout Assistant + +When the user asks about checkout: +1. Confirm their cart items +2. Check for applicable promo codes +3. Walk through shipping options +4. Confirm payment method before submitting + `, + }, + }); + + return ; +} +``` + +The skill is automatically unregistered when `CheckoutPage` unmounts. + + +If an inline skill exceeds 2000 characters in development, a console warning is shown. Large inline skills are sent on every request β€” consider using a server-side file skill instead. + + +--- + +## defineSkill + +Type-safe factory for creating skill definitions. 
An identity function with TypeScript inference: + +```typescript +import { defineSkill } from "@yourgpt/copilot-sdk/react"; + +const mySkill = defineSkill({ + name: "api-docs-helper", + description: "Helps users understand and use the Acme API", + strategy: "auto", + version: "2.0.0", + source: { + type: "inline", + content: "When explaining API endpoints, always include example requests...", + }, +}); + +// Reuse in multiple providers + +``` + +--- + +## useSkillStatus + +Observe the live skill registry from any component inside ``: + +```tsx +import { useSkillStatus } from "@yourgpt/copilot-sdk/react"; + +function DebugPanel() { + const { skills, count, has } = useSkillStatus(); + + return ( +
    <div>
      <p>{count} skill(s) active</p>
      {has("code-review") && <span>Code Review</span>}
      <ul>
        {skills.map((s) => (
          <li key={s.name}>
            {s.name} ({s.strategy ?? "auto"})
          </li>
        ))}
      </ul>
    </div>
+ ); +} +``` + +### Return type + +```typescript +interface UseSkillStatusReturn { + skills: ResolvedSkill[]; // All currently registered skills + count: number; // Number of registered skills + has: (name: string) => boolean; // Check if a named skill is active +} +``` + +--- + +## Type Reference + +```typescript +type SkillStrategy = "eager" | "auto" | "manual"; + +interface SkillDefinition { + name: string; + description: string; + source: { type: "inline"; content: string }; + strategy?: SkillStrategy; // default: "auto" + version?: string; +} +``` diff --git a/apps/docs/content/docs/skills/index.mdx b/apps/docs/content/docs/skills/index.mdx new file mode 100644 index 0000000..da9f77e --- /dev/null +++ b/apps/docs/content/docs/skills/index.mdx @@ -0,0 +1,130 @@ +--- +title: Skills +description: Instruction playbooks that shape the AI's behavior β€” loaded on demand +icon: AiBook +--- + +import { Callout } from 'fumadocs-ui/components/callout'; + + +**Beta** β€” This feature is in **alpha**. APIs may change before stable release. + + +Skills are instruction playbooks the AI loads on demand. They shape the model's **behavior** β€” separate from [Tools](/docs/tools), which perform actions. + +A skill is a Markdown file (or inline string) containing instructions. The AI only loads a skill when it's relevant to the user's query β€” keeping the system prompt lean. + +--- + +## Strategies + +| Strategy | Behavior | +|----------|----------| +| `eager` | Content prepended to system prompt on every request. Always active. | +| `auto` | Listed in the skill catalog. AI calls `load_skill` to retrieve when relevant. | +| `manual` | Accessible via `load_skill` but not advertised in the catalog. For internal/conditional skills. | + +The `load_skill` tool is automatically registered when a `` is present (client) or when `loadSkills()` builds the tools object (server). 
**No manual wiring required.** + +--- + +## Skills vs Tools + +| | Skills | Tools | +|--|--------|-------| +| **Purpose** | Shape behavior | Perform actions | +| **Payload** | Markdown instructions | Code that runs | +| **Loading** | Into system prompt | Into tool list | +| **Example** | "Always respond in formal English" | `get_weather({ city })` | + +--- + +## Quick Example + +```tsx +import { SkillProvider, defineSkill } from "@yourgpt/copilot-sdk/react"; + +const brandVoice = defineSkill({ + name: "brand-voice", + description: "Ensures responses match our brand tone", + strategy: "eager", // always active + source: { + type: "inline", + content: "Always respond in a friendly, concise tone. Use 'we' not 'I'.", + }, +}); + +const sqlExpert = defineSkill({ + name: "sql-expert", + description: "Writes and explains SQL queries", + strategy: "auto", // AI loads when the user asks about SQL + source: { + type: "inline", + content: "When writing SQL: always use parameterized queries...", + }, +}); + +export default function App() { + return ( + + + + + + ); +} +``` + +--- + +## Skill File Format + +Skill files are Markdown with optional YAML frontmatter: + +```markdown +--- +name: code-review +description: Performs structured code reviews with actionable feedback +strategy: auto +version: 1.2.0 +--- + +## Code Review Instructions + +When asked to review code, follow this structure: + +1. **Correctness** β€” Check for logic errors and edge cases +2. **Security** β€” Flag injection risks, exposed secrets, insecure defaults +3. **Performance** β€” Note O(nΒ²) loops, unnecessary re-renders, missing indexes +4. **Style** β€” Suggest naming and structure improvements (non-blocking) +``` + +### Frontmatter fields + +| Field | Required | Description | +|-------|----------|-------------| +| `name` | Recommended | Skill name. Derived from filename if omitted. | +| `description` | Recommended | One-line description shown in the AI's skill catalog. 
| +| `strategy` | No | `eager`, `auto`, or `manual`. Default: `auto`. | +| `version` | No | Informational version string. | + +--- + +## Directory Layout + +``` +skills/ +β”œβ”€β”€ brand-voice.md # Flat .md file +β”œβ”€β”€ code-review.md +└── sql-expert/ + └── SKILL.md # Folder-based skill +``` + +For folder-based skills, place the main content at `/SKILL.md`. The folder name is used as the skill name unless overridden by frontmatter. + +--- + +## Next Steps + +- [Client-side Skills](/docs/skills/client) β€” `SkillProvider`, `useSkill`, `useSkillStatus` +- [Server-side Skills](/docs/skills/server) β€” `loadSkills()`, file/URL sources, collision detection diff --git a/apps/docs/content/docs/skills/meta.json b/apps/docs/content/docs/skills/meta.json new file mode 100644 index 0000000..e765055 --- /dev/null +++ b/apps/docs/content/docs/skills/meta.json @@ -0,0 +1,5 @@ +{ + "title": "Skills", + "icon": "AiBook", + "pages": ["client", "server"] +} diff --git a/apps/docs/content/docs/skills/server.mdx b/apps/docs/content/docs/skills/server.mdx new file mode 100644 index 0000000..817d532 --- /dev/null +++ b/apps/docs/content/docs/skills/server.mdx @@ -0,0 +1,183 @@ +--- +title: Server-side Skills +description: Load skills from files and URLs on the server with loadSkills() +--- + +import { Callout } from 'fumadocs-ui/components/callout'; + + +**Beta** β€” This feature is in **alpha**. APIs may change before stable release. + + +For `file` and `url` skill sources, or when you need server-controlled skill loading, use `loadSkills()` in your API route. 
+ +--- + +## Basic Setup + +```typescript +// app/api/chat/route.ts +import path from "path"; +import { loadSkills } from "@yourgpt/copilot-sdk/server"; + +export async function POST(req: Request) { + const { messages, __skills } = await req.json(); + + const { buildSystemPrompt, tools } = await loadSkills({ + // Source 1: .md files from a local directory (highest precedence) + dir: path.join(process.cwd(), "skills"), + + // Source 2: Remote .md URLs + remoteUrls: ["https://cdn.myapp.com/skills/support-policy.md"], + + // Source 3: Inline skills forwarded from client (lowest precedence) + clientSkills: __skills ?? [], + }); + + return streamText({ + model: anthropic("claude-sonnet-4-6"), + system: buildSystemPrompt("You are a helpful assistant for Acme Corp."), + messages, + tools: { + ...tools, // includes load_skill automatically + ...myOtherTools, + }, + }).toDataStreamResponse(); +} +``` + +--- + +## loadSkills Options + +```typescript +interface LoadSkillsOptions { + dir?: string; // Path to /skills directory (Node.js only) + remoteUrls?: string[]; // Remote .md URLs to fetch + clientSkills?: ClientInlineSkill[]; // Forwarded from useSkill() hooks +} +``` + +## loadSkills Result + +```typescript +interface LoadSkillsResult { + skills: ResolvedSkill[]; + diagnostics: SkillDiagnostic[]; + + // Build system prompt: prepends eager content, appends auto catalog + buildSystemPrompt(basePrompt?: string): string; + + // Ready-to-use load_skill tool definition + tools: { + load_skill: ToolDefinition; + }; +} +``` + +--- + +## Forwarding Client Skills + +`` automatically syncs inline skills to `CopilotProvider`, which includes them in every API request as `__skills`. Read them in your route: + +```typescript +const { messages, __skills } = await req.json(); + +const { buildSystemPrompt, tools } = await loadSkills({ + dir: path.join(process.cwd(), "skills"), + clientSkills: __skills ?? 
[], +}); +``` + +--- + +## Source Precedence & Collision Detection + +When the same skill name appears in multiple sources, the higher-precedence source wins: + +``` +server-dir > remote-url > client-inline +``` + +```typescript +const { diagnostics } = await loadSkills({ ... }); + +// [{ +// type: "collision", +// name: "code-review", +// winner: "server-dir", +// loser: "client-inline", +// }] +if (diagnostics.length) { + console.warn("Skill collisions:", diagnostics); +} +``` + +This lets you safely override client-provided skills with authoritative server versions. + +--- + +## Full Example + +### Project structure + +``` +skills/ +β”œβ”€β”€ brand-voice.md # eager β€” always active +└── sql-expert.md # auto β€” loaded on demand +``` + +### API route + +```typescript +// app/api/chat/route.ts +import path from "path"; +import { loadSkills } from "@yourgpt/copilot-sdk/server"; +import { streamText } from "ai"; +import { anthropic } from "@ai-sdk/anthropic"; + +export async function POST(req: Request) { + const { messages, __skills } = await req.json(); + + const { buildSystemPrompt, tools } = await loadSkills({ + dir: path.join(process.cwd(), "skills"), + clientSkills: __skills ?? 
[], + }); + + return streamText({ + model: anthropic("claude-sonnet-4-6"), + system: buildSystemPrompt("You are a helpful assistant for Acme Corp."), + messages, + tools, + }).toDataStreamResponse(); +} +``` + +--- + +## Type Reference + +```typescript +type SkillStrategy = "eager" | "auto" | "manual"; + +type SkillSource = + | { type: "inline"; content: string } + | { type: "url"; url: string } + | { type: "file"; path: string }; + +interface ResolvedSkill { + name: string; + description: string; + content: string; + strategy?: SkillStrategy; + version?: string; +} + +interface SkillDiagnostic { + type: "collision"; + name: string; + winner: "server-dir" | "remote-url" | "client-inline"; + loser: "server-dir" | "remote-url" | "client-inline"; +} +``` diff --git a/apps/docs/content/docs/tools/agentic-loop.mdx b/apps/docs/content/docs/tools/agentic-loop.mdx index ae9b862..4fb6962 100644 --- a/apps/docs/content/docs/tools/agentic-loop.mdx +++ b/apps/docs/content/docs/tools/agentic-loop.mdx @@ -455,6 +455,47 @@ Listen to agentic loop events: --- +## AbstractAgentLoop (Framework-Agnostic) + +For non-React or custom setups, use `AbstractAgentLoop` directly to manage tool execution, approvals, and cancellation. 
+ +```typescript +import { AbstractAgentLoop } from "@yourgpt/copilot-sdk"; + +const loop = new AbstractAgentLoop( + { + maxIterations: 20, + tools: [myTool], + }, + { + onToolExecutionsChange: (executions) => setExecutions(executions), + onToolApprovalRequired: (execution) => showApprovalModal(execution), + }, +); + +// Register/unregister tools at runtime +loop.registerTool(weatherTool); +loop.unregisterTool("old_tool"); + +// Execute tool calls returned by the LLM +const results = await loop.executeToolCalls(toolCallsFromLLM); + +// Cancel in-flight execution +loop.cancel(); +``` + +```typescript +interface AgentLoopConfig { + maxIterations?: number; // default: 20 + maxExecutionHistory?: number; // default: 100 + tools?: ToolDefinition[]; +} +``` + +Tools use reference counting so React StrictMode double-invocations don't leave orphaned registrations. + +--- + ## Next Steps - [Tool Approval](/docs/tool-approval) - Add human confirmation diff --git a/apps/docs/next.config.mjs b/apps/docs/next.config.mjs index 44b73c1..b001bcb 100644 --- a/apps/docs/next.config.mjs +++ b/apps/docs/next.config.mjs @@ -55,12 +55,6 @@ const config = { destination: "/docs/chat/storage/session", permanent: true, }, - // ── headless/ β†’ customizations/headless ────────────────────────────── - { - source: "/docs/headless", - destination: "/docs/customizations/headless", - permanent: true, - }, // ── tools subpages removed ─────────────────────────────────────────── { source: "/docs/tools/deferred-tools", @@ -88,17 +82,6 @@ const config = { destination: "/docs/chat/ui", permanent: true, }, - // ── skills subpages removed ────────────────────────────────────────── - { - source: "/docs/skills/client", - destination: "/docs/skills", - permanent: true, - }, - { - source: "/docs/skills/server", - destination: "/docs/skills", - permanent: true, - }, ]; }, async rewrites() { diff --git a/examples/playground/package.json b/examples/playground/package.json index 57d9aa9..8f2c944 100644 --- 
a/examples/playground/package.json +++ b/examples/playground/package.json @@ -27,8 +27,8 @@ "@radix-ui/react-switch": "^1.2.3", "@radix-ui/react-tabs": "^1.1.13", "@tailwindcss/typography": "^0.5.19", - "@yourgpt/copilot-sdk": "workspace:*", - "@yourgpt/llm-sdk": "workspace:*", + "@yourgpt/copilot-sdk": "^2.1.8", + "@yourgpt/llm-sdk": "^2.1.8", "class-variance-authority": "^0.7.1", "clsx": "^2.1.1", "cmdk": "^1.1.1", diff --git a/examples/togetherai-demo/.env.example b/examples/togetherai-demo/.env.example new file mode 100644 index 0000000..7f54a06 --- /dev/null +++ b/examples/togetherai-demo/.env.example @@ -0,0 +1,3 @@ +# Together AI API Key +# Get your key at https://api.together.xyz/settings/api-keys +TOGETHER_API_KEY=your-key-here diff --git a/examples/togetherai-demo/app/api/chat/route.ts b/examples/togetherai-demo/app/api/chat/route.ts new file mode 100644 index 0000000..c19fbf8 --- /dev/null +++ b/examples/togetherai-demo/app/api/chat/route.ts @@ -0,0 +1,63 @@ +import { createRuntime } from "@yourgpt/llm-sdk"; +import { createTogetherAI } from "@yourgpt/llm-sdk/togetherai"; +import { DEFAULT_MODEL } from "@/lib/models"; + +const SYSTEM_PROMPT = `You are a helpful AI assistant powered by Together AI. +You have access to many different open-source AI models and can help with a wide variety of tasks. +Be concise, helpful, and friendly in your responses.`; + +export async function POST(request: Request) { + try { + const url = new URL(request.url); + + // Get model from query param + const model = url.searchParams.get("model") || DEFAULT_MODEL; + + // Get API key from environment + const apiKey = process.env.TOGETHER_API_KEY; + + if (!apiKey) { + return Response.json( + { + error: + "Together AI API key not configured. 
Set TOGETHER_API_KEY in .env.local", + }, + { status: 401 }, + ); + } + + // Create Together AI provider + const together = createTogetherAI({ apiKey }); + + // Create runtime with the selected model + const runtime = createRuntime({ + provider: together, + model, + systemPrompt: SYSTEM_PROMPT, + debug: process.env.NODE_ENV === "development", + }); + + const response = await runtime.handleRequest(request); + return response; + } catch (error) { + console.error("[Chat Route] Error:", error); + return Response.json( + { error: error instanceof Error ? error.message : "Unknown error" }, + { status: 500 }, + ); + } +} + +export async function GET(request: Request) { + const url = new URL(request.url); + const model = url.searchParams.get("model") || DEFAULT_MODEL; + + const hasEnvKey = !!process.env.TOGETHER_API_KEY; + + return Response.json({ + status: "ok", + provider: "togetherai", + model, + configured: hasEnvKey, + }); +} diff --git a/examples/togetherai-demo/app/globals.css b/examples/togetherai-demo/app/globals.css new file mode 100644 index 0000000..f330389 --- /dev/null +++ b/examples/togetherai-demo/app/globals.css @@ -0,0 +1,86 @@ +@import "tailwindcss"; +@import "tw-animate-css"; + +/* Include SDK package for Tailwind class detection */ +@source "../node_modules/@yourgpt/copilot-sdk/dist/**/*.{js,ts,jsx,tsx}"; + +@custom-variant dark (&:is(.dark *)); + +@theme inline { + --color-background: var(--background); + --color-foreground: var(--foreground); + --font-sans: var(--font-geist-sans); + --font-mono: var(--font-geist-mono); + --color-ring: var(--ring); + --color-input: var(--input); + --color-border: var(--border); + --color-destructive: var(--destructive); + --color-accent-foreground: var(--accent-foreground); + --color-accent: var(--accent); + --color-muted-foreground: var(--muted-foreground); + --color-muted: var(--muted); + --color-secondary-foreground: var(--secondary-foreground); + --color-secondary: var(--secondary); + --color-primary-foreground: 
var(--primary-foreground); + --color-primary: var(--primary); + --color-popover-foreground: var(--popover-foreground); + --color-popover: var(--popover); + --color-card-foreground: var(--card-foreground); + --color-card: var(--card); + --radius-sm: calc(var(--radius) - 4px); + --radius-md: calc(var(--radius) - 2px); + --radius-lg: var(--radius); + --radius-xl: calc(var(--radius) + 4px); +} + +:root { + --radius: 0.625rem; + --background: oklch(0.985 0.002 247.858); + --foreground: oklch(0.145 0 0); + --card: oklch(1 0 0); + --card-foreground: oklch(0.145 0 0); + --popover: oklch(1 0 0); + --popover-foreground: oklch(0.145 0 0); + --primary: oklch(0.6 0.2 250); + --primary-foreground: oklch(0.985 0 0); + --secondary: oklch(0.97 0 0); + --secondary-foreground: oklch(0.205 0 0); + --muted: oklch(0.97 0 0); + --muted-foreground: oklch(0.556 0 0); + --accent: oklch(0.97 0 0); + --accent-foreground: oklch(0.205 0 0); + --destructive: oklch(0.577 0.245 27.325); + --border: oklch(0.922 0 0); + --input: oklch(0.922 0 0); + --ring: oklch(0.6 0.2 250); +} + +.dark { + --background: oklch(0.145 0 0); + --foreground: oklch(0.985 0 0); + --card: oklch(0.205 0 0); + --card-foreground: oklch(0.985 0 0); + --popover: oklch(0.205 0 0); + --popover-foreground: oklch(0.985 0 0); + --primary: oklch(0.6 0.2 250); + --primary-foreground: oklch(0.985 0 0); + --secondary: oklch(0.269 0 0); + --secondary-foreground: oklch(0.985 0 0); + --muted: oklch(0.269 0 0); + --muted-foreground: oklch(0.708 0 0); + --accent: oklch(0.269 0 0); + --accent-foreground: oklch(0.985 0 0); + --destructive: oklch(0.704 0.191 22.216); + --border: oklch(1 0 0 / 10%); + --input: oklch(1 0 0 / 15%); + --ring: oklch(0.6 0.2 250); +} + +@layer base { + * { + @apply border-border outline-ring/50; + } + body { + @apply bg-background text-foreground; + } +} diff --git a/examples/togetherai-demo/app/layout.tsx b/examples/togetherai-demo/app/layout.tsx new file mode 100644 index 0000000..a0fca5b --- /dev/null +++ 
b/examples/togetherai-demo/app/layout.tsx @@ -0,0 +1,35 @@ +import type { Metadata } from "next"; +import { Geist, Geist_Mono } from "next/font/google"; +import "./globals.css"; + +const geistSans = Geist({ + variable: "--font-geist-sans", + subsets: ["latin"], +}); + +const geistMono = Geist_Mono({ + variable: "--font-geist-mono", + subsets: ["latin"], +}); + +export const metadata: Metadata = { + title: "Together AI Demo - Open-Source Models", + description: + "Demo showcasing Together AI integration with @yourgpt/copilot-sdk - access open-source models like Llama, DeepSeek, Qwen and more", +}; + +export default function RootLayout({ + children, +}: Readonly<{ + children: React.ReactNode; +}>) { + return ( + + + {children} + + + ); +} diff --git a/examples/togetherai-demo/app/page.tsx b/examples/togetherai-demo/app/page.tsx new file mode 100644 index 0000000..6859084 --- /dev/null +++ b/examples/togetherai-demo/app/page.tsx @@ -0,0 +1,206 @@ +"use client"; + +import { useState, useMemo, useEffect } from "react"; +import { CopilotProvider } from "@yourgpt/copilot-sdk/react"; +import { CopilotChat } from "@yourgpt/copilot-sdk/ui"; +import { MODEL_GROUPS, ALL_MODELS, DEFAULT_MODEL } from "@/lib/models"; +import { + ExternalLink, + Github, + Terminal, + Copy, + Check, + ChevronDown, +} from "lucide-react"; + +export default function TogetherAIDemo() { + const [mounted, setMounted] = useState(false); + const [selectedModel, setSelectedModel] = useState(DEFAULT_MODEL); + const [copied, setCopied] = useState(false); + + useEffect(() => { + setMounted(true); + }, []); + + const handleCopy = () => { + navigator.clipboard.writeText("TOGETHER_API_KEY=your-key-here"); + setCopied(true); + setTimeout(() => setCopied(false), 2000); + }; + + const runtimeUrl = useMemo(() => { + const params = new URLSearchParams(); + params.set("model", selectedModel); + return `/api/chat?${params.toString()}`; + }, [selectedModel]); + + const selectedModelInfo = ALL_MODELS.find((m) => m.id === 
selectedModel); + + if (!mounted) return null; + + return ( +
    <div className="flex h-screen">
      {/* Left Sidebar */}
      {/* …sidebar markup omitted… */}

      {/* Right Side - Chat */}
      <div className="flex-1">
        <CopilotProvider runtimeUrl={runtimeUrl}>
          <CopilotChat />
        </CopilotProvider>
      </div>
    </div>
+ ); +} diff --git a/examples/togetherai-demo/lib/models.ts b/examples/togetherai-demo/lib/models.ts new file mode 100644 index 0000000..3ab9c69 --- /dev/null +++ b/examples/togetherai-demo/lib/models.ts @@ -0,0 +1,113 @@ +/** + * Together AI Model Definitions + * + * Models verified from Together AI API (April 2026) + * @see https://api.together.xyz/models + */ + +export interface ModelOption { + id: string; + name: string; + provider: string; + contextWindow: number; +} + +export interface ModelGroup { + provider: string; + models: ModelOption[]; +} + +export const MODEL_GROUPS: ModelGroup[] = [ + { + provider: "DeepSeek", + models: [ + { + id: "deepseek-ai/DeepSeek-V3.1", + name: "DeepSeek V3.1", + provider: "DeepSeek", + contextWindow: 128000, + }, + { + id: "deepseek-ai/DeepSeek-V3", + name: "DeepSeek V3", + provider: "DeepSeek", + contextWindow: 128000, + }, + { + id: "deepseek-ai/DeepSeek-R1", + name: "DeepSeek R1", + provider: "DeepSeek", + contextWindow: 128000, + }, + ], + }, + { + provider: "Meta (Llama)", + models: [ + { + id: "meta-llama/Llama-3.3-70B-Instruct-Turbo", + name: "Llama 3.3 70B Turbo", + provider: "Meta", + contextWindow: 131072, + }, + ], + }, + { + provider: "Qwen", + models: [ + { + id: "Qwen/Qwen3.5-397B-A17B", + name: "Qwen 3.5 397B", + provider: "Qwen", + contextWindow: 262144, + }, + { + id: "Qwen/Qwen3.5-9B", + name: "Qwen 3.5 9B", + provider: "Qwen", + contextWindow: 131072, + }, + ], + }, + { + provider: "Other", + models: [ + { + id: "openai/gpt-oss-120b", + name: "GPT OSS 120B", + provider: "OpenAI", + contextWindow: 131072, + }, + { + id: "moonshotai/Kimi-K2.5", + name: "Kimi K2.5", + provider: "Moonshot", + contextWindow: 262144, + }, + { + id: "zai-org/GLM-5.1", + name: "GLM-5.1", + provider: "ZAI", + contextWindow: 202000, + }, + { + id: "google/gemma-4-31B-it", + name: "Gemma 4 31B", + provider: "Google", + contextWindow: 131072, + }, + { + id: "MiniMaxAI/MiniMax-M2.5", + name: "MiniMax M2.5", + provider: "MiniMax", + 
contextWindow: 131072,
+      },
+    ],
+  },
+];
+
+// Flatten all models
+export const ALL_MODELS: ModelOption[] = MODEL_GROUPS.flatMap((g) => g.models);
+
+// Default model
+export const DEFAULT_MODEL = "meta-llama/Llama-3.3-70B-Instruct-Turbo";
diff --git a/examples/togetherai-demo/lib/utils.ts b/examples/togetherai-demo/lib/utils.ts
new file mode 100644
index 0000000..a5ef193
--- /dev/null
+++ b/examples/togetherai-demo/lib/utils.ts
@@ -0,0 +1,6 @@
+import { clsx, type ClassValue } from "clsx";
+import { twMerge } from "tailwind-merge";
+
+export function cn(...inputs: ClassValue[]) {
+  return twMerge(clsx(inputs));
+}
diff --git a/examples/togetherai-demo/next-env.d.ts b/examples/togetherai-demo/next-env.d.ts
new file mode 100644
index 0000000..c4b7818
--- /dev/null
+++ b/examples/togetherai-demo/next-env.d.ts
@@ -0,0 +1,6 @@
+/// <reference types="next" />
+/// <reference types="next/image-types/global" />
+import "./.next/dev/types/routes.d.ts";
+
+// NOTE: This file should not be edited
+// see https://nextjs.org/docs/app/api-reference/config/typescript for more information.
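The `lib/models.ts` definitions above group models by provider and flatten them into `ALL_MODELS` with a `DEFAULT_MODEL` fallback. A minimal sketch of how a model picker might resolve a user-selected id against these definitions — the `resolveModel` helper is illustrative and not part of this diff, and only two entries from `MODEL_GROUPS` are inlined to keep the sketch self-contained:

```typescript
// Sketch only: a resolver over the ModelOption shape defined in lib/models.ts.
// resolveModel() is a hypothetical helper, not part of the diff above.
interface ModelOption {
  id: string;
  name: string;
  provider: string;
  contextWindow: number;
}

const ALL_MODELS: ModelOption[] = [
  {
    id: "meta-llama/Llama-3.3-70B-Instruct-Turbo",
    name: "Llama 3.3 70B Turbo",
    provider: "Meta",
    contextWindow: 131072,
  },
  {
    id: "deepseek-ai/DeepSeek-V3",
    name: "DeepSeek V3",
    provider: "DeepSeek",
    contextWindow: 128000,
  },
];

const DEFAULT_MODEL = "meta-llama/Llama-3.3-70B-Instruct-Turbo";

// Fall back to DEFAULT_MODEL when the requested id is unknown, so a stale
// id (e.g. saved in localStorage) never reaches the API.
function resolveModel(id?: string): ModelOption {
  return (
    ALL_MODELS.find((m) => m.id === id) ??
    ALL_MODELS.find((m) => m.id === DEFAULT_MODEL)!
  );
}

console.log(resolveModel("deepseek-ai/DeepSeek-V3").name); // DeepSeek V3
console.log(resolveModel("no/such-model").id === DEFAULT_MODEL); // true
```

The fallback lookup is why `DEFAULT_MODEL` is exported alongside `ALL_MODELS`: UI code can validate any persisted selection in one place.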
diff --git a/examples/togetherai-demo/next.config.ts b/examples/togetherai-demo/next.config.ts
new file mode 100644
index 0000000..e9ffa30
--- /dev/null
+++ b/examples/togetherai-demo/next.config.ts
@@ -0,0 +1,7 @@
+import type { NextConfig } from "next";
+
+const nextConfig: NextConfig = {
+  /* config options here */
+};
+
+export default nextConfig;
diff --git a/examples/togetherai-demo/package.json b/examples/togetherai-demo/package.json
new file mode 100644
index 0000000..a844ffa
--- /dev/null
+++ b/examples/togetherai-demo/package.json
@@ -0,0 +1,34 @@
+{
+  "name": "togetherai-demo",
+  "version": "0.1.0",
+  "private": true,
+  "scripts": {
+    "dev": "next dev --turbopack -p 3035",
+    "build": "next build",
+    "start": "next start",
+    "lint": "eslint"
+  },
+  "dependencies": {
+    "@yourgpt/copilot-sdk": "workspace:*",
+    "@yourgpt/llm-sdk": "workspace:*",
+    "clsx": "^2.1.1",
+    "lucide-react": "^0.563.0",
+    "next": "16.1.5",
+    "openai": "^6.16.0",
+    "react": "19.2.3",
+    "react-dom": "19.2.3",
+    "tailwind-merge": "^3.4.0",
+    "zod": "^3.23.0"
+  },
+  "devDependencies": {
+    "@tailwindcss/postcss": "^4",
+    "@types/node": "^20",
+    "@types/react": "^19",
+    "@types/react-dom": "^19",
+    "eslint": "^9",
+    "eslint-config-next": "16.1.5",
+    "tailwindcss": "^4",
+    "tw-animate-css": "^1.4.0",
+    "typescript": "^5"
+  }
+}
diff --git a/examples/togetherai-demo/postcss.config.mjs b/examples/togetherai-demo/postcss.config.mjs
new file mode 100644
index 0000000..c2ddf74
--- /dev/null
+++ b/examples/togetherai-demo/postcss.config.mjs
@@ -0,0 +1,5 @@
+export default {
+  plugins: {
+    "@tailwindcss/postcss": {},
+  },
+};
diff --git a/examples/togetherai-demo/test-provider.ts b/examples/togetherai-demo/test-provider.ts
new file mode 100644
index 0000000..6f04912
--- /dev/null
+++ b/examples/togetherai-demo/test-provider.ts
@@ -0,0 +1,356 @@
+/**
+ * Together AI Provider — Comprehensive Test Suite
+ *
+ * Tests all major use cases:
+ * 1. generateText (non-streaming)
+ * 2. streamText (streaming)
+ * 3. Tool calling (single tool)
+ * 4. Multi-tool execution
+ * 5. Multi-step agentic loop (tool → follow-up)
+ * 6. System prompt + conversation history
+ * 7. JSON mode / structured output
+ * 8. Abort signal handling
+ * 9. Multiple models
+ * 10. Token usage tracking
+ *
+ * Run: npx tsx test-provider.ts
+ */
+
+import "dotenv/config";
+import { generateText, streamText, tool } from "@yourgpt/llm-sdk";
+import { togetherai } from "@yourgpt/llm-sdk/togetherai";
+import { z } from "zod";
+
+// ── Config ──────────────────────────────────────────────────────────────────
+
+const API_KEY = process.env.TOGETHER_API_KEY;
+if (!API_KEY) {
+  console.error("❌ Set TOGETHER_API_KEY in .env");
+  process.exit(1);
+}
+
+// Default model for most tests (fast, tool-capable)
+const DEFAULT_MODEL = "meta-llama/Llama-3.3-70B-Instruct-Turbo";
+
+const passed: string[] = [];
+const failed: string[] = [];
+
+async function runTest(name: string, fn: () => Promise<void>) {
+  process.stdout.write(`\n━━━ ${name} `);
+  process.stdout.write("━".repeat(Math.max(0, 60 - name.length)) + "\n");
+  try {
+    await fn();
+    passed.push(name);
+    console.log(`✅ PASSED`);
+  } catch (err: any) {
+    failed.push(name);
+    console.error(`❌ FAILED:`, err?.message ?? err);
+  }
+}
+
+// ── Shared Tools ────────────────────────────────────────────────────────────
+
+const weatherTool = tool({
+  description: "Get current weather for a city",
+  parameters: z.object({
+    city: z.string().describe("City name"),
+  }),
+  execute: async ({ city }) => {
+    // Simulated weather data
+    const data: Record<string, { temp: number; condition: string }> = {
+      tokyo: { temp: 22, condition: "cloudy" },
+      miami: { temp: 32, condition: "sunny" },
+      london: { temp: 14, condition: "rainy" },
+      paris: { temp: 18, condition: "partly cloudy" },
+    };
+    const result = data[city.toLowerCase()] ?? {
+      temp: 20,
+      condition: "unknown",
+    };
+    console.log(`  [tool] getWeather("${city}") → ${JSON.stringify(result)}`);
+    return result;
+  },
+});
+
+const calculatorTool = tool({
+  description: "Perform a math calculation",
+  parameters: z.object({
+    expression: z.string().describe("Math expression to evaluate, e.g. 2+2"),
+  }),
+  execute: async ({ expression }) => {
+    // Naive eval for simple demo math (not safe for untrusted input)
+    const result = Function(`"use strict"; return (${expression})`)();
+    console.log(`  [tool] calculator("${expression}") → ${result}`);
+    return { expression, result };
+  },
+});
+
+const searchTool = tool({
+  description: "Search for information on a topic",
+  parameters: z.object({
+    query: z.string().describe("Search query"),
+    maxResults: z.number().optional().describe("Max results to return"),
+  }),
+  execute: async ({ query, maxResults }) => {
+    const results = [
+      { title: `Result 1 for "${query}"`, snippet: "Lorem ipsum..." },
+      { title: `Result 2 for "${query}"`, snippet: "Dolor sit amet..." },
+    ].slice(0, maxResults ?? 2);
+    console.log(`  [tool] search("${query}") → ${results.length} results`);
+    return results;
+  },
+});
+
+// ── Tests ───────────────────────────────────────────────────────────────────
+
+async function main() {
+  console.log("🔬 Together AI Provider — Comprehensive Tests");
+  console.log(`   Model: ${DEFAULT_MODEL}`);
+  console.log(`   API Key: ${API_KEY!.slice(0, 12)}...${API_KEY!.slice(-4)}`);
+
+  // ──────────────────────────────────────────────────────────────────────────
+  // 1. generateText — basic non-streaming
+  // ──────────────────────────────────────────────────────────────────────────
+  await runTest("1. generateText (non-streaming)", async () => {
+    const result = await generateText({
+      model: togetherai(DEFAULT_MODEL, { apiKey: API_KEY }),
+      prompt: "What is 2 + 2? Reply with just the number.",
+    });
+
+    console.log(`   Text: "${result.text.trim()}"`);
+    console.log(`   Finish: ${result.finishReason}`);
+    console.log(
+      `   Usage: ${result.usage.promptTokens}p / ${result.usage.completionTokens}c / ${result.usage.totalTokens}t`,
+    );
+
+    if (!result.text) throw new Error("Empty response");
+    if (result.finishReason !== "stop")
+      throw new Error(`Unexpected finish: ${result.finishReason}`);
+  });
+
+  // ──────────────────────────────────────────────────────────────────────────
+  // 2. streamText — streaming response
+  // ──────────────────────────────────────────────────────────────────────────
+  await runTest("2. streamText (streaming)", async () => {
+    const result = await streamText({
+      model: togetherai(DEFAULT_MODEL, { apiKey: API_KEY }),
+      prompt: "Count from 1 to 5, one number per line.",
+    });
+
+    process.stdout.write("   Stream: ");
+    let chunks = 0;
+    for await (const chunk of result.textStream) {
+      process.stdout.write(chunk);
+      chunks++;
+    }
+    console.log();
+
+    const text = await result.text;
+    console.log(`   Chunks received: ${chunks}`);
+    console.log(`   Full text length: ${text.length}`);
+
+    if (chunks < 2) throw new Error("Too few chunks — streaming may not work");
+    if (!text) throw new Error("Empty streamed text");
+  });
+
+  // ──────────────────────────────────────────────────────────────────────────
+  // 3. generateText with single tool
+  // ──────────────────────────────────────────────────────────────────────────
+  await runTest("3.
generateText + single tool", async () => { + const result = await generateText({ + model: togetherai(DEFAULT_MODEL, { apiKey: API_KEY }), + prompt: "What is the weather in Tokyo?", + tools: { getWeather: weatherTool }, + maxSteps: 3, + }); + + console.log(` Text: "${result.text.slice(0, 120)}..."`); + console.log(` Tool calls: ${result.toolCalls.length}`); + console.log(` Tool results: ${result.toolResults.length}`); + console.log(` Steps: ${result.steps.length}`); + + if (result.toolCalls.length === 0) throw new Error("No tool calls made"); + if (result.toolResults.length === 0) throw new Error("No tool results"); + }); + + // ────────────────────────────────────────────────────────────────────────── + // 4. generateText with multiple tools + // ────────────────────────────────────────────────────────────────────────── + await runTest("4. generateText + multiple tools", async () => { + const result = await generateText({ + model: togetherai(DEFAULT_MODEL, { apiKey: API_KEY }), + prompt: "What is the weather in Miami? Also calculate 15 * 37 for me.", + tools: { + getWeather: weatherTool, + calculator: calculatorTool, + }, + maxSteps: 5, + }); + + console.log(` Text: "${result.text.slice(0, 150)}..."`); + console.log(` Tool calls: ${result.toolCalls.length}`); + + const toolNames = result.toolCalls.map((tc) => tc.name); + console.log(` Tools used: ${toolNames.join(", ")}`); + console.log(` Steps: ${result.steps.length}`); + + if (result.toolCalls.length === 0) throw new Error("No tool calls made"); + }); + + // ────────────────────────────────────────────────────────────────────────── + // 5. streamText with tools (agentic streaming) + // ────────────────────────────────────────────────────────────────────────── + await runTest("5. 
streamText + tool calling", async () => { + const result = await streamText({ + model: togetherai(DEFAULT_MODEL, { apiKey: API_KEY }), + prompt: "What's the weather like in London right now?", + tools: { getWeather: weatherTool }, + maxSteps: 3, + }); + + process.stdout.write(" Stream: "); + for await (const chunk of result.textStream) { + process.stdout.write(chunk); + } + console.log(); + + const text = await result.text; + console.log(` Final text length: ${text.length}`); + + if (!text) throw new Error("Empty streamed text after tool use"); + }); + + // ────────────────────────────────────────────────────────────────────────── + // 6. System prompt + conversation history + // ────────────────────────────────────────────────────────────────────────── + await runTest("6. System prompt + multi-turn", async () => { + const result = await generateText({ + model: togetherai(DEFAULT_MODEL, { apiKey: API_KEY }), + system: + "You are a pirate. Always respond in pirate speak. Keep it under 30 words.", + messages: [ + { role: "user", content: "Hello, who are you?" }, + { + role: "assistant", + content: "Ahoy matey! I be a salty sea dog, sailing the seven seas!", + }, + { role: "user", content: "Where is your treasure?" }, + ], + }); + + console.log(` Text: "${result.text.trim()}"`); + + if (!result.text) throw new Error("Empty response"); + }); + + // ────────────────────────────────────────────────────────────────────────── + // 7. JSON mode (structured output) + // ────────────────────────────────────────────────────────────────────────── + await runTest("7. JSON mode", async () => { + const result = await generateText({ + model: togetherai(DEFAULT_MODEL, { apiKey: API_KEY }), + prompt: + 'Return a JSON object with keys "name", "age", and "city" for a fictional character. 
Respond with only valid JSON, no markdown.', + }); + + console.log(` Raw: "${result.text.trim().slice(0, 200)}"`); + + // Try to parse it + const parsed = JSON.parse(result.text.trim()); + console.log(` Parsed: ${JSON.stringify(parsed)}`); + + if (!parsed.name || !parsed.age || !parsed.city) { + throw new Error("Missing expected JSON keys"); + } + }); + + // ────────────────────────────────────────────────────────────────────────── + // 8. Abort signal + // ────────────────────────────────────────────────────────────────────────── + await runTest("8. Abort signal", async () => { + const controller = new AbortController(); + + // Abort after 500ms + setTimeout(() => controller.abort(), 500); + + try { + const result = await streamText({ + model: togetherai(DEFAULT_MODEL, { apiKey: API_KEY }), + prompt: "Write a very long essay about the history of computing.", + signal: controller.signal, + }); + + let chars = 0; + for await (const chunk of result.textStream) { + chars += chunk.length; + } + + // If we get here without error, the stream completed before abort + console.log(` Stream completed before abort (${chars} chars)`); + } catch (err: any) { + if ( + err.message?.includes("abort") || + err.message?.includes("Abort") || + err.name === "AbortError" + ) { + console.log(" Abort caught correctly"); + } else { + throw err; + } + } + }); + + // ────────────────────────────────────────────────────────────────────────── + // 9. Multiple models + // ────────────────────────────────────────────────────────────────────────── + await runTest("9. 
Multiple models", async () => {
+    const models = [
+      "meta-llama/Llama-3.3-70B-Instruct-Turbo",
+      "deepseek-ai/DeepSeek-V3",
+    ];
+
+    for (const modelId of models) {
+      const result = await generateText({
+        model: togetherai(modelId, { apiKey: API_KEY }),
+        prompt: "Say hello in one sentence.",
+      });
+
+      console.log(
+        `   ${modelId.split("/").pop()}: "${result.text.trim().slice(0, 80)}"`,
+      );
+
+      if (!result.text) throw new Error(`Empty response from ${modelId}`);
+    }
+  });
+
+  // ──────────────────────────────────────────────────────────────────────────
+  // 10. Token usage tracking
+  // ──────────────────────────────────────────────────────────────────────────
+  await runTest("10. Token usage tracking", async () => {
+    const result = await generateText({
+      model: togetherai(DEFAULT_MODEL, { apiKey: API_KEY }),
+      prompt: "Write exactly three sentences about the ocean.",
+      maxTokens: 150,
+    });
+
+    console.log(`   Text: "${result.text.trim().slice(0, 120)}..."`);
+    console.log(
+      `   Usage: prompt=${result.usage.promptTokens} completion=${result.usage.completionTokens} total=${result.usage.totalTokens}`,
+    );
+
+    if (result.usage.promptTokens === 0) throw new Error("promptTokens is 0");
+    if (result.usage.completionTokens === 0)
+      throw new Error("completionTokens is 0");
+  });
+
+  // ── Summary ─────────────────────────────────────────────────────────────
+  console.log("\n" + "═".repeat(64));
+  console.log(`  ✅ Passed: ${passed.length}   ❌ Failed: ${failed.length}`);
+  if (failed.length > 0) {
+    console.log(`  Failed tests: ${failed.join(", ")}`);
+  }
+  console.log("═".repeat(64) + "\n");
+
+  process.exit(failed.length > 0 ?
1 : 0); +} + +main(); diff --git a/examples/togetherai-demo/tsconfig.json b/examples/togetherai-demo/tsconfig.json new file mode 100644 index 0000000..51b43cd --- /dev/null +++ b/examples/togetherai-demo/tsconfig.json @@ -0,0 +1,34 @@ +{ + "compilerOptions": { + "target": "ES2017", + "lib": ["dom", "dom.iterable", "esnext"], + "allowJs": true, + "skipLibCheck": true, + "strict": true, + "noEmit": true, + "esModuleInterop": true, + "module": "esnext", + "moduleResolution": "bundler", + "resolveJsonModule": true, + "isolatedModules": true, + "jsx": "react-jsx", + "incremental": true, + "plugins": [ + { + "name": "next" + } + ], + "paths": { + "@/*": ["./*"] + } + }, + "include": [ + "next-env.d.ts", + "**/*.ts", + "**/*.tsx", + ".next/types/**/*.ts", + ".next/dev/types/**/*.ts", + "**/*.mts" + ], + "exclude": ["node_modules", "test-*.ts"] +} diff --git a/packages/llm-sdk/package.json b/packages/llm-sdk/package.json index faeebcb..534a840 100644 --- a/packages/llm-sdk/package.json +++ b/packages/llm-sdk/package.json @@ -1,6 +1,6 @@ { "name": "@yourgpt/llm-sdk", - "version": "2.1.7", + "version": "2.1.8", "description": "AI SDK for building AI Agents with any LLM", "main": "./dist/index.js", "module": "./dist/index.mjs", @@ -51,6 +51,11 @@ "import": "./dist/providers/fireworks/index.mjs", "require": "./dist/providers/fireworks/index.js" }, + "./togetherai": { + "types": "./dist/providers/togetherai/index.d.mts", + "import": "./dist/providers/togetherai/index.mjs", + "require": "./dist/providers/togetherai/index.js" + }, "./adapters": { "types": "./dist/adapters/index.d.ts", "import": "./dist/adapters/index.mjs", @@ -126,6 +131,8 @@ "ollama", "openrouter", "fireworks", + "togetherai", + "together-ai", "multi-provider", "streaming" ], diff --git a/packages/llm-sdk/src/adapters/azure.ts b/packages/llm-sdk/src/adapters/azure.ts index ea7f1db..c0a7d53 100644 --- a/packages/llm-sdk/src/adapters/azure.ts +++ b/packages/llm-sdk/src/adapters/azure.ts @@ -225,6 +225,11 @@ export 
class AzureAdapter implements LLMAdapter { id: currentToolCall.id, args: currentToolCall.arguments, }; + yield { + type: "action:end", + id: currentToolCall.id, + name: currentToolCall.name, + }; } currentToolCall = { @@ -254,6 +259,12 @@ export class AzureAdapter implements LLMAdapter { id: currentToolCall.id, args: currentToolCall.arguments, }; + yield { + type: "action:end", + id: currentToolCall.id, + name: currentToolCall.name, + }; + currentToolCall = null; } } } diff --git a/packages/llm-sdk/src/adapters/google.ts b/packages/llm-sdk/src/adapters/google.ts index 520d43e..83d44e1 100644 --- a/packages/llm-sdk/src/adapters/google.ts +++ b/packages/llm-sdk/src/adapters/google.ts @@ -447,6 +447,11 @@ export class GoogleAdapter implements LLMAdapter { id: currentToolCall.id, args: JSON.stringify(currentToolCall.args), }; + yield { + type: "action:end", + id: currentToolCall.id, + name: currentToolCall.name, + }; } currentToolCall = { @@ -472,6 +477,12 @@ export class GoogleAdapter implements LLMAdapter { id: currentToolCall.id, args: JSON.stringify(currentToolCall.args), }; + yield { + type: "action:end", + id: currentToolCall.id, + name: currentToolCall.name, + }; + currentToolCall = null; } } diff --git a/packages/llm-sdk/src/adapters/ollama.ts b/packages/llm-sdk/src/adapters/ollama.ts index a06dad3..866a82c 100644 --- a/packages/llm-sdk/src/adapters/ollama.ts +++ b/packages/llm-sdk/src/adapters/ollama.ts @@ -371,6 +371,13 @@ export class OllamaAdapter implements LLMAdapter { id: toolCallId, args: JSON.stringify(toolCall.function?.arguments || {}), }; + + // Emit action end to trigger server-side tool execution in runtime + yield { + type: "action:end", + id: toolCallId, + name: toolCall.function?.name || "", + }; } hasEmittedToolCalls = true; } diff --git a/packages/llm-sdk/src/adapters/openai.ts b/packages/llm-sdk/src/adapters/openai.ts index d134de8..d0795ec 100644 --- a/packages/llm-sdk/src/adapters/openai.ts +++ b/packages/llm-sdk/src/adapters/openai.ts @@ 
-520,6 +520,11 @@ export class OpenAIAdapter implements LLMAdapter {
           id: currentToolCall.id,
           args: currentToolCall.arguments,
         };
+        yield {
+          type: "action:end",
+          id: currentToolCall.id,
+          name: currentToolCall.name,
+        };
       }
 
       const tcExtraContent = (toolCall as any).extra_content as
@@ -566,6 +571,12 @@ export class OpenAIAdapter implements LLMAdapter {
           id: currentToolCall.id,
           args: currentToolCall.arguments,
         };
+        yield {
+          type: "action:end",
+          id: currentToolCall.id,
+          name: currentToolCall.name,
+        };
+
         currentToolCall = null;
       }
     }
   }
diff --git a/packages/llm-sdk/src/providers/togetherai/index.ts b/packages/llm-sdk/src/providers/togetherai/index.ts
new file mode 100644
index 0000000..d8e3588
--- /dev/null
+++ b/packages/llm-sdk/src/providers/togetherai/index.ts
@@ -0,0 +1,126 @@
+/**
+ * Together AI Provider
+ *
+ * Together AI is a high-performance inference platform for open-source models
+ * (Llama, DeepSeek, Qwen, Mistral, Gemma, and more).
+ *
+ * Uses an OpenAI-compatible API — set TOGETHER_API_KEY in your environment.
+ *
+ * @see https://docs.together.ai/reference
+ *
+ * @example
+ * ```ts
+ * // Modern pattern — returns LanguageModel directly
+ * import { togetherai } from '@yourgpt/llm-sdk/togetherai';
+ * import { generateText } from '@yourgpt/llm-sdk';
+ *
+ * const result = await generateText({
+ *   model: togetherai('meta-llama/Llama-3.3-70B-Instruct-Turbo'),
+ *   prompt: 'Hello!',
+ * });
+ *
+ * // Legacy pattern — returns AIProvider for createRuntime
+ * import { createTogetherAI } from '@yourgpt/llm-sdk/togetherai';
+ * import { createRuntime } from '@yourgpt/llm-sdk';
+ *
+ * const provider = createTogetherAI({ apiKey: '...' });
+ * const runtime = createRuntime({ provider, model: 'meta-llama/Llama-3.3-70B-Instruct-Turbo' });
+ * ```
+ */
+
+// Modern pattern - togetherai() function returning LanguageModel
+export { togetherai } from "./provider";
+export type { TogetherAIProviderOptions } from "./provider";
+
+import { createOpenAIAdapter } from "../../adapters/openai";
+import {
+  createCallableProvider,
+  type AIProvider,
+  type ProviderCapabilities,
+} from "../types";
+
+// ============================================
+// Provider Config
+// ============================================
+
+export interface TogetherAIProviderConfig {
+  /** API key (defaults to TOGETHER_API_KEY env var) */
+  apiKey?: string;
+  /** Base URL for API */
+  baseUrl?: string;
+}
+
+// ============================================
+// Default capabilities
+// ============================================
+
+const DEFAULT_CAPABILITIES = {
+  vision: true,
+  tools: true,
+  jsonMode: true,
+  maxTokens: 131072,
+};
+
+// ============================================
+// Provider Factory (Legacy pattern — for createRuntime)
+// ============================================
+
+/**
+ * Create a Together AI provider (callable, for use with createRuntime)
+ *
+ * @example
+ * ```typescript
+ * import { createTogetherAI } from '@yourgpt/llm-sdk/togetherai';
+ * import { createRuntime } from '@yourgpt/llm-sdk';
+ *
+ * const together = createTogetherAI({ apiKey: '...' });
+ * const runtime = createRuntime({
+ *   provider: together,
+ *   model: 'meta-llama/Llama-3.3-70B-Instruct-Turbo',
+ * });
+ *
+ * // Handle incoming chat requests
+ * return runtime.handleRequest(request);
+ * ```
+ */
+export function createTogetherAI(
+  config: TogetherAIProviderConfig = {},
+): AIProvider {
+  const apiKey = config.apiKey ?? process.env.TOGETHER_API_KEY ?? "";
+  const baseUrl = config.baseUrl ??
"https://api.together.xyz/v1"; + + const providerFn = (modelId: string) => { + return createOpenAIAdapter({ + apiKey, + model: modelId, + baseUrl, + }); + }; + + const getCapabilities = (_modelId: string): ProviderCapabilities => { + return { + supportsVision: DEFAULT_CAPABILITIES.vision, + supportsTools: DEFAULT_CAPABILITIES.tools, + supportsThinking: false, + supportsStreaming: true, + supportsPDF: false, + supportsAudio: false, + supportsVideo: false, + maxTokens: DEFAULT_CAPABILITIES.maxTokens, + supportedImageTypes: DEFAULT_CAPABILITIES.vision + ? ["image/png", "image/jpeg", "image/gif", "image/webp"] + : [], + supportsJsonMode: DEFAULT_CAPABILITIES.jsonMode, + supportsSystemMessages: true, + }; + }; + + return createCallableProvider(providerFn, { + name: "togetherai", + supportedModels: [], + getCapabilities, + }); +} + +// Alias +export const createTogetherAIProvider = createTogetherAI; diff --git a/packages/llm-sdk/src/providers/togetherai/provider.ts b/packages/llm-sdk/src/providers/togetherai/provider.ts new file mode 100644 index 0000000..7c1c718 --- /dev/null +++ b/packages/llm-sdk/src/providers/togetherai/provider.ts @@ -0,0 +1,316 @@ +/** + * Together AI Provider + * + * Together AI is a high-performance inference platform for open-source models + * (Llama, DeepSeek, Qwen, Mistral, Gemma, and more). + * + * It uses an OpenAI-compatible REST API. 
+ * + * @see https://docs.together.ai/reference + * + * @example + * ```ts + * import { togetherai } from '@yourgpt/llm-sdk/togetherai'; + * import { generateText } from '@yourgpt/llm-sdk'; + * + * const result = await generateText({ + * model: togetherai('meta-llama/Llama-3.3-70B-Instruct-Turbo'), + * prompt: 'Hello!', + * }); + * ``` + */ + +import type { + LanguageModel, + DoGenerateParams, + DoGenerateResult, + StreamChunk, + ToolCall, + FinishReason, + CoreMessage, +} from "../../core/types"; + +// ============================================ +// Provider Options +// ============================================ + +export interface TogetherAIProviderOptions { + /** API key (defaults to TOGETHER_API_KEY env var) */ + apiKey?: string; + /** Base URL for API (defaults to https://api.together.xyz/v1) */ + baseURL?: string; +} + +// ============================================ +// Provider Implementation +// ============================================ + +/** + * Create a Together AI language model. + * + * Model IDs follow the format `org/model-name` (e.g. 'meta-llama/Llama-3.3-70B-Instruct-Turbo'). + * + * @param modelId - Full model ID (e.g. 'meta-llama/Llama-3.3-70B-Instruct-Turbo') + * @param options - Provider options + * @returns LanguageModel instance + * + * @example + * ```ts + * const model = togetherai('meta-llama/Llama-3.3-70B-Instruct-Turbo'); + * + * // With explicit API key + * const model = togetherai('deepseek-ai/DeepSeek-V3', { + * apiKey: 'your-key', + * }); + * ``` + */ +export function togetherai( + modelId: string, + options: TogetherAIProviderOptions = {}, +): LanguageModel { + const apiKey = options.apiKey ?? process.env.TOGETHER_API_KEY; + const baseURL = options.baseURL ?? 
"https://api.together.xyz/v1";
+
+  // Lazy-load OpenAI client (Together AI uses OpenAI-compatible API)
+  let client: any = null;
+  async function getClient(): Promise<any> {
+    if (!client) {
+      const { default: OpenAI } = await import("openai");
+      client = new OpenAI({ apiKey, baseURL });
+    }
+    return client;
+  }
+
+  return {
+    provider: "togetherai",
+    modelId,
+
+    capabilities: {
+      supportsVision: true,
+      supportsTools: true,
+      supportsStreaming: true,
+      supportsJsonMode: true,
+      supportsThinking: false,
+      supportsPDF: false,
+      maxTokens: 131072,
+      supportedImageTypes: [
+        "image/png",
+        "image/jpeg",
+        "image/gif",
+        "image/webp",
+      ],
+    },
+
+    async doGenerate(params: DoGenerateParams): Promise<DoGenerateResult> {
+      const client = await getClient();
+      const messages = formatMessages(params.messages);
+
+      const requestBody: any = {
+        model: modelId,
+        messages,
+        temperature: params.temperature,
+        max_tokens: params.maxTokens,
+      };
+
+      if (params.tools) {
+        requestBody.tools = params.tools;
+      }
+
+      const response = await client.chat.completions.create(requestBody);
+      const choice = response.choices[0];
+      const message = choice.message;
+
+      const toolCalls: ToolCall[] = (message.tool_calls ?? []).map(
+        (tc: any) => ({
+          id: tc.id,
+          name: tc.function.name,
+          args: JSON.parse(tc.function.arguments || "{}"),
+        }),
+      );
+
+      return {
+        text: message.content ?? "",
+        toolCalls,
+        finishReason: mapFinishReason(choice.finish_reason),
+        usage: {
+          promptTokens: response.usage?.prompt_tokens ?? 0,
+          completionTokens: response.usage?.completion_tokens ?? 0,
+          totalTokens: response.usage?.total_tokens ?? 0,
+        },
+        rawResponse: response,
+      };
+    },
+
+    async *doStream(params: DoGenerateParams): AsyncGenerator<StreamChunk> {
+      const client = await getClient();
+      const messages = formatMessages(params.messages);
+
+      const requestBody: any = {
+        model: modelId,
+        messages,
+        temperature: params.temperature,
+        max_tokens: params.maxTokens,
+        stream: true,
+      };
+
+      if (params.tools) {
+        requestBody.tools = params.tools;
+      }
+
+      const stream = await client.chat.completions.create(requestBody);
+
+      // Track tool calls by index
+      const toolCallMap = new Map<
+        number,
+        { id: string; name: string; arguments: string }
+      >();
+
+      let totalPromptTokens = 0;
+      let totalCompletionTokens = 0;
+
+      for await (const chunk of stream) {
+        if (params.signal?.aborted) {
+          yield { type: "error", error: new Error("Aborted") };
+          return;
+        }
+
+        const choice = chunk.choices[0];
+        const delta = choice?.delta;
+
+        if (delta?.content) {
+          yield { type: "text-delta", text: delta.content };
+        }
+
+        if (delta?.tool_calls) {
+          for (const tc of delta.tool_calls) {
+            const idx = tc.index ?? 0;
+            if (!toolCallMap.has(idx)) {
+              toolCallMap.set(idx, {
+                id: tc.id ?? "",
+                name: tc.function?.name ?? "",
+                arguments: tc.function?.arguments ??
"", + }); + } else { + const existing = toolCallMap.get(idx)!; + if (tc.id && !existing.id) existing.id = tc.id; + if (tc.function?.name && !existing.name) + existing.name = tc.function.name; + if (tc.function?.arguments) + existing.arguments += tc.function.arguments; + } + } + } + + if (choice?.finish_reason) { + for (const [, tc] of toolCallMap) { + yield { + type: "tool-call", + toolCall: { + id: tc.id, + name: tc.name, + args: JSON.parse(tc.arguments || "{}"), + }, + }; + } + toolCallMap.clear(); + + if (chunk.usage) { + totalPromptTokens = chunk.usage.prompt_tokens; + totalCompletionTokens = chunk.usage.completion_tokens; + } + + yield { + type: "finish", + finishReason: mapFinishReason(choice.finish_reason), + usage: { + promptTokens: totalPromptTokens, + completionTokens: totalCompletionTokens, + totalTokens: totalPromptTokens + totalCompletionTokens, + }, + }; + } + } + }, + }; +} + +// ============================================ +// Helpers +// ============================================ + +function mapFinishReason(reason: string | null): FinishReason { + switch (reason) { + case "stop": + return "stop"; + case "length": + return "length"; + case "tool_calls": + case "function_call": + return "tool-calls"; + case "content_filter": + return "content-filter"; + default: + return "unknown"; + } +} + +function formatMessages(messages: CoreMessage[]): any[] { + return messages.map((msg) => { + switch (msg.role) { + case "system": + return { role: "system", content: msg.content }; + + case "user": + if (typeof msg.content === "string") { + return { role: "user", content: msg.content }; + } + return { + role: "user", + content: msg.content.map((part) => { + if (part.type === "text") { + return { type: "text", text: part.text }; + } + if (part.type === "image") { + const imageData = + typeof part.image === "string" + ? part.image + : Buffer.from(part.image).toString("base64"); + const url = imageData.startsWith("data:") + ? imageData + : `data:${part.mimeType ?? 
"image/png"};base64,${imageData}`; + return { type: "image_url", image_url: { url, detail: "auto" } }; + } + return { type: "text", text: "" }; + }), + }; + + case "assistant": { + const assistantMsg: any = { role: "assistant", content: msg.content }; + if (msg.toolCalls && msg.toolCalls.length > 0) { + assistantMsg.tool_calls = msg.toolCalls.map((tc) => ({ + id: tc.id, + type: "function", + function: { + name: tc.name, + arguments: JSON.stringify(tc.args), + }, + })); + } + return assistantMsg; + } + + case "tool": + return { + role: "tool", + tool_call_id: msg.toolCallId, + content: msg.content, + }; + + default: + return msg; + } + }); +} + +// Alias for backward compatibility +export { togetherai as createTogetherAI }; diff --git a/packages/llm-sdk/tsup.config.ts b/packages/llm-sdk/tsup.config.ts index 2d407f2..a6465f0 100644 --- a/packages/llm-sdk/tsup.config.ts +++ b/packages/llm-sdk/tsup.config.ts @@ -14,6 +14,7 @@ export default defineConfig({ "providers/azure/index": "src/providers/azure/index.ts", "providers/openrouter/index": "src/providers/openrouter/index.ts", "providers/fireworks/index": "src/providers/fireworks/index.ts", + "providers/togetherai/index": "src/providers/togetherai/index.ts", // Legacy adapters "adapters/index": "src/adapters/index.ts", diff --git a/pnpm-lock.yaml b/pnpm-lock.yaml index 06e5207..12761d4 100644 --- a/pnpm-lock.yaml +++ b/pnpm-lock.yaml @@ -906,11 +906,11 @@ importers: specifier: ^0.5.19 version: 0.5.19(tailwindcss@4.1.18) '@yourgpt/copilot-sdk': - specifier: workspace:* - version: link:../../packages/copilot-sdk + specifier: ^2.1.8 + version: 2.1.8(@types/react-dom@18.3.7(@types/react@18.3.27))(@types/react@18.3.27)(react-dom@19.2.3(react@19.2.3))(react@19.2.3) '@yourgpt/llm-sdk': - specifier: workspace:* - version: link:../../packages/llm-sdk + specifier: ^2.1.8 + version: 2.1.8(@anthropic-ai/sdk@0.71.2(zod@3.25.76))(@google/generative-ai@0.24.1)(openai@6.16.0(ws@8.18.0)(zod@3.25.76)) class-variance-authority: 
specifier: ^0.7.1 version: 0.7.1 @@ -1331,6 +1331,67 @@ importers: specifier: ^5 version: 5.9.3 + examples/togetherai-demo: + dependencies: + '@yourgpt/copilot-sdk': + specifier: workspace:* + version: link:../../packages/copilot-sdk + '@yourgpt/llm-sdk': + specifier: workspace:* + version: link:../../packages/llm-sdk + clsx: + specifier: ^2.1.1 + version: 2.1.1 + lucide-react: + specifier: ^0.563.0 + version: 0.563.0(react@19.2.3) + next: + specifier: 16.1.5 + version: 16.1.5(@babel/core@7.28.5)(react-dom@19.2.3(react@19.2.3))(react@19.2.3)(sass@1.97.0) + openai: + specifier: ^6.16.0 + version: 6.16.0(ws@8.18.0)(zod@3.25.76) + react: + specifier: 19.2.3 + version: 19.2.3 + react-dom: + specifier: 19.2.3 + version: 19.2.3(react@19.2.3) + tailwind-merge: + specifier: ^3.4.0 + version: 3.4.0 + zod: + specifier: ^3.23.0 + version: 3.25.76 + devDependencies: + '@tailwindcss/postcss': + specifier: ^4 + version: 4.1.18 + '@types/node': + specifier: ^20 + version: 20.19.27 + '@types/react': + specifier: ^18.2.0 + version: 18.3.27 + '@types/react-dom': + specifier: ^18.2.0 + version: 18.3.7(@types/react@18.3.27) + eslint: + specifier: ^9 + version: 9.39.2(jiti@2.6.1) + eslint-config-next: + specifier: 16.1.5 + version: 16.1.5(eslint@9.39.2(jiti@2.6.1))(typescript@5.9.3) + tailwindcss: + specifier: ^4 + version: 4.2.1 + tw-animate-css: + specifier: ^1.4.0 + version: 1.4.0 + typescript: + specifier: ^5 + version: 5.9.3 + examples/web-search-demo: dependencies: '@yourgpt/copilot-sdk': @@ -4618,6 +4679,33 @@ packages: babel-plugin-react-compiler: optional: true + '@yourgpt/copilot-sdk@2.1.8': + resolution: {integrity: sha512-c3cSm92Liz7Jr0rbzJ5dPvf+N/fFpsLHdO6Ww1bzImpxNwXIq4zTbdDviCD/1ybMpAxrnaU5wUAv4F/rGYjyjg==} + engines: {node: '>=18'} + peerDependencies: + react: ^18.0.0 || ^19.0.0 + react-dom: ^18.0.0 || ^19.0.0 + peerDependenciesMeta: + react: + optional: true + react-dom: + optional: true + + '@yourgpt/llm-sdk@2.1.8': + resolution: {integrity: 
sha512-dMLyvaEySmJC+6PnodZVE9N9l+A1aPmzOP9U1hk+on+mS1T0JCdXXh95wjHGL8/KNOy97clh9b4R78NnRFc0XQ==} + engines: {node: '>=18'} + peerDependencies: + '@anthropic-ai/sdk': '>=0.20.0' + '@google/generative-ai': '>=0.21.0' + openai: '>=4.0.0' + peerDependenciesMeta: + '@anthropic-ai/sdk': + optional: true + '@google/generative-ai': + optional: true + openai: + optional: true + abort-controller@3.0.0: resolution: {integrity: sha512-h8lQ8tacZYnR3vNQTgibj+tODHI5/+l06Au2Pcriv/Gmet0eaj4TwWH41sO9wnHDiQsEj19q0drzdWdeAHtweg==} engines: {node: '>=6.5'} @@ -8923,6 +9011,20 @@ snapshots: optionalDependencies: '@types/react': 18.3.27 + '@base-ui/react@1.0.0(@types/react@18.3.27)(react-dom@19.2.3(react@19.2.3))(react@19.2.3)': + dependencies: + '@babel/runtime': 7.28.4 + '@base-ui/utils': 0.2.3(@types/react@18.3.27)(react-dom@19.2.3(react@19.2.3))(react@19.2.3) + '@floating-ui/react-dom': 2.1.6(react-dom@19.2.3(react@19.2.3))(react@19.2.3) + '@floating-ui/utils': 0.2.10 + react: 19.2.3 + react-dom: 19.2.3(react@19.2.3) + reselect: 5.1.1 + tabbable: 6.3.0 + use-sync-external-store: 1.6.0(react@19.2.3) + optionalDependencies: + '@types/react': 18.3.27 + '@base-ui/utils@0.2.3(@types/react@18.3.27)(react-dom@18.3.1(react@18.3.1))(react@18.3.1)': dependencies: '@babel/runtime': 7.28.4 @@ -8934,6 +9036,17 @@ snapshots: optionalDependencies: '@types/react': 18.3.27 + '@base-ui/utils@0.2.3(@types/react@18.3.27)(react-dom@19.2.3(react@19.2.3))(react@19.2.3)': + dependencies: + '@babel/runtime': 7.28.4 + '@floating-ui/utils': 0.2.10 + react: 19.2.3 + react-dom: 19.2.3(react@19.2.3) + reselect: 5.1.1 + use-sync-external-store: 1.6.0(react@19.2.3) + optionalDependencies: + '@types/react': 18.3.27 + '@changesets/apply-release-plan@7.0.14': dependencies: '@changesets/config': 3.1.2 @@ -11568,6 +11681,11 @@ snapshots: react: 18.3.1 shiki: 3.20.0 + '@streamdown/code@1.0.1(react@19.2.3)': + dependencies: + react: 19.2.3 + shiki: 3.20.0 + '@swc/helpers@0.5.15': dependencies: tslib: 2.8.1 @@ -12073,6 
+12191,40 @@ snapshots: '@rolldown/pluginutils': 1.0.0-rc.7 vite: 8.0.3(@emnapi/core@1.7.1)(@emnapi/runtime@1.7.1)(@types/node@20.19.27)(esbuild@0.27.1)(jiti@2.6.1)(sass@1.97.0)(tsx@4.21.0)(yaml@2.8.2) + '@yourgpt/copilot-sdk@2.1.8(@types/react-dom@18.3.7(@types/react@18.3.27))(@types/react@18.3.27)(react-dom@19.2.3(react@19.2.3))(react@19.2.3)': + dependencies: + '@base-ui/react': 1.0.0(@types/react@18.3.27)(react-dom@19.2.3(react@19.2.3))(react@19.2.3) + '@radix-ui/react-avatar': 1.1.11(@types/react-dom@18.3.7(@types/react@18.3.27))(@types/react@18.3.27)(react-dom@19.2.3(react@19.2.3))(react@19.2.3) + '@radix-ui/react-hover-card': 1.1.15(@types/react-dom@18.3.7(@types/react@18.3.27))(@types/react@18.3.27)(react-dom@19.2.3(react@19.2.3))(react@19.2.3) + '@radix-ui/react-slot': 1.2.4(@types/react@18.3.27)(react@19.2.3) + '@radix-ui/react-tooltip': 1.2.8(@types/react-dom@18.3.7(@types/react@18.3.27))(@types/react@18.3.27)(react-dom@19.2.3(react@19.2.3))(react@19.2.3) + '@streamdown/code': 1.0.1(react@19.2.3) + class-variance-authority: 0.7.1 + clsx: 2.1.1 + html-to-image: 1.11.13 + html2canvas: 1.4.1 + lucide-react: 0.561.0(react@19.2.3) + streamdown: 2.1.0(react@19.2.3) + tailwind-merge: 3.4.0 + use-stick-to-bottom: 1.1.1(react@19.2.3) + zod: 3.25.76 + optionalDependencies: + react: 19.2.3 + react-dom: 19.2.3(react@19.2.3) + transitivePeerDependencies: + - '@types/react' + - '@types/react-dom' + - supports-color + + '@yourgpt/llm-sdk@2.1.8(@anthropic-ai/sdk@0.71.2(zod@3.25.76))(@google/generative-ai@0.24.1)(openai@6.16.0(ws@8.18.0)(zod@3.25.76))': + dependencies: + hono: 4.11.0 + zod: 3.25.76 + optionalDependencies: + '@anthropic-ai/sdk': 0.71.2(zod@3.25.76) + '@google/generative-ai': 0.24.1 + openai: 6.16.0(ws@8.18.0)(zod@3.25.76) + abort-controller@3.0.0: dependencies: event-target-shim: 5.0.1 @@ -12925,7 +13077,7 @@ snapshots: '@next/eslint-plugin-next': 16.0.10 eslint: 9.39.2(jiti@2.6.1) eslint-import-resolver-node: 0.3.9 - eslint-import-resolver-typescript: 
3.10.1(eslint-plugin-import@2.32.0(@typescript-eslint/parser@8.50.0(eslint@9.39.2(jiti@2.6.1))(typescript@5.9.3))(eslint@9.39.2(jiti@2.6.1)))(eslint@9.39.2(jiti@2.6.1)) + eslint-import-resolver-typescript: 3.10.1(eslint-plugin-import@2.32.0)(eslint@9.39.2(jiti@2.6.1)) eslint-plugin-import: 2.32.0(@typescript-eslint/parser@8.50.0(eslint@9.39.2(jiti@2.6.1))(typescript@5.9.3))(eslint-import-resolver-typescript@3.10.1)(eslint@9.39.2(jiti@2.6.1)) eslint-plugin-jsx-a11y: 6.10.2(eslint@9.39.2(jiti@2.6.1)) eslint-plugin-react: 7.37.5(eslint@9.39.2(jiti@2.6.1)) @@ -12946,7 +13098,7 @@ snapshots: eslint: 9.39.2(jiti@2.6.1) eslint-import-resolver-node: 0.3.9 eslint-import-resolver-typescript: 3.10.1(eslint-plugin-import@2.32.0(eslint@9.39.2(jiti@2.6.1)))(eslint@9.39.2(jiti@2.6.1)) - eslint-plugin-import: 2.32.0(eslint-import-resolver-typescript@3.10.1)(eslint@9.39.2(jiti@2.6.1)) + eslint-plugin-import: 2.32.0(eslint-import-resolver-typescript@3.10.1(eslint-plugin-import@2.32.0(eslint@9.39.2(jiti@2.6.1)))(eslint@9.39.2(jiti@2.6.1)))(eslint@9.39.2(jiti@2.6.1)) eslint-plugin-jsx-a11y: 6.10.2(eslint@9.39.2(jiti@2.6.1)) eslint-plugin-react: 7.37.5(eslint@9.39.2(jiti@2.6.1)) eslint-plugin-react-hooks: 7.0.1(eslint@9.39.2(jiti@2.6.1)) @@ -12966,7 +13118,7 @@ snapshots: eslint: 9.39.2(jiti@2.6.1) eslint-import-resolver-node: 0.3.9 eslint-import-resolver-typescript: 3.10.1(eslint-plugin-import@2.32.0(eslint@9.39.2(jiti@2.6.1)))(eslint@9.39.2(jiti@2.6.1)) - eslint-plugin-import: 2.32.0(eslint-import-resolver-typescript@3.10.1)(eslint@9.39.2(jiti@2.6.1)) + eslint-plugin-import: 2.32.0(eslint-import-resolver-typescript@3.10.1(eslint-plugin-import@2.32.0(eslint@9.39.2(jiti@2.6.1)))(eslint@9.39.2(jiti@2.6.1)))(eslint@9.39.2(jiti@2.6.1)) eslint-plugin-jsx-a11y: 6.10.2(eslint@9.39.2(jiti@2.6.1)) eslint-plugin-react: 7.37.5(eslint@9.39.2(jiti@2.6.1)) eslint-plugin-react-hooks: 7.0.1(eslint@9.39.2(jiti@2.6.1)) @@ -12988,7 +13140,7 @@ snapshots: transitivePeerDependencies: - supports-color - 
eslint-import-resolver-typescript@3.10.1(eslint-plugin-import@2.32.0(@typescript-eslint/parser@8.50.0(eslint@9.39.2(jiti@2.6.1))(typescript@5.9.3))(eslint@9.39.2(jiti@2.6.1)))(eslint@9.39.2(jiti@2.6.1)): + eslint-import-resolver-typescript@3.10.1(eslint-plugin-import@2.32.0(eslint@9.39.2(jiti@2.6.1)))(eslint@9.39.2(jiti@2.6.1)): dependencies: '@nolyfill/is-core-module': 1.0.39 debug: 4.4.3 @@ -12999,11 +13151,11 @@ snapshots: tinyglobby: 0.2.15 unrs-resolver: 1.11.1 optionalDependencies: - eslint-plugin-import: 2.32.0(@typescript-eslint/parser@8.50.0(eslint@9.39.2(jiti@2.6.1))(typescript@5.9.3))(eslint-import-resolver-typescript@3.10.1)(eslint@9.39.2(jiti@2.6.1)) + eslint-plugin-import: 2.32.0(eslint-import-resolver-typescript@3.10.1(eslint-plugin-import@2.32.0(eslint@9.39.2(jiti@2.6.1)))(eslint@9.39.2(jiti@2.6.1)))(eslint@9.39.2(jiti@2.6.1)) transitivePeerDependencies: - supports-color - eslint-import-resolver-typescript@3.10.1(eslint-plugin-import@2.32.0(eslint@9.39.2(jiti@2.6.1)))(eslint@9.39.2(jiti@2.6.1)): + eslint-import-resolver-typescript@3.10.1(eslint-plugin-import@2.32.0)(eslint@9.39.2(jiti@2.6.1)): dependencies: '@nolyfill/is-core-module': 1.0.39 debug: 4.4.3 @@ -13014,18 +13166,18 @@ snapshots: tinyglobby: 0.2.15 unrs-resolver: 1.11.1 optionalDependencies: - eslint-plugin-import: 2.32.0(eslint-import-resolver-typescript@3.10.1)(eslint@9.39.2(jiti@2.6.1)) + eslint-plugin-import: 2.32.0(@typescript-eslint/parser@8.50.0(eslint@9.39.2(jiti@2.6.1))(typescript@5.9.3))(eslint-import-resolver-typescript@3.10.1)(eslint@9.39.2(jiti@2.6.1)) transitivePeerDependencies: - supports-color - eslint-module-utils@2.12.1(@typescript-eslint/parser@8.50.0(eslint@9.39.2(jiti@2.6.1))(typescript@5.9.3))(eslint-import-resolver-node@0.3.9)(eslint-import-resolver-typescript@3.10.1(eslint-plugin-import@2.32.0(@typescript-eslint/parser@8.50.0(eslint@9.39.2(jiti@2.6.1))(typescript@5.9.3))(eslint@9.39.2(jiti@2.6.1)))(eslint@9.39.2(jiti@2.6.1)))(eslint@9.39.2(jiti@2.6.1)): + 
eslint-module-utils@2.12.1(@typescript-eslint/parser@8.50.0(eslint@9.39.2(jiti@2.6.1))(typescript@5.9.3))(eslint-import-resolver-node@0.3.9)(eslint-import-resolver-typescript@3.10.1)(eslint@9.39.2(jiti@2.6.1)): dependencies: debug: 3.2.7 optionalDependencies: '@typescript-eslint/parser': 8.50.0(eslint@9.39.2(jiti@2.6.1))(typescript@5.9.3) eslint: 9.39.2(jiti@2.6.1) eslint-import-resolver-node: 0.3.9 - eslint-import-resolver-typescript: 3.10.1(eslint-plugin-import@2.32.0(@typescript-eslint/parser@8.50.0(eslint@9.39.2(jiti@2.6.1))(typescript@5.9.3))(eslint@9.39.2(jiti@2.6.1)))(eslint@9.39.2(jiti@2.6.1)) + eslint-import-resolver-typescript: 3.10.1(eslint-plugin-import@2.32.0)(eslint@9.39.2(jiti@2.6.1)) transitivePeerDependencies: - supports-color @@ -13050,7 +13202,7 @@ snapshots: doctrine: 2.1.0 eslint: 9.39.2(jiti@2.6.1) eslint-import-resolver-node: 0.3.9 - eslint-module-utils: 2.12.1(@typescript-eslint/parser@8.50.0(eslint@9.39.2(jiti@2.6.1))(typescript@5.9.3))(eslint-import-resolver-node@0.3.9)(eslint-import-resolver-typescript@3.10.1(eslint-plugin-import@2.32.0(@typescript-eslint/parser@8.50.0(eslint@9.39.2(jiti@2.6.1))(typescript@5.9.3))(eslint@9.39.2(jiti@2.6.1)))(eslint@9.39.2(jiti@2.6.1)))(eslint@9.39.2(jiti@2.6.1)) + eslint-module-utils: 2.12.1(@typescript-eslint/parser@8.50.0(eslint@9.39.2(jiti@2.6.1))(typescript@5.9.3))(eslint-import-resolver-node@0.3.9)(eslint-import-resolver-typescript@3.10.1)(eslint@9.39.2(jiti@2.6.1)) hasown: 2.0.2 is-core-module: 2.16.1 is-glob: 4.0.3 @@ -13068,7 +13220,7 @@ snapshots: - eslint-import-resolver-webpack - supports-color - eslint-plugin-import@2.32.0(eslint-import-resolver-typescript@3.10.1)(eslint@9.39.2(jiti@2.6.1)): + eslint-plugin-import@2.32.0(eslint-import-resolver-typescript@3.10.1(eslint-plugin-import@2.32.0(eslint@9.39.2(jiti@2.6.1)))(eslint@9.39.2(jiti@2.6.1)))(eslint@9.39.2(jiti@2.6.1)): dependencies: '@rtsao/scc': 1.1.0 array-includes: 3.1.9 @@ -14436,6 +14588,10 @@ snapshots: dependencies: react: 18.3.1 + 
lucide-react@0.561.0(react@19.2.3): + dependencies: + react: 19.2.3 + lucide-react@0.562.0(react@19.2.1): dependencies: react: 19.2.1 @@ -16312,6 +16468,26 @@ snapshots: transitivePeerDependencies: - supports-color + streamdown@2.1.0(react@19.2.3): + dependencies: + clsx: 2.1.1 + hast-util-to-jsx-runtime: 2.3.6 + html-url-attributes: 3.0.1 + marked: 17.0.1 + react: 19.2.3 + rehype-harden: 1.1.7 + rehype-raw: 7.0.0 + rehype-sanitize: 6.0.0 + remark-gfm: 4.0.1 + remark-parse: 11.0.0 + remark-rehype: 11.1.2 + remend: 1.1.0 + tailwind-merge: 3.4.0 + unified: 11.0.5 + unist-util-visit: 5.0.0 + transitivePeerDependencies: + - supports-color + strict-event-emitter@0.5.1: {} string-argv@0.3.2: {} @@ -16815,6 +16991,10 @@ snapshots: dependencies: react: 18.3.1 + use-stick-to-bottom@1.1.1(react@19.2.3): + dependencies: + react: 19.2.3 + use-sync-external-store@1.6.0(react@18.3.1): dependencies: react: 18.3.1
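The streaming hunk in the new TogetherAI provider accumulates partial `tool_calls` deltas in a `Map` keyed by chunk index, merging `id`, `name`, and `arguments` fragments until a `finish_reason` arrives, at which point the joined argument string is parsed as JSON. A standalone sketch of that accumulation pattern (types and the `accumulateToolCalls` name are illustrative, not part of the SDK's API):

```typescript
// Illustrative sketch of the delta-accumulation pattern from the
// togetherai streaming code above. Partial tool-call chunks arrive
// keyed by `index`; later chunks may carry only argument fragments.

interface ToolCallDelta {
  index: number;
  id?: string;
  function?: { name?: string; arguments?: string };
}

interface AssembledToolCall {
  id: string;
  name: string;
  args: unknown;
}

function accumulateToolCalls(deltas: ToolCallDelta[]): AssembledToolCall[] {
  const map = new Map<number, { id: string; name: string; arguments: string }>();

  for (const tc of deltas) {
    const existing = map.get(tc.index);
    if (!existing) {
      // First chunk for this index seeds the entry.
      map.set(tc.index, {
        id: tc.id ?? "",
        name: tc.function?.name ?? "",
        arguments: tc.function?.arguments ?? "",
      });
    } else {
      // Subsequent chunks fill in missing id/name and append argument text.
      if (tc.id && !existing.id) existing.id = tc.id;
      if (tc.function?.name && !existing.name) existing.name = tc.function.name;
      if (tc.function?.arguments) existing.arguments += tc.function.arguments;
    }
  }

  // On finish_reason the provider parses the joined argument string;
  // an empty string falls back to "{}" just as in the diff.
  return [...map.values()].map((tc) => ({
    id: tc.id,
    name: tc.name,
    args: JSON.parse(tc.arguments || "{}"),
  }));
}
```

Note that, as in the diff, a malformed or truncated argument stream will throw from `JSON.parse`; callers that must survive aborted streams would wrap that parse defensively.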