2 changes: 2 additions & 0 deletions .cursor/debug.log
@@ -0,0 +1,2 @@
{"location":"map-storage-commit-detail.ts:buildCommitDetailPayload","message":"pierre first file raw prefix","data":{"path":"test.ts","state":"deleted","rawLen":134,"rawPrefix":"diff --git a/test.ts b/test.ts\ndeleted file mode 100644\nindex 54b82a0..0000000\n--- a/test.ts\n+++ /dev/null\n@@ -1 +0,0 @@\n-const a = 1;"},"timestamp":1774205454126,"hypothesisId":"H1-H4"}
{"location":"map-storage-commit-detail.ts:buildCommitDetailPayload","message":"pierre first file raw prefix","data":{"path":"test.ts","state":"deleted","rawLen":134,"rawPrefix":"diff --git a/test.ts b/test.ts\ndeleted file mode 100644\nindex 54b82a0..0000000\n--- a/test.ts\n+++ /dev/null\n@@ -1 +0,0 @@\n-const a = 1;"},"timestamp":1774205459285,"hypothesisId":"H1-H4"}
96 changes: 90 additions & 6 deletions AGENTS.md
@@ -1,11 +1,95 @@
# Better Hub AGENTS.md
# Better Hub -- Agent Documentation

## Production Information
Better Hub is a reimagined GitHub UI for code collaboration, built by the Better Auth team. It is a Next.js 16 / React 19 monorepo that proxies the GitHub API, adds AI features (Ghost assistant), and provides a faster, keyboard-driven experience for repos, PRs, issues, and CI/CD.

- The origin for better-hub is: https://better-hub.com
in
## Production

- Live site: https://better-hub.com

## Design

- Try to follow the design of the rest of the site as much as possible.
- Avoid loading spinners and prefer skeleton UI for loading states.
- Match the design language of the rest of the app.
- Prefer skeleton UI over loading spinners for loading states.

## How to Use These Docs

The `agent-docs/` directory contains detailed documentation organized by topic. Start with the section most relevant to your task. If you need a broad understanding, read **architecture/overview.md** first.

## Keeping Docs Up to Date

When you make changes to the codebase that affect architecture, patterns, configuration, or conventions documented in `agent-docs/`, update the relevant doc files as part of the same change. Examples of when to update:

- Adding or removing a page, API route, or component directory -- update `architecture/project-structure.md` and `frontend/components.md`
- Changing the database schema -- update `data-layer/database.md`
- Adding or modifying environment variables -- update `infrastructure/environment.md`
- Changing the auth flow, caching strategy, or data-fetching patterns -- update the relevant feature or data-layer doc
- Introducing a new major dependency or tool -- update `architecture/overview.md`
- Adding a new feature area -- consider creating a new doc file under the appropriate directory

If a change is significant enough that it would surprise a future agent reading the current docs, the docs need updating. Keep descriptions concise and factual -- document what exists and how it works, not aspirational plans.

## Documentation Index

### Architecture

- [agent-docs/architecture/overview.md](agent-docs/architecture/overview.md) -- Tech stack, monorepo layout, key dependencies, how systems connect
- [agent-docs/architecture/project-structure.md](agent-docs/architecture/project-structure.md) -- Full directory tree with annotations for every folder and key file
- [agent-docs/architecture/data-flow.md](agent-docs/architecture/data-flow.md) -- Request lifecycle, the `localFirstGitRead` caching pattern, AI data flow, mutation flow

### Features

- [agent-docs/features/ghost-ai.md](agent-docs/features/ghost-ai.md) -- Ghost AI assistant: model routing, ~30 tools, conversation persistence, semantic search, E2B sandboxes
- [agent-docs/features/github-integration.md](agent-docs/features/github-integration.md) -- Octokit REST client, ~30 sync job types, shared vs per-user cache security, rate limiting, OAuth scopes
- [agent-docs/features/pr-reviews.md](agent-docs/features/pr-reviews.md) -- PR review system: diff viewing, inline comments, AI summaries, merge panel, conflict resolution, CI checks
- [agent-docs/features/billing.md](agent-docs/features/billing.md) -- Stripe metered billing, credit system, spending limits, AI model pricing, usage tracking

### Data Layer

- [agent-docs/data-layer/database.md](agent-docs/data-layer/database.md) -- Prisma schema (20 models), connection pool config, migration commands
- [agent-docs/data-layer/caching.md](agent-docs/data-layer/caching.md) -- Multi-tier Redis caching: per-user, shared, repo data (TTL tiers), Vercel cache, DB fallback
- [agent-docs/data-layer/github-sync.md](agent-docs/data-layer/github-sync.md) -- Background sync job system: job lifecycle, deduplication, draining, ETag support

### Authentication

- [agent-docs/auth/authentication.md](agent-docs/auth/authentication.md) -- better-auth config, GitHub OAuth, `getServerSession()`, session/cookie handling, PAT sign-in, scopes

### Frontend

- [agent-docs/frontend/routing.md](agent-docs/frontend/routing.md) -- GitHub-compatible URL rewriting, middleware, git protocol redirects, route groups
- [agent-docs/frontend/components.md](agent-docs/frontend/components.md) -- Component organization by domain (~150 components), server vs client patterns
- [agent-docs/frontend/ui-patterns.md](agent-docs/frontend/ui-patterns.md) -- TailwindCSS 4, Radix UI, CVA, Shiki, TipTap, theming, keyboard shortcuts, hooks

### Infrastructure

- [agent-docs/infrastructure/development.md](agent-docs/infrastructure/development.md) -- Local setup, Docker Compose, dev scripts, linting, TypeScript config, testing
- [agent-docs/infrastructure/deployment.md](agent-docs/infrastructure/deployment.md) -- Vercel hosting, GitHub Actions CI, Sentry, security headers, build process
- [agent-docs/infrastructure/environment.md](agent-docs/infrastructure/environment.md) -- All environment variables categorized with descriptions

## Quick Reference

### Critical Files

| File | Purpose |
| ---------------------------------------- | ----------------------------------------------------- |
| `apps/web/src/lib/github.ts` | All GitHub API data fetching (~7300 lines) |
| `apps/web/src/lib/auth.ts` | Authentication config and `getServerSession()` |
| `apps/web/src/lib/db.ts` | Database client and connection pool |
| `apps/web/src/proxy.ts` | Middleware (auth + URL rewriting) |
| `apps/web/src/app/api/ai/ghost/route.ts` | Ghost AI endpoint (~3500 lines) |
| `apps/web/prisma/schema/` | Database schema (multi-file: auth.prisma, app.prisma) |
| `apps/web/next.config.ts` | Next.js configuration |
| `apps/web/.env.example` | Environment variable template |

### Common Tasks

**Adding a new page**: Create `page.tsx` in `apps/web/src/app/(app)/your-route/`. It will automatically get the app layout (navbar, Ghost, auth).

**Adding a new API route**: Create `route.ts` in `apps/web/src/app/api/your-endpoint/`. Use `getServerSession()` for auth and `getOctokitFromSession()` for GitHub API access.

**Adding a new component**: Place in the appropriate feature directory under `apps/web/src/components/`. Use `"use client"` only if the component needs interactivity.

**Fetching GitHub data**: Add a function in `apps/web/src/lib/github.ts` using the `localFirstGitRead` pattern. Define the cache key, job type, and remote fetcher.

**Adding a database model**: Add to the appropriate file in `apps/web/prisma/schema/` (`auth.prisma` for better-auth tables, `app.prisma` for everything else), run `bunx prisma migrate dev --name your_migration`, then `bunx prisma generate`.

**Running checks before PR**: Run `bun check` from the repo root (lint + format + typecheck).
153 changes: 153 additions & 0 deletions agent-docs/architecture/data-flow.md
@@ -0,0 +1,153 @@
# Data Flow

This document describes how data moves through Better Hub from an incoming HTTP request to a rendered page.

## Request Lifecycle

### 1. Middleware (`src/proxy.ts`)

Every request first passes through the Next.js middleware, which handles three concerns:

**Git protocol redirect** -- If the URL matches a git service path (`info/refs?service=git-upload-pack`, `git-receive-pack`), the request is redirected to `github.com` with a 307. This lets `git clone` and `git push` work transparently.

**Authentication** -- Public paths (`/`, `/api/auth`, `/api/inngest`) are allowed through. All other paths require a `better-auth` session cookie. Missing sessions redirect to `/`.

**URL rewriting** -- GitHub-compatible URLs are rewritten to internal App Router paths:
- `/:owner/:repo` -> `/repos/:owner/:repo`
- `/:owner/:repo/pull/:number` -> `/repos/:owner/:repo/pulls/:number`
- `/:owner/:repo/commit/:sha` -> `/repos/:owner/:repo/commits/:sha`
- `/:owner/:repo/compare/base...head` -> `/repos/:owner/:repo/pulls/new?base=&head=`

The `APP_ROUTES` set prevents rewriting known first-segment routes (`dashboard`, `repos`, `api`, `_next`, etc.).
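
The rewrite rules above can be sketched as a pure function. This is a minimal illustration, not the actual `proxy.ts` code: the function name `rewriteGitHubPath` is hypothetical, the `APP_ROUTES` contents are limited to the routes named in the text, and returning `null` for any other path is an assumption of the sketch.

```typescript
// Hypothetical sketch of the rewrite rules listed above; not the real proxy.ts.
// Only the first-segment routes named in the text are included here.
const APP_ROUTES = new Set(["dashboard", "repos", "api", "_next"]);

function rewriteGitHubPath(pathname: string): string | null {
  const segments = pathname.split("/").filter((s) => s.length > 0);
  const owner = segments[0];
  const repo = segments[1];
  if (owner === undefined || repo === undefined) return null;

  // Known first-segment app routes are never rewritten.
  if (APP_ROUTES.has(owner)) return null;

  const base = `/repos/${owner}/${repo}`;
  const kind = segments[2];
  const rest = segments.slice(3);
  if (kind === undefined) return base;

  if (kind === "pull" && rest.length === 1) return `${base}/pulls/${rest[0]}`;
  if (kind === "commit" && rest.length === 1) return `${base}/commits/${rest[0]}`;
  if (kind === "compare" && rest.length >= 1) {
    // Branch names may contain "/", so rejoin the remaining segments first.
    const range = rest.join("/");
    const sep = range.indexOf("...");
    if (sep === -1) return null;
    const baseRef = encodeURIComponent(range.slice(0, sep));
    const headRef = encodeURIComponent(range.slice(sep + 3));
    return `${base}/pulls/new?base=${baseRef}&head=${headRef}`;
  }
  return null; // Assumption: every other path is left untouched in this sketch.
}
```

In a real middleware the returned path would presumably be handed to a rewrite response; the sketch stops at computing the target so that it stays independent of the framework.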

### 2. App Layout (`src/app/(app)/layout.tsx`)

The `(app)` route group layout runs for every authenticated page:

1. Calls `getServerSession()`, which is wrapped in React `cache()` for request deduplication
2. If no session exists, redirects to `/` with a `?redirect=` parameter
3. Fetches notifications via `getNotifications()`
4. Checks onboarding status and star state for first-run overlay
5. Wraps children in providers: `NuqsAdapter`, `GlobalChatProvider`, `MutationEventProvider`, `ColorThemeProvider`, `GitHubLinkInterceptor`, `TooltipProvider`
6. Renders: navbar, navigation progress bar, nav-aware content area, Ghost chat panel, onboarding overlay

### 3. Repo Layout (`src/app/(app)/repos/[owner]/[repo]/layout.tsx`)

For repository pages, a nested layout does the following:

1. Fetches repo page data via `getRepoPageData()` (repo metadata, nav counts, star status, org membership, latest commit)
2. Loads cached data in parallel: file tree, contributor avatars, languages, branches, tags
3. Prefetches PR data in the background via `waitUntil(prefetchPRData())`
4. Renders: sidebar (description, stats, contributors, languages), repo nav tabs, code content wrapper with file tree and branch selector

### 4. Page Components

Individual pages fetch their specific data using functions from `src/lib/github.ts`. These all use the `localFirstGitRead` pattern described below.

## GitHub Data Fetching: `localFirstGitRead` Pattern

The core data-fetching pattern prioritizes local cache for speed while keeping data fresh via background sync:

```
1. Check Redis cache (gh:{userId}:{cacheKey})
   ├── HIT with fresh data → return immediately
   └── MISS or stale
2. Check shared cache (for public data types)
   ├── HIT → return + enqueue background sync job
   └── MISS
3. Fetch from GitHub API (Octokit)
   ├── SUCCESS → update cache, return data
   └── FAILURE (rate limit, network)
4. Fall back to DB cache (github_cache_entries table)
   ├── HIT → return stale data
   └── MISS → return fallback value
```

**Security model**: Only certain data types are shareable across users (branches, tags, releases, issues, PRs, contributors, etc. -- defined in `SHAREABLE_CACHE_TYPES`). Private repo data and user-specific data are always scoped to the requesting user's cache key.
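
The tiered read and the sharing rule can be sketched as follows. This is a rough illustration under stated assumptions, not the code in `github.ts`: in-memory `Map`s stand in for Redis and the `github_cache_entries` table, the name `readLocalFirst` and the `ReadOptions` shape are hypothetical, and `SHAREABLE_CACHE_TYPES` lists only some of the types the text names.

```typescript
// Hypothetical sketch of the tiered read described above. Maps stand in for
// Redis and the github_cache_entries table; this is not the real implementation.
type Entry<T> = { data: T; storedAt: number };

// Assumption: a subset of the shareable types named in the text.
const SHAREABLE_CACHE_TYPES = new Set(["branches", "tags", "releases", "contributors"]);

interface ReadOptions<T> {
  userId: string;
  cacheKey: string;
  cacheType: string;
  isPrivateRepo: boolean;
  maxAgeMs: number;
  fallback: T;
  fetchRemote: () => Promise<T>;
  enqueueSync: (userId: string, cacheKey: string) => void;
}

const userCache = new Map<string, Entry<unknown>>();
const sharedCache = new Map<string, Entry<unknown>>();
const dbCache = new Map<string, Entry<unknown>>();

async function readLocalFirst<T>(opts: ReadOptions<T>): Promise<T> {
  const userKey = `gh:${opts.userId}:${opts.cacheKey}`;
  const sharedKey = `shared:${opts.cacheKey}`;
  const now = Date.now();

  // 1. Per-user cache: return immediately when the entry is fresh.
  const own = userCache.get(userKey) as Entry<T> | undefined;
  if (own && now - own.storedAt <= opts.maxAgeMs) return own.data;

  // 2. Shared cache: only for shareable types, never for private repos.
  const shareable = !opts.isPrivateRepo && SHAREABLE_CACHE_TYPES.has(opts.cacheType);
  if (shareable) {
    const shared = sharedCache.get(sharedKey) as Entry<T> | undefined;
    if (shared) {
      opts.enqueueSync(opts.userId, opts.cacheKey); // refresh in the background
      return shared.data;
    }
  }

  try {
    // 3. GitHub API: on success, update the caches and return.
    const data = await opts.fetchRemote();
    const entry: Entry<T> = { data, storedAt: now };
    userCache.set(userKey, entry);
    dbCache.set(userKey, entry); // Assumption: the DB copy is written on fetch too.
    if (shareable) sharedCache.set(sharedKey, entry);
    return data;
  } catch {
    // 4. DB fallback: stale data if present, otherwise the fallback value.
    const stored = dbCache.get(userKey) as Entry<T> | undefined;
    return stored ? stored.data : opts.fallback;
  }
}
```

Because the per-user key always includes `userId`, a private-repo entry can only be read back by the user who fetched it; the shared tier is consulted and written only when the `shareable` check passes.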

### Background Sync Jobs

When data is served from cache and may be stale, a sync job is enqueued:

1. Jobs are deduplicated by `(userId, dedupeKey)` -- only one pending job per user per data type
2. The `drainGithubSyncJobs()` function claims and processes jobs for a user
3. Jobs are processed with the user's GitHub token, updating both Redis and DB caches
4. Failed jobs are retried up to 8 times with backoff
5. Running jobs have a 10-minute timeout to prevent stuck jobs
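
The job rules above can be modelled with a small in-memory queue. This is a sketch, not the `drainGithubSyncJobs()` implementation: the `SyncQueue` class and its method names are hypothetical, and the exponential backoff schedule is an assumption because the text says only "with backoff". The sketch also reads "up to 8 times" as eight attempts in total and treats a running job as a duplicate.

```typescript
// Hypothetical in-memory model of the job rules above; not the real queue.
type JobStatus = "pending" | "running" | "failed";

interface SyncJob {
  userId: string;
  dedupeKey: string;
  status: JobStatus;
  attempts: number;
  nextRunAt: number;
  startedAt: number | null;
}

const MAX_ATTEMPTS = 8; // from the text, read here as total attempts
const RUNNING_TIMEOUT_MS = 10 * 60 * 1000; // from the text
const BASE_BACKOFF_MS = 1000; // assumption: the schedule is not given in the source

class SyncQueue {
  private jobs = new Map<string, SyncJob>();

  /** Returns false when an active job already exists for (userId, dedupeKey). */
  enqueue(userId: string, dedupeKey: string, now: number): boolean {
    const id = `${userId}:${dedupeKey}`;
    const existing = this.jobs.get(id);
    if (existing && existing.status !== "failed") return false;
    this.jobs.set(id, { userId, dedupeKey, status: "pending", attempts: 0, nextRunAt: now, startedAt: null });
    return true;
  }

  /** Claims due pending jobs for a user and re-claims jobs stuck past the timeout. */
  claim(userId: string, now: number): SyncJob[] {
    const claimed: SyncJob[] = [];
    this.jobs.forEach((job) => {
      if (job.userId !== userId) return;
      const due = job.status === "pending" && job.nextRunAt <= now;
      const stuck =
        job.status === "running" && job.startedAt !== null && now - job.startedAt > RUNNING_TIMEOUT_MS;
      if (due || stuck) {
        job.status = "running";
        job.startedAt = now;
        job.attempts += 1;
        claimed.push(job);
      }
    });
    return claimed;
  }

  /** Removes a job that finished successfully. */
  complete(job: SyncJob): void {
    this.jobs.delete(`${job.userId}:${job.dedupeKey}`);
  }

  /** Schedules a retry with exponential backoff, or gives up after MAX_ATTEMPTS. */
  fail(job: SyncJob, now: number): void {
    if (job.attempts >= MAX_ATTEMPTS) {
      job.status = "failed";
      return;
    }
    job.status = "pending";
    job.startedAt = null;
    job.nextRunAt = now + BASE_BACKOFF_MS * Math.pow(2, job.attempts - 1);
  }
}
```

Passing `now` explicitly keeps the model deterministic, which makes the timeout and backoff rules easy to exercise in tests.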

## AI Data Flow

### Ghost Chat (`/api/ai/ghost`)

```
Client message
├── Check usage limits (credits, spending cap)
├── Resolve model (user preference or "auto" → default model)
├── Load/create conversation from DB
streamText() with tools
├── GitHub tools (via user's Octokit)
│ ├── get_repo_info, list_issues, list_prs
│ ├── get_issue, get_pull_request, get_file_content
│ ├── create_issue, create_pr, add_comment
│ ├── merge_pr, create_branch, update_file
│ └── ... (~30 tools)
├── Search tools
│ ├── search_repos, search_code, search_issues
│ └── semantic_search (Mixedbread embeddings + reranking)
├── Code execution (E2B sandbox)
└── Navigation tools (generate Better Hub URLs)
Stream response to client
├── Save messages to DB (chat_messages)
├── Log token usage (ai_call_logs + usage_logs)
└── Report to Stripe (metered billing)
```
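
The first two steps of the flow above (usage check and model resolution) might look roughly like this. Everything in the sketch is an assumption made for illustration: the `UsageState` fields, the order in which the spending cap and credits are checked, and the placeholder `DEFAULT_MODEL` id are not taken from the Better Hub codebase.

```typescript
// Hypothetical pre-flight checks for a Ghost request. Field names, check order,
// and the default model id are assumptions, not values from the codebase.
interface UsageState {
  creditsRemaining: number;
  spentThisPeriodUsd: number;
  spendingCapUsd: number | null; // null means no cap is configured
}

type UsageDecision =
  | { allowed: true }
  | { allowed: false; reason: "no_credits" | "spending_cap" };

function checkUsageLimits(usage: UsageState): UsageDecision {
  if (usage.spendingCapUsd !== null && usage.spentThisPeriodUsd >= usage.spendingCapUsd) {
    return { allowed: false, reason: "spending_cap" };
  }
  if (usage.creditsRemaining <= 0) {
    return { allowed: false, reason: "no_credits" };
  }
  return { allowed: true };
}

const DEFAULT_MODEL = "example/default-model"; // placeholder id

// A missing preference or "auto" falls back to the default model.
function resolveModel(preference: string | undefined): string {
  return preference === undefined || preference === "auto" ? DEFAULT_MODEL : preference;
}
```

Returning a tagged decision rather than a boolean lets the endpoint tell the client why a request was refused.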

### Embedding Pipeline (Background)

```
User views PR/Issue
Inngest event: app/content.viewed
embedContent function
├── Embed title + body (Mixedbread mxbai-embed-large-v1)
├── Embed comments in batches of 20
├── Embed reviews
└── Store in search_embeddings table (with content hash for dedup)
```
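
Two details of this pipeline, batching comments in groups of 20 and skipping unchanged content via a content hash, can be sketched without any SDK. The helper names are hypothetical, and the FNV-1a-style hash is a stand-in chosen to keep the example dependency-free; the source does not say which hash Better Hub uses.

```typescript
// Hypothetical helpers for the pipeline above. The FNV-1a-style hash is a
// stand-in; the source does not specify the real hashing scheme.
const COMMENT_BATCH_SIZE = 20; // from the text

function chunk<T>(items: readonly T[], size: number): T[][] {
  if (size <= 0) throw new RangeError("size must be positive");
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// 32-bit FNV-1a over UTF-16 code units, rendered as hex.
function contentHash(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16);
}

interface Embeddable {
  id: string;
  text: string;
}

/** Returns only the items whose content differs from the stored hash. */
function selectForEmbedding(
  items: readonly Embeddable[],
  storedHashes: ReadonlyMap<string, string>,
): (Embeddable & { hash: string })[] {
  const pending: (Embeddable & { hash: string })[] = [];
  for (const item of items) {
    const hash = contentHash(item.text);
    if (storedHashes.get(item.id) !== hash) pending.push({ ...item, hash });
  }
  return pending;
}
```

A caller would first filter with `selectForEmbedding` and then pass the survivors through `chunk(pending, COMMENT_BATCH_SIZE)`, so unchanged comments never reach the embedding API.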

## Caching Tiers

| Tier | TTL | Use Case | Key Pattern |
|---|---|---|---|
| Per-user Redis | Varies | User-specific GitHub data | `gh:{userId}:{cacheKey}` |
| Shared Redis | Varies | Public repo data (branches, tags, etc.) | `shared:{cacheKey}` |
| Repo data cache | 24h / 1h / 5min | Languages, branches, file tree, events | `repo_*:{owner}/{repo}` |
| Vercel cache | `unstable_cache` | Server component data revalidation | Function-based |
| DB cache | Permanent | Fallback when GitHub API is unavailable | `github_cache_entries` table |
| README cache | Medium TTL | Rendered README content | `readme:{owner}/{repo}` |
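
The key patterns in the table can be captured in a few small builders. The function names are hypothetical, and substituting a data kind for the `*` in `repo_*` is an assumption about how that pattern is filled in. The TTL constants restate the three repo-data tiers in seconds; the table does not say which data kind uses which tier, so no mapping is given.

```typescript
// Hypothetical key builders following the key patterns in the table above.
function userCacheKey(userId: string, cacheKey: string): string {
  return `gh:${userId}:${cacheKey}`;
}

function sharedCacheKey(cacheKey: string): string {
  return `shared:${cacheKey}`;
}

// Assumption: the "*" in `repo_*` is replaced by the data kind.
function repoDataCacheKey(kind: string, owner: string, repo: string): string {
  return `repo_${kind}:${owner}/${repo}`;
}

function readmeCacheKey(owner: string, repo: string): string {
  return `readme:${owner}/${repo}`;
}

// The three repo-data TTL tiers from the table, in seconds.
const REPO_DATA_TTL_SECONDS = {
  day: 24 * 60 * 60,
  hour: 60 * 60,
  fiveMinutes: 5 * 60,
} as const;
```

Keeping key construction in one place makes it harder for a per-user key to collide with a shared key by accident.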

## Mutation Flow

When users perform write actions (create issue, merge PR, add comment), the flow is:

1. Client calls a mutation function (often via `use-mutation.ts` hook)
2. The function calls the GitHub API directly via Octokit
3. On success, relevant caches are invalidated (`invalidateIssueCache`, `invalidatePullRequestCache`, etc.)
4. A mutation event is dispatched via `MutationEventProvider` to update other components on the page
5. `use-mutation-subscription.ts` hooks in other components react to the event and refetch data
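
Steps 4 and 5 amount to a publish/subscribe channel. The sketch below shows the idea with a plain class; it is not the `MutationEventProvider` implementation, and the `MutationBus` name and the event shape are hypothetical.

```typescript
// Hypothetical minimal event bus illustrating steps 4 and 5 above.
// The event shape is an assumption made for this sketch.
interface MutationEvent {
  type: "issue:created" | "pr:merged" | "comment:added";
  owner: string;
  repo: string;
  number: number;
}

type MutationListener = (event: MutationEvent) => void;

class MutationBus {
  private listeners = new Set<MutationListener>();

  /** Registers a listener and returns a function that removes it again. */
  subscribe(listener: MutationListener): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  /** Delivers the event to every current listener. */
  emit(event: MutationEvent): void {
    this.listeners.forEach((listener) => listener(event));
  }
}

// Usage: a component refetches when a PR in its repo is merged.
const bus = new MutationBus();
let refetchCount = 0;
const unsubscribe = bus.subscribe((event) => {
  if (event.type === "pr:merged" && event.owner === "o" && event.repo === "r") {
    refetchCount += 1; // stands in for a real refetch
  }
});
bus.emit({ type: "pr:merged", owner: "o", repo: "r", number: 7 });
unsubscribe();
```

Returning the unsubscribe function from `subscribe` fits the cleanup style of React effects, which is presumably how a subscription hook would use it.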
95 changes: 95 additions & 0 deletions agent-docs/architecture/overview.md
@@ -0,0 +1,95 @@
# Architecture Overview

Better Hub is a reimagined GitHub UI for code collaboration, built by the Better Auth team. It provides a faster, more pleasant experience for browsing repos, reviewing PRs, triaging issues, and interacting with an AI assistant (Ghost).

Production URL: `https://www.better-hub.com`

## Tech Stack

| Layer | Technology | Version |
|---|---|---|
| Framework | Next.js (App Router) | 16 |
| UI | React | 19 |
| Styling | TailwindCSS | 4 |
| Component primitives | Radix UI | - |
| ORM | Prisma | 7 |
| Database | PostgreSQL | 16 |
| Cache | Redis via Upstash REST | 7 |
| Package manager | Bun | 1.3.5 |
| Linter | oxlint | - |
| Formatter | oxfmt | - |
| Language | TypeScript (strict) | 5.7+ |
| Error tracking | Sentry | 10 |
| Hosting | Vercel | - |

## Monorepo Layout

The repo uses Bun workspaces with three packages:

- **`apps/web`** -- The main Next.js application. Contains all pages, API routes, components, and server-side logic.
- **`packages/chrome-extension`** -- Chrome Manifest V3 extension that adds "Open in Better Hub" buttons on GitHub pages and optionally redirects GitHub URLs.
- **`packages/firefox-extension`** -- Firefox equivalent of the Chrome extension.

## Key Dependencies

### GitHub Integration
- `@octokit/rest` -- REST client for all GitHub API calls
- `better-auth` -- Authentication library (built by the same team) with GitHub OAuth

### AI / ML
- `@openrouter/ai-sdk-provider` + `ai` (Vercel AI SDK) -- Model routing and streaming for Ghost AI
- `@ai-sdk/anthropic` -- Anthropic provider for specific AI tasks
- `@mixedbread-ai/sdk` -- Embedding generation and reranking for semantic search
- `supermemory` -- Long-term AI conversation memory
- `e2b` -- Sandboxed code execution environments

### Billing
- `stripe` -- Metered billing and subscriptions
- `@better-auth/stripe` -- Stripe plugin for better-auth

### Background Jobs
- `inngest` -- Durable background functions (embedding content, retrying Stripe usage reports)

### UI
- `radix-ui` -- Accessible UI primitives (dialog, dropdown, tooltip, popover, etc.)
- `cmdk` -- Command palette (`Cmd+K`)
- `shiki` -- Syntax highlighting for code blocks and diffs
- `@tiptap/*` -- Rich text editor for comments and markdown
- `motion` -- Animations
- `lucide-react` -- Icons
- `react-markdown` + remark/rehype plugins -- Markdown rendering
- `nuqs` -- URL query state management
- `next-themes` -- Theme switching

### Data
- `@prisma/client` + `@prisma/adapter-pg` -- ORM with native PostgreSQL adapter
- `@upstash/redis` -- Redis REST client for caching
- `pg` -- PostgreSQL connection pool
- `zod` -- Schema validation

## How They Connect

```
Browser ──► Next.js Middleware (proxy.ts)
├── URL rewriting (GitHub-compatible routes)
├── Authentication check (better-auth session cookie)
App Router (React Server Components)
├── getServerSession() ──► better-auth ──► PostgreSQL
├── GitHub data fetching ──► localFirstGitRead pattern
│ │
│ ├── Redis cache (Upstash)
│ ├── DB cache (github_cache_entries)
│ └── GitHub API (Octokit) + background sync jobs
├── AI endpoints ──► OpenRouter / Anthropic
│ │
│ ├── Tool calls (GitHub API via Octokit)
│ ├── E2B sandbox (code execution)
│ └── Semantic search (Mixedbread embeddings)
└── Billing ──► Stripe (metered usage)
```