Streamyfin AI Bot

A high-performance Discord AI bot built with Bun and Drizzle ORM, featuring codebase embeddings, GitHub integration, and conversation history.

Tech Stack

Bun - Ultra-fast JavaScript runtime & server
Drizzle ORM - TypeScript-first SQL ORM
PostgreSQL + pgvecto.rs (VectorChord) - High-performance vector database supporting 3072-dimensional embeddings
discord.js - Discord bot client
Vercel AI SDK - OpenAI text-embedding-3-large integration
Octokit - GitHub REST API integration
Docker - Containerized deployment

Features

🔍 Semantic Code Search - Natural language codebase queries
👥 Repository Metadata - Automatically indexes contributors, maintainers, and project stats
💬 Context-Aware Chat - Maintains last 100 messages per channel
🔗 GitHub Integration - Fetch issues and PRs in real time
⚡ Real-time Processing - Immediate message responses
🚫 Read-Only - Information only, never modifies code
🐳 Docker Ready - Easy deployment with docker-compose
📊 Drizzle Studio - Visual database management
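
To illustrate the "last 100 messages per channel" behaviour listed above, here is a minimal in-memory sketch. It is not the bot's implementation: the bot stores history in the message_history table, and the ChannelHistory class and ChatMessage type are hypothetical names used only for this example.

```typescript
// Hypothetical in-memory sketch; the real bot persists history in PostgreSQL.
const MAX_MESSAGES_PER_CHANNEL = 100; // limit stated in the feature list

interface ChatMessage {
  messageId: string;
  content: string;
}

class ChannelHistory {
  private readonly byChannel = new Map<string, ChatMessage[]>();
  private readonly limit: number;

  constructor(limit: number = MAX_MESSAGES_PER_CHANNEL) {
    this.limit = limit;
  }

  // Append a message and drop the oldest entries beyond the limit.
  add(channelId: string, message: ChatMessage): void {
    const messages = this.byChannel.get(channelId) ?? [];
    messages.push(message);
    if (messages.length > this.limit) {
      messages.splice(0, messages.length - this.limit);
    }
    this.byChannel.set(channelId, messages);
  }

  // Return the retained messages for a channel, oldest first.
  get(channelId: string): readonly ChatMessage[] {
    return this.byChannel.get(channelId) ?? [];
  }
}
```

Each channel keeps its own window, so a busy channel never evicts messages from a quiet one.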

Documentation

Quick Start

Prerequisites

Bun installed
Docker & Docker Compose
Discord Bot Token
OpenAI API Key
GitHub Personal Access Token

1. Install

bun install

2. Configure Environment

cp env.example .env

Fill in your credentials in .env
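
A startup check can catch missing credentials before the bot tries to connect. The sketch below is only an illustration: DISCORD_TOKEN appears in this README, while OPENAI_API_KEY and GITHUB_TOKEN are assumed names, so check env.example for the real ones. findMissingEnv is a hypothetical helper, not part of the project.

```typescript
// Variable names other than DISCORD_TOKEN are assumptions; check env.example.
const REQUIRED_ENV_VARS = ["DISCORD_TOKEN", "OPENAI_API_KEY", "GITHUB_TOKEN"];

type Env = Record<string, string | undefined>;

// Return the names of required variables that are unset or blank.
function findMissingEnv(env: Env, required: string[] = REQUIRED_ENV_VARS): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

Under Bun, calling findMissingEnv(process.env) at startup and exiting when the result is non-empty would give a clearer error than a failed login.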

3. Start Infrastructure

docker-compose -f docker-compose.local.yml up -d

4. Generate Drizzle Schema & Migrate

# Generate migrations from schema
bun run db:generate

# Run migrations
bun run db:migrate

5. Start Bot

bun start

6. Sync Streamyfin Codebase

Once the bot is running, trigger the sync:

curl -X POST http://localhost:3000/api/streamyfin/sync

This will fetch and embed the entire Streamyfin codebase from GitHub automatically (no local cloning needed).

7. Interact in Discord

Mention your bot:

@YourBot what is the authentication flow?

Commands

bun start                # Start bot server
bun dev                  # Start with auto-reload
bun run db:generate      # Generate Drizzle migrations
bun run db:migrate       # Run database migrations
bun run db:studio        # Open Drizzle Studio (database GUI)
bun run embeddings:init  # Generate embeddings
bun run bot:test         # Test bot connection only

Drizzle Studio

Visual database management UI:

bun run db:studio

Opens at https://local.drizzle.studio

View and edit:

Embeddings
Message history
GitHub cache

Database Schema (Drizzle)

Located in src/lib/db/schema.ts:

// Embeddings with vector search
export const embeddings = pgTable('embeddings', {
  id: serial('id').primaryKey(),
  filePath: text('file_path').notNull(),
  content: text('content').notNull(),
  vector: vector('vector', { dimensions: 3072 }), // matches text-embedding-3-large
  metadata: jsonb('metadata'),
  // ...
});

// Message history
export const messageHistory = pgTable('message_history', {
  id: serial('id').primaryKey(),
  channelId: text('channel_id').notNull(),
  messageId: text('message_id').notNull(),
  content: text('content').notNull(),
  // ...
});

// GitHub cache
export const githubCache = pgTable('github_cache', {
  id: serial('id').primaryKey(),
  cacheKey: text('cache_key').notNull().unique(),
  data: jsonb('data').notNull(),
  // ...
});

API Endpoints

Health Check

curl http://localhost:3000/health

Search Embeddings

curl -X POST http://localhost:3000/api/embeddings/search \
  -H "Content-Type: application/json" \
  -d '{"query": "authentication", "limit": 5}'

Test Discord Message (Development)

curl -X POST http://localhost:3000/api/test/message \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Who is cagemaster?",
    "channelId": "test-channel",
    "username": "test-user"
  }'

This endpoint simulates a Discord message without needing Discord. Perfect for:

Testing bot responses locally
Debugging AI behavior
CI/CD integration tests
Rapid development iteration

Sync Streamyfin Codebase

curl -X POST http://localhost:3000/api/streamyfin/sync \
  -H "Content-Type: application/json" \
  -d '{
    "owner": "fredrikburmester",
    "repo": "streamyfin",
    "branch": "develop",
    "forceRegenerate": false
  }'

This will:

Fetch repository metadata (contributors, stats, etc.)
Fetch all repository files via GitHub API
Generate embeddings for all files (on-demand, no cloning)
Make the codebase searchable by the bot

Benefits:

✅ No disk space needed
✅ Always up-to-date
✅ No git dependencies
✅ Faster initial setup
✅ Indexes contributor information
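
The same sync call can be made from TypeScript. The sketch below only builds the request, mirroring the curl example above. buildSyncRequest, SyncOptions, and JsonPostRequest are hypothetical names, and the base URL default assumes the local server from the Quick Start.

```typescript
// Request body fields, as shown in the curl example.
interface SyncOptions {
  owner: string;
  repo: string;
  branch: string;
  forceRegenerate: boolean;
}

interface JsonPostRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Build the URL and fetch options for POST /api/streamyfin/sync.
function buildSyncRequest(
  options: SyncOptions,
  baseUrl: string = "http://localhost:3000", // assumed local default
): JsonPostRequest {
  return {
    url: `${baseUrl}/api/streamyfin/sync`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(options),
    },
  };
}
```

Passing the result to fetch(url, init) sends the request. The response format is not documented here, so inspect it before relying on any particular field.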

Generate Embeddings (For any repo)

curl -X POST http://localhost:3000/api/embeddings/generate \
  -H "Content-Type: application/json" \
  -d '{
    "owner": "username",
    "repo": "repository",
    "branch": "main",
    "forceRegenerate": false
  }'

Discord Bot Capabilities

The bot can:

Search codebase: "Find the user authentication code"
Get file content: "Show me the database schema"
Answer about contributors: "Who is cagemaster?", "Who are the main developers?"
Repository info: "How many contributors does the project have?"
List GitHub issues: "What are the open issues?"
Get specific issue: "Tell me about issue #123"
List PRs: "What pull requests are open?"
Get specific PR: "Details on PR #45"

Discord Setup

Go to Discord Developer Portal
Create application → Add Bot
Enable intents:
- ✅ MESSAGE CONTENT INTENT
- ✅ SERVER MEMBERS INTENT
Get credentials for .env
Invite bot with this URL:

https://discord.com/api/oauth2/authorize?client_id=YOUR_APP_ID&permissions=274878221376&scope=bot
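
If you manage several bot applications, the invite link can be generated instead of edited by hand. This is a small sketch with a hypothetical buildInviteUrl helper. The permissions value is copied from the URL above, and the digits-only check assumes the application ID is a numeric Discord snowflake.

```typescript
const BOT_PERMISSIONS = "274878221376"; // permissions value from the URL above

// Build the OAuth2 invite URL for a given application (client) ID.
function buildInviteUrl(clientId: string, permissions: string = BOT_PERMISSIONS): string {
  if (!/^\d+$/.test(clientId)) {
    throw new Error(`client ID must be numeric, got "${clientId}"`);
  }
  const query =
    `client_id=${encodeURIComponent(clientId)}` +
    `&permissions=${encodeURIComponent(permissions)}` +
    `&scope=bot`;
  return `https://discord.com/api/oauth2/authorize?${query}`;
}
```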

Project Structure

/
├── server.ts                 # Main Bun server
├── drizzle.config.ts         # Drizzle configuration
├── drizzle/                  # Generated migrations
├── src/
│   ├── lib/
│   │   ├── ai/              # AI chat, tools, prompts
│   │   ├── db/              
│   │   │   ├── schema.ts    # Drizzle schema
│   │   │   ├── client.ts    # Drizzle client
│   │   │   └── migrate.ts   # Migration runner
│   │   ├── discord/         # Discord bot client
│   │   ├── embeddings/      # Vector embeddings
│   │   ├── github/          
│   │   │   ├── api.ts       # GitHub API for file operations
│   │   │   └── client.ts    # GitHub API (Octokit) for issues/PRs
│   │   └── message-history/ # Chat history
├── scripts/                  # Utility scripts
└── docker-compose.yml        # Multi-container setup

Production Deployment

Using Docker

docker-compose up -d

On Dokploy

Push to Git repository
Create new app in Dokploy
Select Docker Compose deployment
Set environment variables
Deploy

Why VectorChord (pgvecto.rs)?

✅ High-dimensional vectors - Supports full 3072 dimensions (vs pgvector's 2,000-dimension limit for indexed vectors)
✅ Better performance - Rust-based, optimized for large-scale vector operations
✅ HNSW indexing - Fast approximate nearest neighbor search
✅ PostgreSQL native - Drop-in replacement for pgvector
✅ Better for AI - Perfect for text-embedding-3-large (3072 dims)
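
HNSW returns approximate nearest neighbours. As a rough illustration of the exact search it approximates, the sketch below scores every stored vector against the query and keeps the best k. It assumes cosine similarity as the metric, which this README does not state, and the StoredChunk, ScoredChunk, and topK names are hypothetical. In the bot, this work happens inside PostgreSQL rather than in application code.

```typescript
interface StoredChunk {
  id: string;
  vector: number[];
}

interface ScoredChunk {
  id: string;
  score: number;
}

// Cosine similarity: dot(a, b) / (|a| * |b|); returns 0 for zero vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Exact (brute-force) top-k search: score every chunk, sort, keep k.
function topK(query: number[], chunks: StoredChunk[], k: number): ScoredChunk[] {
  return chunks
    .map((chunk) => ({ id: chunk.id, score: cosineSimilarity(query, chunk.vector) }))
    .sort((left, right) => right.score - left.score)
    .slice(0, k);
}
```

The brute-force version costs one comparison per stored chunk on every query, which is the cost an index such as HNSW is meant to avoid on large tables.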

Why Drizzle?

✅ TypeScript-first - Full type safety
✅ SQL-like API - If you know SQL, you know Drizzle
✅ Zero dependencies - Lightweight and fast
✅ Serverless-ready - Perfect for edge deployments
✅ Drizzle Studio - Visual database management
✅ Auto-migrations - Generate migrations from schema

Performance

⚡ ~3x faster server startup vs Node.js
⚡ ~2x faster HTTP requests
⚡ Native TypeScript support (no transpilation)
⚡ Type-safe queries with Drizzle

Configuration

System Prompt

Edit src/lib/ai/prompt.ts

Embedding Settings

Edit src/lib/embeddings/chunker.ts

Current settings:

Model: text-embedding-3-large (3072 dimensions)
Chunk size: 24,000 characters (~6,000 tokens)
Overlap: 1,200 characters (~300 tokens)
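
To show how chunk size and overlap interact, here is a minimal character-based sketch using the values above. It is not the code in src/lib/embeddings/chunker.ts, which may split on different boundaries. chunkText and its parameters are hypothetical.

```typescript
const CHUNK_SIZE = 24000; // characters, from the settings above
const CHUNK_OVERLAP = 1200; // characters shared between neighbouring chunks

// Split text into fixed-size windows; each window starts (size - overlap)
// characters after the previous one, so neighbours share `overlap` characters.
function chunkText(
  text: string,
  size: number = CHUNK_SIZE,
  overlap: number = CHUNK_OVERLAP,
): string[] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error("size must be positive and overlap must be smaller than size");
  }
  const chunks: string[] = [];
  const step = size - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    // Stop once the current window reaches the end of the text.
    if (start + size >= text.length) break;
  }
  return chunks;
}
```

With these settings each new chunk begins 22,800 characters after the previous one, so text near a chunk boundary appears in both neighbouring chunks.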

Database Schema

Edit src/lib/db/schema.ts, then run bun run db:generate followed by bun run db:migrate to apply the change.

Troubleshooting

Bot offline in Discord

Enable MESSAGE CONTENT INTENT in Discord Dev Portal
Verify DISCORD_TOKEN is correct

Database errors

# Check PostgreSQL
docker ps | grep postgres

# Open Drizzle Studio
bun run db:studio

Regenerate migrations

bun run db:generate
bun run db:migrate

Learn More

Quick Start Guide - Detailed setup instructions
Drizzle ORM Guide - Database operations and patterns

License

MIT
