Streamyfin AI Bot

A high-performance Discord AI bot built with Bun and Drizzle ORM featuring codebase embeddings, GitHub integration, and conversation history.

Tech Stack

  • Bun - Ultra-fast JavaScript runtime & server
  • Drizzle ORM - TypeScript-first SQL ORM
  • PostgreSQL + pgvecto.rs (VectorChord) - High-performance vector database supporting 3072-dimensional embeddings
  • discord.js - Discord bot client
  • Vercel AI SDK - OpenAI text-embedding-3-large integration
  • Octokit - GitHub REST API integration
  • Docker - Containerized deployment

Features

  • 🔍 Semantic Code Search - Natural language codebase queries
  • 👥 Repository Metadata - Automatically indexes contributors, maintainers, and project stats
  • 💬 Context-Aware Chat - Keeps the last 100 messages per channel as conversation context
  • 🔗 GitHub Integration - Fetches issues and PRs in real time
  • ⚡ Real-time Processing - Immediate message responses
  • 🚫 Read-Only - Information only, never modifies code
  • 🐳 Docker Ready - Easy deployment with docker-compose
  • 📊 Drizzle Studio - Visual database management
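
The "last 100 messages per channel" window from the Context-Aware Chat feature can be pictured with a small in-memory sketch. This is an illustration only: the bot stores history in the message_history table, and the ChannelHistory class and its method names are hypothetical, not taken from the codebase.

```typescript
const MAX_MESSAGES_PER_CHANNEL = 100; // limit stated in the feature list

interface StoredMessage {
  messageId: string;
  content: string;
}

// Hypothetical in-memory stand-in for the message_history table.
class ChannelHistory {
  private readonly byChannel = new Map<string, StoredMessage[]>();
  private readonly limit: number;

  constructor(limit: number = MAX_MESSAGES_PER_CHANNEL) {
    this.limit = limit;
  }

  // Append a message and drop the oldest entries beyond the limit.
  add(channelId: string, message: StoredMessage): void {
    const messages = this.byChannel.get(channelId) ?? [];
    messages.push(message);
    if (messages.length > this.limit) {
      messages.splice(0, messages.length - this.limit);
    }
    this.byChannel.set(channelId, messages);
  }

  // Messages for one channel, oldest first.
  recent(channelId: string): readonly StoredMessage[] {
    return this.byChannel.get(channelId) ?? [];
  }
}
```

Each channel keeps its own window, so a busy channel does not evict context from a quiet one.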

Quick Start

Prerequisites

  • Bun installed
  • Docker & Docker Compose
  • Discord Bot Token
  • OpenAI API Key
  • GitHub Personal Access Token

1. Install

bun install

2. Configure Environment

cp env.example .env

Fill in your credentials in .env
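
A quick way to catch a half-filled .env is to check for blank values at startup. The sketch below is an assumption about how that could look. Only DISCORD_TOKEN is named in this README; OPENAI_API_KEY and GITHUB_TOKEN are hypothetical names, so check env.example for the real ones.

```typescript
// Hypothetical variable names: only DISCORD_TOKEN appears in this README.
const REQUIRED_ENV = ["DISCORD_TOKEN", "OPENAI_API_KEY", "GITHUB_TOKEN"] as const;

type Env = Record<string, string | undefined>;

// Returns the names of required variables that are unset or blank.
function missingEnv(env: Env, required: readonly string[] = REQUIRED_ENV): string[] {
  return required.filter((name) => (env[name] ?? "").trim() === "");
}
```

Calling missingEnv(process.env) before starting the bot and exiting when the result is non-empty gives a clearer error than a failed Discord login.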

3. Start Infrastructure

docker-compose -f docker-compose.local.yml up -d

4. Generate Drizzle Schema & Migrate

# Generate migrations from schema
bun run db:generate

# Run migrations
bun run db:migrate

5. Start Bot

bun start

6. Sync Streamyfin Codebase

Once the bot is running, trigger the sync:

curl -X POST http://localhost:3000/api/streamyfin/sync

This will fetch and embed the entire Streamyfin codebase from GitHub automatically (no local cloning needed).

7. Interact in Discord

Mention your bot:

@YourBot what is the authentication flow?

Commands

bun start                # Start bot server
bun dev                  # Start with auto-reload
bun run db:generate      # Generate Drizzle migrations
bun run db:migrate       # Run database migrations
bun run db:studio        # Open Drizzle Studio (database GUI)
bun run embeddings:init  # Generate embeddings
bun run bot:test         # Test bot connection only

Drizzle Studio

Visual database management UI:

bun run db:studio

Opens at https://local.drizzle.studio

View and edit:

  • Embeddings
  • Message history
  • GitHub cache

Database Schema (Drizzle)

Located in src/lib/db/schema.ts:

// Embeddings with vector search
export const embeddings = pgTable('embeddings', {
  id: serial('id').primaryKey(),
  filePath: text('file_path').notNull(),
  content: text('content').notNull(),
  vector: vector('vector', { dimensions: 3072 }), // matches text-embedding-3-large
  metadata: jsonb('metadata'),
  // ...
});

// Message history
export const messageHistory = pgTable('message_history', {
  id: serial('id').primaryKey(),
  channelId: text('channel_id').notNull(),
  messageId: text('message_id').notNull(),
  content: text('content').notNull(),
  // ...
});

// GitHub cache
export const githubCache = pgTable('github_cache', {
  id: serial('id').primaryKey(),
  cacheKey: text('cache_key').notNull().unique(),
  data: jsonb('data').notNull(),
  // ...
});

API Endpoints

Health Check

curl http://localhost:3000/health

Search Embeddings

curl -X POST http://localhost:3000/api/embeddings/search \
  -H "Content-Type: application/json" \
  -d '{"query": "authentication", "limit": 5}'
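
Conceptually, a search like this ranks stored chunks by how close their vectors are to the query's embedding. The sketch below shows that ranking in plain TypeScript. It assumes cosine similarity, which this README does not confirm, and the bot delegates the real work to the vector index in PostgreSQL; EmbeddedChunk and topMatches are hypothetical names.

```typescript
interface EmbeddedChunk {
  filePath: string;
  vector: number[];
}

// Cosine similarity of two equal-length vectors (0 if either is all zeros).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every chunk against the query and keep the best `limit` results.
function topMatches(query: number[], chunks: EmbeddedChunk[], limit: number) {
  return chunks
    .map((chunk) => ({ filePath: chunk.filePath, score: cosineSimilarity(query, chunk.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}
```

A brute-force scan like this is linear in the number of chunks; an approximate index such as HNSW trades a little accuracy to avoid scanning everything.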

Test Discord Message (Development)

curl -X POST http://localhost:3000/api/test/message \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Who is cagemaster?",
    "channelId": "test-channel",
    "username": "test-user"
  }'

This endpoint simulates a Discord message without a Discord connection. It is useful for:

  • Testing bot responses locally
  • Debugging AI behavior
  • CI/CD integration tests
  • Rapid development iteration
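
For integration tests, the same request can be built in TypeScript. This is a minimal sketch: buildTestMessageRequest and HttpRequestSpec are hypothetical helpers, and only the path, headers, and body fields come from the curl example above.

```typescript
interface TestMessage {
  message: string;
  channelId: string;
  username: string;
}

interface HttpRequestSpec {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds the pieces of the POST request without sending it.
function buildTestMessageRequest(
  msg: TestMessage,
  baseUrl: string = "http://localhost:3000",
): HttpRequestSpec {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/api/test/message`, // tolerate trailing slashes
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(msg),
  };
}
```

The result can be passed to fetch as `fetch(spec.url, spec)`. The response format is not documented here, so a test should start by asserting on the HTTP status only.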

Sync Streamyfin Codebase

curl -X POST http://localhost:3000/api/streamyfin/sync \
  -H "Content-Type: application/json" \
  -d '{
    "owner": "fredrikburmester",
    "repo": "streamyfin",
    "branch": "develop",
    "forceRegenerate": false
  }'

This will:

  1. Fetch repository metadata (contributors, stats, etc.)
  2. Fetch all repository files via GitHub API
  3. Generate embeddings for all files (on-demand, no cloning)
  4. Make the codebase searchable by the bot
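
The forceRegenerate flag is not explained in this README. One plausible reading is that a normal sync skips files whose content has not changed since they were last embedded, while forceRegenerate: true re-embeds everything. The sketch below illustrates that assumed behavior; RepoFile and filesToEmbed are hypothetical and do not reflect the bot's actual code.

```typescript
interface RepoFile {
  path: string;
  sha: string; // content hash as reported by the GitHub API
}

// Decide which files need embedding.
// ASSUMPTION: without forceRegenerate, a file is skipped when the stored
// hash for its path matches the current one.
function filesToEmbed(
  files: RepoFile[],
  embeddedShas: Map<string, string>, // path -> sha of the version already embedded
  forceRegenerate: boolean,
): RepoFile[] {
  if (forceRegenerate) return files;
  return files.filter((file) => embeddedShas.get(file.path) !== file.sha);
}
```

Under this reading, repeated syncs only pay embedding costs for new or changed files.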

Benefits:

  • ✅ No disk space needed
  • ✅ Always up-to-date
  • ✅ No git dependencies
  • ✅ Faster initial setup
  • ✅ Indexes contributor information

Generate Embeddings (for any repo)

curl -X POST http://localhost:3000/api/embeddings/generate \
  -H "Content-Type: application/json" \
  -d '{
    "owner": "username",
    "repo": "repository",
    "branch": "main",
    "forceRegenerate": false
  }'

Discord Bot Capabilities

The bot can:

  • Search codebase: "Find the user authentication code"
  • Get file content: "Show me the database schema"
  • Answer about contributors: "Who is cagemaster?", "Who are the main developers?"
  • Repository info: "How many contributors does the project have?"
  • List GitHub issues: "What are the open issues?"
  • Get specific issue: "Tell me about issue #123"
  • List PRs: "What pull requests are open?"
  • Get specific PR: "Details on PR #45"

Discord Setup

  1. Go to Discord Developer Portal
  2. Create application → Add Bot
  3. Enable intents:
    • ✅ MESSAGE CONTENT INTENT
    • ✅ SERVER MEMBERS INTENT
  4. Get credentials for .env
  5. Invite bot with this URL:
https://discord.com/api/oauth2/authorize?client_id=YOUR_APP_ID&permissions=274878221376&scope=bot
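
If you generate invite links for several applications, the URL can be assembled with a small helper. This is a sketch only: buildInviteUrl is a hypothetical function, and the default permissions value is copied from the URL above.

```typescript
const DEFAULT_PERMISSIONS = "274878221376"; // value from the invite URL above

// Builds a bot invite URL for a Discord application ID.
function buildInviteUrl(clientId: string, permissions: string = DEFAULT_PERMISSIONS): string {
  // Discord application IDs are numeric, so reject unreplaced placeholders.
  if (!/^\d+$/.test(clientId)) throw new Error("client ID must be numeric");
  if (!/^\d+$/.test(permissions)) throw new Error("permissions must be numeric");
  return `https://discord.com/api/oauth2/authorize?client_id=${clientId}&permissions=${permissions}&scope=bot`;
}
```

The numeric check catches the common mistake of pasting the URL with YOUR_APP_ID still in place.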

Project Structure

/
├── server.ts                 # Main Bun server
├── drizzle.config.ts         # Drizzle configuration
├── drizzle/                  # Generated migrations
├── src/
│   ├── lib/
│   │   ├── ai/              # AI chat, tools, prompts
│   │   ├── db/              
│   │   │   ├── schema.ts    # Drizzle schema
│   │   │   ├── client.ts    # Drizzle client
│   │   │   └── migrate.ts   # Migration runner
│   │   ├── discord/         # Discord bot client
│   │   ├── embeddings/      # Vector embeddings
│   │   ├── github/          
│   │   │   ├── api.ts       # GitHub API for file operations
│   │   │   └── client.ts    # GitHub API (Octokit) for issues/PRs
│   │   └── message-history/ # Chat history
├── scripts/                  # Utility scripts
└── docker-compose.yml        # Multi-container setup

Production Deployment

Using Docker

docker-compose up -d

On Dokploy

  1. Push to Git repository
  2. Create new app in Dokploy
  3. Select Docker Compose deployment
  4. Set environment variables
  5. Deploy

Why VectorChord (pgvecto.rs)?

  • High-dimensional vectors - Supports full 3072 dimensions (vs pgvector's 2,000-dimension limit for indexed vector columns)
  • Better performance - Rust-based, optimized for large-scale vector operations
  • HNSW indexing - Fast approximate nearest neighbor search
  • PostgreSQL native - Drop-in replacement for pgvector
  • Better for AI - Perfect for text-embedding-3-large (3072 dims)

Why Drizzle?

  • TypeScript-first - Full type safety
  • SQL-like API - If you know SQL, you know Drizzle
  • Zero dependencies - Lightweight and fast
  • Serverless-ready - Perfect for edge deployments
  • Drizzle Studio - Visual database management
  • Auto-migrations - Generate migrations from schema

Performance

  • ⚡ ~3x faster server startup vs Node.js
  • ⚡ ~2x faster HTTP requests
  • ⚡ Native TypeScript support (no separate transpilation step)
  • ⚡ Type-safe queries with Drizzle

Configuration

System Prompt

Edit src/lib/ai/prompt.ts

Embedding Settings

Edit src/lib/embeddings/chunker.ts

Current settings:

  • Model: text-embedding-3-large (3072 dimensions)
  • Chunk size: 24,000 characters (~6,000 tokens)
  • Overlap: 1,200 characters (~300 tokens)
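
To show how chunk size and overlap interact, here is a minimal character-based chunker using the numbers above. It is a sketch, not the contents of chunker.ts: the chunkText function is hypothetical, and the real chunker may split on different boundaries.

```typescript
const CHUNK_SIZE = 24000; // characters, from the settings above
const CHUNK_OVERLAP = 1200; // characters shared between neighbouring chunks

// Split text into fixed-size windows where each window starts
// (size - overlap) characters after the previous one.
function chunkText(
  text: string,
  size: number = CHUNK_SIZE,
  overlap: number = CHUNK_OVERLAP,
): string[] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error("require size > 0 and 0 <= overlap < size");
  }
  const chunks: string[] = [];
  const step = size - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last window reached the end
  }
  return chunks;
}
```

With the default settings each new chunk advances 22,800 characters, so a 50,000-character file yields three chunks. The overlap keeps code that straddles a boundary visible in both neighbouring chunks.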

Database Schema

Edit src/lib/db/schema.ts, then run bun run db:generate and bun run db:migrate to create and apply the migration.

Troubleshooting

Bot offline in Discord

  • Enable MESSAGE CONTENT INTENT in Discord Dev Portal
  • Verify DISCORD_TOKEN is correct

Database errors

# Check PostgreSQL
docker ps | grep postgres

# Open Drizzle Studio
bun run db:studio

Regenerate migrations

bun run db:generate
bun run db:migrate

License

MIT
