bin-bard/alpha-content-engine

Alpha Content Engine

Overview

Enterprise-grade content pipeline that transforms web documentation into intelligent AI assistants through automated scraping, processing, and deployment.

Core Achievements:

  • 30+ Articles Processed: Zendesk API → Clean Markdown with preserved structure
  • Zero-UI Deployment: 100% programmatic OpenAI API integration
  • Smart Delta Detection: Hash-based change tracking for efficient updates
  • Production Ready: Dockerized with daily automation and comprehensive logging
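Hash-based delta detection can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function names, the article-id keys, and the JSON state file are all assumptions made for the example. The idea is to hash each article's Markdown, compare against the hashes saved by the previous run, and upload only what is new or changed.

```python
import hashlib
import json
from pathlib import Path


def content_hash(markdown: str) -> str:
    """Return a stable SHA-256 hex digest of an article's Markdown body."""
    return hashlib.sha256(markdown.encode("utf-8")).hexdigest()


def detect_changes(
    articles: dict[str, str], previous: dict[str, str]
) -> tuple[list[str], list[str], list[str], dict[str, str]]:
    """Classify articles as added, updated, or unchanged.

    `articles` maps a (hypothetical) article id to its Markdown text;
    `previous` maps article id to the hash recorded on the last run.
    Returns the three id lists plus the fresh hash map to persist.
    """
    added, updated, unchanged = [], [], []
    current: dict[str, str] = {}
    for article_id, markdown in articles.items():
        digest = content_hash(markdown)
        current[article_id] = digest
        if article_id not in previous:
            added.append(article_id)
        elif previous[article_id] != digest:
            updated.append(article_id)
        else:
            unchanged.append(article_id)
    return added, updated, unchanged, current


def load_state(path: Path) -> dict[str, str]:
    """Load hashes from the previous run; an absent file means a first run."""
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def save_state(path: Path, hashes: dict[str, str]) -> None:
    """Persist the current hashes for the next run to compare against."""
    path.write_text(json.dumps(hashes, indent=2), encoding="utf-8")
```

Only the ids in `added` and `updated` would need re-uploading; on a first run `previous` is empty, so every article counts as added.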

Technical Excellence:

  • API-First Architecture: No manual uploads, fully automated vector store management
  • Intelligent Chunking: File-based strategy optimized for accurate citations
  • Cost-Effective Scaling: GitHub Actions over expensive cloud platforms
  • Security Best Practices: Environment-based secrets, no hardcoded keys
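As a sketch of the environment-based secrets point, the key can be read from the process environment and rejected early if it is missing or still set to the template placeholder. The helper name `get_api_key` is hypothetical, and how the project itself loads `.env` into the environment is not shown in this README.

```python
import os
from collections.abc import Mapping

# Placeholder text from the .env template; treated as "not configured".
PLACEHOLDER = "your-actual-openai-api-key"


def get_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Read OPENAI_API_KEY from the environment and fail fast if it is unusable."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key or key == PLACEHOLDER:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; copy .env.sample to .env and fill it in"
        )
    return key
```

Passing the mapping as a parameter keeps the function easy to test without touching the real environment.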

System Architecture

[Figure: system architecture diagram]

Screenshots

OpenAI Playground Answer - Strict Compliance Achieved

GPT-4o: Strict adherence to "Only answer using uploaded docs", with proper citations.

Comparison: GPT-3.5-turbo response showing hallucination, in contrast to the compliant behavior.

GitHub Actions Deployment - Zero-Cost Automation

Production Pipeline: Automated daily execution with assistant reuse and delta tracking.

Docker Local Testing - One-Command Deployment

Container Success: Clean run showing scraping, processing, and a final exit code of 0.

Quick Start

Setup:

# Windows: Copy template file and rename
copy .env.sample .env
# Edit .env file and replace with your actual API key:
# OPENAI_API_KEY=your-actual-openai-api-key

# Linux/Mac alternative:
# cp .env.sample .env

How to run locally:

# Install dependencies and run
pip install -r requirements.txt
python main.py

# Build and run Docker container (exits 0 as required)
docker build -t alpha-content-engine .
docker run -e OPENAI_API_KEY=your-api-key alpha-content-engine

Link to daily job logs:

All GitHub Actions Logs - Public repository with complete run history, logs, and downloadable artifacts including config files and scraper logs.

Note: The first run shows an "artifact not found" warning. This is expected, because no config from a previous run exists yet. Assistant reuse starts from the second run onward.
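The reuse behavior described in the note could look something like the sketch below. It assumes the saved config is a JSON file with an `assistant_id` field and that assistant creation is passed in as a callable; both are illustrative assumptions, not details confirmed by this README.

```python
import json
from pathlib import Path
from typing import Callable


def get_or_create_assistant(
    config_path: Path, create: Callable[[], str]
) -> tuple[str, bool]:
    """Return (assistant_id, reused).

    If a config from a previous run exists and holds an id, reuse it.
    Otherwise call `create` (hypothetical: would call the OpenAI API)
    and save the new id so that the next run can reuse it.
    """
    if config_path.exists():
        try:
            saved = json.loads(config_path.read_text(encoding="utf-8"))
        except json.JSONDecodeError:
            saved = {}  # Treat a corrupt config like a missing one.
        if isinstance(saved, dict) and saved.get("assistant_id"):
            return saved["assistant_id"], True

    assistant_id = create()
    config_path.write_text(
        json.dumps({"assistant_id": assistant_id}), encoding="utf-8"
    )
    return assistant_id, False
```

On the first run the file is absent, so `create` is called once; later runs find the saved id and skip creation.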

Chunking Strategy

File-based chunking: Each article = 1 file uploaded to OpenAI Vector Store

Benefits:

  • Preserves article structure for better context
  • Maintains URLs for accurate citations
  • Simple & reliable for support use cases
  • Optimal for retrieval accuracy

Process: HTML → Clean Markdown → Metadata footer → API upload
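The "Metadata footer" and one-file-per-article steps can be sketched as below. The footer layout, the file naming scheme, and the helper names are assumptions made for illustration; the HTML-to-Markdown conversion and the API upload are left out.

```python
import re
from pathlib import Path


def slugify(title: str) -> str:
    """Turn an article title into a filesystem-safe, lowercase slug."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug or "untitled"


def with_metadata_footer(markdown: str, url: str, article_id: int) -> str:
    """Append a footer carrying the source URL so answers can cite it.

    The blank line before '---' keeps it a horizontal rule rather than
    turning the preceding paragraph into a heading.
    """
    footer = f"---\nArticle URL: {url}\nArticle ID: {article_id}\n"
    return markdown.rstrip() + "\n\n" + footer


def write_article_file(
    out_dir: Path, title: str, markdown: str, url: str, article_id: int
) -> Path:
    """Write one Markdown file per article, ready to upload as a single unit."""
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{article_id}-{slugify(title)}.md"
    path.write_text(
        with_metadata_footer(markdown, url, article_id), encoding="utf-8"
    )
    return path
```

Keeping the URL inside each file means the retrieved text carries its own citation target, which is the point of the file-based strategy.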

Logged: The number of files embedded in the vector store and the number of chunks processed, for full transparency.
