A modern, enterprise-grade AI content generation platform featuring multi-framework support, MCP (Model Context Protocol) integration, comprehensive cost tracking, and full-stack web interface. Built with FastAPI, React, and designed for production deployment.
- Multi-Framework AI Generation: Three distinct AI frameworks for different use cases
- Semantic Kernel: User-friendly, encouraging content creation
- LangChain: Structured, analytical text processing
- LangGraph: Complex multi-step workflows with iterative refinement
- Dual Interface Support: Story generation and conversational chat with context management
- Universal Provider Compatibility: Works with any OpenAI-compatible API (Azure OpenAI, OpenRouter, Ollama, custom endpoints)
- Embedded MCP Server: Expose story generation as MCP tools for integration with Claude Desktop and other MCP clients
- All Framework Access: Each AI framework available as separate MCP tool
- HTTP-based Protocol: Easy integration at the /mcp endpoint
- React Frontend (/frontendReact/): Modern SPA with TypeScript, Vite, Tailwind CSS
- Real-time data fetching with React Query
- Comprehensive TypeScript support
- Mobile-responsive design
- Advanced form handling and validation
- Dual Frontend Architecture: Choose between React for complex UIs or simple HTML for lightweight deployments
- Context Management: Upload files and execute prompts with contextual understanding
- Comprehensive Cost Tracking:
- Individual transaction monitoring
- Token usage analytics
- Cost breakdowns by framework and model
- Detailed performance metrics
- Chat Management:
- Persistent conversation history
- Framework switching mid-conversation
- Conversation search and management
- Story Management:
- Full story history and search
- Character-based story filtering
- Export capabilities (copy, download)
- Story preview and full view modes
- Modern Package Management: UV for 10-100x faster dependency installation and resolution
- Structured Logging: JSON-formatted logs with request tracking and performance metrics
- Web-based Log Viewer: Real-time log monitoring with filtering and search
- Database Flexibility: SQLite for development, PostgreSQL-ready for production
- Docker Support: Multi-stage builds with separate frontend/backend containers
- Health Monitoring: Comprehensive health checks and error tracking
- Security: Input validation, CORS configuration, rate limiting, secure API key management
- Comprehensive Testing: 264 total tests with 34/34 unit tests passing, full pytest integration
- Rate Limiting: SlowAPI-based rate limiting with per-endpoint controls (60 req/min per IP); a configuration sketch follows this list
- Container Ready: Fully functional Docker setup with multi-service deployment
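As a rough illustration of the SlowAPI-based rate limiting mentioned above, the sketch below shows how a 60 req/min per-IP default and a per-endpoint override can be wired into FastAPI. The route and limits are illustrative assumptions, not the project's actual configuration:

```python
from fastapi import FastAPI, Request
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded
from slowapi.util import get_remote_address

# Key requests by client IP so the 60 req/min budget applies per caller
limiter = Limiter(key_func=get_remote_address, default_limits=["60/minute"])

app = FastAPI()
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

@app.post("/api/langchain")
@limiter.limit("10/minute")  # tighter per-endpoint override for expensive generation calls
async def generate_story(request: Request) -> dict:
    # `request` must appear in the signature so SlowAPI can resolve the client address
    return {"status": "ok"}
```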
- ✅ Environment variable validation at startup
- ✅ CORS middleware with configurable origins
- ✅ Input sanitization and validation
- ✅ Request size limits
- ✅ Connection pooling for API clients
- ✅ Lazy service initialization
- ✅ Async/await throughout
- ✅ Request timeout handling
- ✅ Retry logic with exponential backoff
- ✅ Comprehensive error handling
- ✅ Structured logging with request IDs
- ✅ Health check endpoint
- UV Package Manager (primary) or pip (fallback support)
- Python 3.11+ (recommended)
- FastAPI 0.109.0+
- Pydantic V2 (2.5.3+)
- SQLAlchemy with Alembic for database migrations
- FastMCP for Model Context Protocol integration
- 164+ packages automatically managed by UV
- Node.js 18+
- npm or yarn
- Modern browser with ES2020+ support
- Clone the repository
- Copy .env.example to .env:
cp .env.example .env
- Configure your provider:
# Your API key for authentication
PROVIDER_API_KEY=your-api-key
# Base URL for the API (must be OpenAI-compatible)
PROVIDER_API_BASE_URL=https://api.your-provider.com/v1
# Model name/identifier
PROVIDER_MODEL=your-model-name
# Display name for your provider
PROVIDER_NAME=Your Provider Name
Example configurations:
For Ollama (local):
PROVIDER_API_BASE_URL=http://localhost:11434/v1
PROVIDER_MODEL=llama2
PROVIDER_NAME=Ollama Local
For Tachyon LLM:
PROVIDER_API_BASE_URL=https://api.tachyon.ai/v1
PROVIDER_MODEL=tachyon-fast
PROVIDER_NAME=Tachyon LLM
- Install dependencies:
# UV is now the primary package manager for this project
# Create virtual environment with Python 3.11
uv venv --python 3.11
# Activate the environment
source .venv/Scripts/activate # Windows
# OR
source .venv/bin/activate # macOS/Linux
# Install all dependencies (main + dev + test)
uv pip install -e ".[dev,test]"
# Generate lock file for reproducible builds
uv lock
# Traditional method (slower)
pip install -r requirements.txt
Why UV?
- ⚡ 10-100x faster than pip
- 🔒 Better dependency resolution
- 📦 Built-in virtual environment management
- 🔄 Reproducible builds with lock files
- 🚀 Modern Python package management
See UV_SETUP.md for detailed UV usage instructions.
- Run the application:
# Full application (backend + embedded MCP server)
python backend/main.py
# For React frontend development (separate terminals)
# Terminal 1: Backend
python backend/main.py
# Terminal 2: React Frontend
cd frontendReact
npm install
npm run dev
# Primary deployment method - multi-service architecture
docker-compose up --build
# Clean build from scratch
docker-compose down --volumes && docker-compose up --build
# Check container status
docker-compose ps
# View logs
docker-compose logs ai-content-platform # Backend logs
docker-compose logs react-frontend # Frontend logs
# Manual builds
docker build -t ai-content-platform . # Backend
cd frontendReact && docker build -t react-frontend .  # Frontend
Services:
- Backend: http://localhost:8000 (FastAPI + MCP Server)
- Frontend: http://localhost:3000 (React SPA)
- API Docs: http://localhost:8000/docs
All configuration is done via environment variables. See .env.example for available options:
- PROVIDER_NAME: Provider identifier (e.g., "openai", "openrouter", "custom")
- PROVIDER_API_KEY: Your API key for authentication
- PROVIDER_API_BASE_URL: Base URL for API calls (must be OpenAI-compatible)
- PROVIDER_MODEL: Model identifier (e.g., 'gpt-4', 'llama2')
- PROVIDER_API_TYPE: API compatibility type (default: "openai")
- PROVIDER_HEADERS: Additional HTTP headers as JSON (optional)
- CUSTOM_VAR: Custom string variable for provider-specific data
- Headers are automatically generated based on provider type:
- OpenRouter: Empty headers (auth handled by AsyncOpenAI client)
- Custom: Extended headers with app info, debug flags, and CUSTOM_VAR
- Other providers: Minimal headers with app identification
- See services/base_ai_service.py for implementation details
- DEBUG_MODE: Enable debug logging and docs (default: false)
- CORS_ORIGINS: Allowed CORS origins (default: ["http://localhost:8000"])
- MAX_REQUEST_SIZE: Maximum request size in bytes (default: 1048576)
- LOG_FILE_PATH: Path to log file (default: "logs/app.log")
- LOG_LEVEL: Logging level - DEBUG, INFO, WARNING, ERROR (default: "INFO")
- LOG_ROTATION_HOURS: Hours between log file rotation (default: 1)
- LOG_RETENTION_DAYS: Days to retain old log files (default: 7)
- GET /health: Health check endpoint
- GET /api/provider: Get current AI provider information
- POST /api/semantic-kernel: Generate story using Semantic Kernel
- POST /api/langchain: Generate story using LangChain
- POST /api/langgraph: Generate story using LangGraph
- GET /api/stories: List all stories with pagination
- GET /api/stories/{id}: Get specific story
- GET /api/stories/search/characters: Search stories by character
- DELETE /api/stories: Delete all stories
- POST /api/chat/semantic-kernel: Chat using Semantic Kernel
- POST /api/chat/langchain: Chat using LangChain
- POST /api/chat/langgraph: Chat using LangGraph
- GET /api/chat/conversations: List chat conversations
- GET /api/chat/conversations/{id}: Get specific conversation
- DELETE /api/chat/conversations/{id}: Delete conversation
- DELETE /api/chat/conversations: Delete all conversations
- GET /api/cost/usage: Get usage summaries by date range
- GET /api/cost/transactions: Get individual transaction details
- DELETE /api/cost/usage: Clear all usage data
- POST /api/context/upload: Upload context files
- GET /api/context/files: List uploaded files
- DELETE /api/context/files/{id}: Delete context file
- POST /api/context/execute: Execute prompt with context
- GET /api/context/executions: Get execution history
- GET /mcp: MCP server endpoint for tool discovery
- POST /mcp: MCP tool execution
- GET /logs: Web-based log viewer interface
- GET /logs/files: List available log files
- GET /logs/entries/{file_path}: Get paginated log entries
Example request:
{
  "primary_character": "Santa Claus",
  "secondary_character": "Rudolph"
}
Example response:
{
  "story": "Generated story content...",
  "combined_characters": "Santa Claus and Rudolph",
  "method": "LangChain",
  "generation_time_ms": 1234.56,
  "request_id": "uuid-here"
}
All test scripts are now organized in the test/ directory. See test/README.md for detailed test documentation.
# Working Unit Tests (34/34 passing)
pytest tests/unit/test_config.py tests/unit/test_infrastructure.py tests/test_prompts.py -v
# MCP Tests (All frameworks working)
python test/test_mcp_client.py # Comprehensive MCP testing with object extraction
python test/test_mcp_working.py # Basic MCP functionality test (3 frameworks)
# Logging Tests (Full request tracking)
python test/test_enhanced_logging.py # Test enhanced logging system
# Rate Limiting Tests (All working)
python test/test_rate_limiting_simple.py # Simple rate limiting test (verified working)
# Full pytest suite (264 tests - some API route mismatches)
pytest # Run all tests
pytest tests/unit                # Run only unit tests (all passing)
- ✅ Unit Tests: 34/34 passing (config, infrastructure, prompts)
- ✅ MCP Tests: All 3 frameworks working with real API calls
- ✅ Logging Tests: Full request tracking and cost calculation
- ✅ Rate Limiting: Working (60 req/min per IP, endpoint-specific limits)
- ⚠️ API Tests: Some failures due to route mismatches (functionality works)
# Run working tests only
pytest tests/unit/test_config.py tests/unit/test_infrastructure.py tests/test_prompts.py
# Run all tests (some expected failures)
pytest
# Run with coverage
pytest --cov=. tests/unit/
The platform supports a modern React frontend alongside the FastAPI backend:
- Modern Stack: React 18, TypeScript, Vite, Tailwind CSS
- Real-time Updates: React Query for efficient data fetching and caching
- Responsive Design: Mobile-first design principles
- Type Safety: Full TypeScript support with comprehensive type definitions
- Advanced UI: Form validation, loading states, toast notifications
- Performance: Code splitting, optimistic updates, and intelligent caching
# Development (separate terminals)
# Backend
python backend/main.py
# React Frontend
cd frontendReact && npm run dev
# Production with Docker
docker-compose -f docker-compose.separated.yml up --build
- React Frontend: http://localhost:3001
- Backend API: http://localhost:8000
- API Documentation: http://localhost:8000/docs
- MCP Server: http://localhost:8000/mcp
The application uses SQLAlchemy with Alembic for database migrations:
# Apply migrations
alembic upgrade head
# Create new migration
alembic revision --autogenerate -m "Description"
# Check migration status
alembic current
- StoryDB: Stores generated stories with metadata
- ChatConversation: Manages chat conversations
- ChatMessage: Individual chat messages
- CostUsage: Tracks API usage and costs
- ContextFile: Uploaded context files
- ContextPromptExecution: Context execution history
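As an illustration of the SQLAlchemy layer, a hypothetical model definition along the lines of the tables above. The column names and types are assumptions for illustration, not the project's actual schema:

```python
from datetime import datetime

from sqlalchemy import DateTime, Float, Integer, String, Text
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column

class Base(DeclarativeBase):
    pass

class StoryDB(Base):
    """Hypothetical shape of the stories table; real columns may differ."""
    __tablename__ = "stories"

    id: Mapped[int] = mapped_column(Integer, primary_key=True)
    primary_character: Mapped[str] = mapped_column(String(100))
    secondary_character: Mapped[str] = mapped_column(String(100))
    story: Mapped[str] = mapped_column(Text)
    method: Mapped[str] = mapped_column(String(50))  # e.g. "LangChain"
    generation_time_ms: Mapped[float] = mapped_column(Float)
    created_at: Mapped[datetime] = mapped_column(DateTime, default=datetime.utcnow)
```

Alembic migrations (shown above) would then be generated against metadata like this via `alembic revision --autogenerate`.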
# Check setup and dependencies
python check-setup.py
# Run development server with auto-reload
python backend/main.py
# Run all services with Docker
docker-compose -f docker-compose.separated.yml up --build
# View logs in real-time
python test_logging.py # Generate test logs
# Then visit http://localhost:8000/logs
AI Story Generator Platform
├── Backend (FastAPI)
│ ├── Story Generation Services
│ │ ├── Semantic Kernel Service
│ │ ├── LangChain Service
│ │ └── LangGraph Service
│ ├── Chat System
│ ├── Context Management
│ ├── Cost Tracking
│ ├── MCP Server Integration
│ └── Database Layer (SQLAlchemy)
├── Frontend (React + TypeScript)
│ ├── Story Generator Interface
│ ├── Chat Interface
│ ├── Story History & Search
│ ├── Cost Tracking Dashboard
│ └── Context Management UI
├── MCP Integration
│ ├── Story Generation Tools
│ ├── Framework Comparison
│ └── Claude Desktop Compatible
└── Infrastructure
├── Docker Containers
├── Database (SQLite/PostgreSQL)
├── Logging System
└── Health Monitoring
The application includes comprehensive structured logging with JSON output:
- Request start/completion with duration and token usage
- API calls with retry attempts and response times
- Error tracking with stack traces and context
- Token usage tracking for cost monitoring
Logs are written to both console and rotating files:
- Console: Human-readable format in development
- Files: JSON format in logs/app.log (configurable)
- Rotation: Hourly rotation with configurable retention
- Format: Each log entry includes timestamp, level, service, request_id, and structured data
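A sketch of how such a JSON entry can be produced with Python's standard logging and a custom formatter. The formatter and field values below are illustrative; the project's actual logging setup may differ:

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Minimal JSON formatter mirroring the fields described above."""
    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "service": getattr(record, "service", "backend"),
            "request_id": getattr(record, "request_id", None),
            "message": record.getMessage(),
        }
        return json.dumps(entry)

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("ai-content-platform")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# Attach structured context via `extra`; this prints a single JSON line
logger.info("story generated", extra={"request_id": "uuid-here", "service": "langchain"})
```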
- DEBUG: Detailed debugging information
- INFO: General application events, API calls, token usage
- WARNING: Retry attempts, rate limiting
- ERROR: Application errors with full context
The application includes a comprehensive web-based log viewer accessible at /logs:
Features:
- Real-time log viewing with auto-refresh (30s intervals)
- Advanced filtering by log level, search terms, and file selection
- Pagination support (25/50/100/200 entries per page)
- Detailed log entry inspection with expandable JSON view
- Performance metrics display (response times, token usage, etc.)
- Security event monitoring (suspicious input detection, validation failures)
- Mobile-responsive design with Bootstrap 5
Usage:
- Navigate to http://localhost:8000/logs
- Select a log file from the dropdown
- Use filters to narrow down entries
- Click the eye icon to expand full log details
- Auto-refresh keeps logs current
Test Log Generation:
python3 test_logging.py
Custom exceptions with specific error codes:
- ValidationError: Input validation failures
- APIKeyError: Missing or invalid API credentials
- APIConnectionError: Failed API connections
- APIRateLimitError: Rate limit exceeded
- TimeoutError: Operation timeouts
- API calls timeout after 60 seconds
- Maximum 3 retry attempts with exponential backoff
- Connection pooling limits: 10 connections, 5 keep-alive
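A minimal sketch of how these limits can be expressed with httpx. The retry helper and endpoint used below are illustrative assumptions; the actual client setup lives in the service layer:

```python
import asyncio
import httpx

# Pooling and timeout limits matching the values described above
limits = httpx.Limits(max_connections=10, max_keepalive_connections=5)
timeout = httpx.Timeout(60.0)

async def call_with_retries(client: httpx.AsyncClient, url: str, payload: dict) -> httpx.Response:
    """Retry up to 3 times with exponential backoff (1s, 2s, 4s)."""
    last_error: Exception | None = None
    for attempt in range(3):
        try:
            response = await client.post(url, json=payload)
            response.raise_for_status()
            return response
        except (httpx.TimeoutException, httpx.HTTPStatusError) as exc:
            last_error = exc
            await asyncio.sleep(2 ** attempt)
    raise last_error

async def main() -> None:
    async with httpx.AsyncClient(limits=limits, timeout=timeout) as client:
        resp = await call_with_retries(
            client,
            "http://localhost:8000/api/langchain",
            {"primary_character": "Ada", "secondary_character": "Grace"},
        )
        print(resp.json())
```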
- All user input is sanitized and validated
- Character names limited to 100 characters
- HTML/script injection prevention
- CORS configured for specific origins only
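A sketch of how the character-name length limit and basic injection screening might be expressed with a Pydantic V2 model. Field names match the API examples above; the validator itself is illustrative, not the project's exact rules:

```python
import re

from pydantic import BaseModel, Field, field_validator

class StoryRequest(BaseModel):
    """Illustrative request model enforcing the validation rules above."""
    primary_character: str = Field(min_length=1, max_length=100)
    secondary_character: str = Field(min_length=1, max_length=100)

    @field_validator("primary_character", "secondary_character")
    @classmethod
    def reject_markup(cls, value: str) -> str:
        # Basic HTML/script-injection screening; the real checks may be stricter
        if re.search(r"<\s*\w+|javascript:", value, flags=re.IGNORECASE):
            raise ValueError("character names must not contain markup")
        return value.strip()
```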
The platform includes a built-in MCP (Model Context Protocol) server that exposes story generation capabilities as tools:
generate_story_semantic_kernel: Generate stories using Semantic Kernelgenerate_story_langchain: Generate stories using LangChaingenerate_story_langgraph: Generate stories using LangGraph
Note: The compare_frameworks tool has been removed. See FRAMEWORK_COMPARISON.md for details about the removed functionality and alternative implementation approaches.
- Endpoint: http://localhost:8000/mcp (primary) or http://localhost:9999/mcp (fallback)
- Protocol: HTTP-based MCP implementation using FastMCP
- Integration: Mounted as ASGI sub-application via the http_app() method
- Auto-start: Integrated with main application
- Logging: Full structured logging with console output
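A rough sketch of mounting a FastMCP tool server into FastAPI via http_app(), as described above. The tool body and exact wiring are illustrative, not the project's actual code:

```python
from fastapi import FastAPI
from fastmcp import FastMCP

mcp = FastMCP("ai-story-generator")

@mcp.tool()
async def generate_story_langchain(primary_character: str, secondary_character: str) -> str:
    """Illustrative MCP tool; the real tool delegates to the LangChain service."""
    return f"A story about {primary_character} and {secondary_character}..."

# http_app() returns an ASGI app; reuse its lifespan so MCP sessions start correctly
mcp_app = mcp.http_app(path="/")
app = FastAPI(lifespan=mcp_app.lifespan)
app.mount("/mcp", mcp_app)
```

For Claude Desktop, the configuration block below registers this endpoint as an MCP server.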
{
"mcpServers": {
"ai-story-generator": {
"command": "http",
"args": ["http://localhost:8000/mcp"]
}
}
}
- Pros: Enterprise support, SLA guarantees, data privacy
- Cons: Requires Azure account, region-specific
- Best for: Enterprise deployments, regulated industries
- Pros: Access to multiple models, easy setup, pay-per-use
- Cons: Third-party service, usage-based pricing
- Best for: Development, testing, multi-model comparison
- Pros: Use any OpenAI-compatible API, self-hosted options, specialized models
- Cons: Requires manual configuration, limited to OpenAI-compatible APIs
- Best for: Custom deployments, specialized models, local LLMs
To switch to a different provider:
- Copy the OpenRouter example configuration:
cp .env.openrouter.example .env
- Add your OpenRouter API key (get one at https://openrouter.ai/keys)
- Choose your preferred model (see https://openrouter.ai/models)
- Restart the application
- Copy the custom provider example configuration:
cp .env.custom.example .env
- Configure your provider details:
- Set CUSTOM_API_KEY to your API key
- Set CUSTOM_API_BASE_URL to your provider's endpoint
- Set CUSTOM_MODEL to your desired model
- Set CUSTOM_PROVIDER_NAME to the display name (e.g., "Tachyon LLM")
- Restart the application
The application will automatically use the configured provider without any code changes.
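This works because every provider is reached through the same OpenAI-compatible client, so only the environment variables change. A sketch of the idea, using the env vars listed above (the client setup itself is illustrative):

```python
import os

from openai import AsyncOpenAI

# One client covers OpenAI, OpenRouter, Ollama, or any compatible endpoint:
# only the env vars change, never the calling code.
client = AsyncOpenAI(
    api_key=os.environ["PROVIDER_API_KEY"],
    base_url=os.environ["PROVIDER_API_BASE_URL"],
)

async def complete(prompt: str) -> str:
    response = await client.chat.completions.create(
        model=os.environ["PROVIDER_MODEL"],
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```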
Any OpenAI-compatible API should work, including:
- Tachyon LLM - High-performance LLM service
- Ollama - Run LLMs locally
- LM Studio - Local LLM server
- vLLM - High-throughput LLM serving
- FastChat - Multi-model serving
- LocalAI - OpenAI compatible local API
- Text Generation Inference - Hugging Face's LLM server
The application uses a modular prompt system where prompts are stored in separate .txt files for easy editing:
prompts/
├── langchain/
│ ├── langchain_system_prompt.txt # LangChain system prompt
│ └── langchain_user_prompt_template.txt # LangChain user template
├── langgraph/
│ ├── langgraph_storyteller_system_prompt.txt # LangGraph storyteller prompt
│ ├── langgraph_initial_story_template.txt # LangGraph initial story template
│ ├── langgraph_editor_system_prompt.txt # LangGraph editor prompt
│ └── langgraph_enhancement_template.txt # LangGraph enhancement template
└── semantic_kernel/
├── semantic_kernel_system_prompt.txt # Semantic Kernel system prompt
└── semantic_kernel_user_message_template.txt # Semantic Kernel user template
To customize the story generation behavior:
- Navigate to the appropriate subdirectory in prompts/ (langchain, langgraph, or semantic_kernel)
- Edit the relevant .txt file for the framework you want to customize
- Maintain the template variables like {primary_character} and {secondary_character}
- Restart the application to load the new prompts
All user prompt templates support these variables:
- {primary_character} - The first character name
- {secondary_character} - The second character name
- {story} (LangGraph only) - The initial story for enhancement
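A small sketch of loading one of these .txt templates and substituting the variables. The helper function is hypothetical; the file path comes from the prompts/ layout shown above:

```python
from pathlib import Path

def render_prompt(template_path: str, **variables: str) -> str:
    """Load a .txt prompt template and fill in its {placeholder} variables."""
    template = Path(template_path).read_text(encoding="utf-8")
    return template.format(**variables)

prompt = render_prompt(
    "prompts/langchain/langchain_user_prompt_template.txt",
    primary_character="Santa Claus",
    secondary_character="Rudolph",
)
print(prompt)
```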