# 🎬 Automated Short-Form Video Generation Platform
An intelligent content creation system that automatically generates engaging short-form videos from Reddit stories, complete with AI-powered voiceovers, captions, and professional editing.
## Table of Contents

- [Overview](#overview)
- [Features](#features)
- [Architecture](#architecture)
- [Installation](#installation)
- [Configuration](#configuration)
- [Usage](#usage)
- [Project Structure](#project-structure)
- [API Integrations](#api-integrations)
- [Contributing](#contributing)
- [License](#license)
## Overview

ShortGen is a comprehensive automation platform that streamlines the creation of short-form video content for platforms like YouTube Shorts, TikTok, and Instagram Reels. The system sources content from Reddit, processes it through AI-powered workflows, and produces professionally edited videos ready for upload.

- **Automated Content Sourcing**: Fetches and processes Reddit stories and threads
- **AI Script Generation**: Uses OpenAI GPT models to create engaging video scripts
- **High-Quality Voice Synthesis**: Supports both ElevenLabs (premium) and Edge TTS (free)
- **Professional Video Editing**: Automated editing pipeline with captions, background videos, and music
- **Multi-Channel Management**: Manage multiple YouTube channels simultaneously
- **Scheduled Publishing**: Built-in scheduler for automated video posting
- **GUI Interface**: User-friendly desktop application for easy management
## Features

- 🤖 **AI-Powered Script Generation**: Leverages GPT models to transform Reddit content into engaging narratives
- 🎙️ **Dual TTS Systems**:
  - ElevenLabs for premium, natural-sounding voices
  - Microsoft Edge TTS for free, high-quality synthesis
- 🌍 **Multi-Language Support**: Generate content in multiple languages with proper voice mapping
- 📝 **Automatic Captioning**: Synchronized captions with customizable styles
- 🎬 **Modular Editing Framework**: JSON-based editing pipeline for customizable workflows
- 🎨 **Visual Enhancement**:
  - Background video integration
  - Reddit screenshot overlays
  - Custom watermarks and branding
  - Subscribe animations
- 🎵 **Audio Processing**:
  - Background music integration
  - Voice-over synchronization
  - Audio level balancing
- 📅 **Smart Scheduling**: Configure posting schedules for multiple channels
- 📊 **Video Tracking**: Database-driven system to track generated and posted videos
- 🔄 **Resume Capability**: Continue interrupted video generation from the last checkpoint
- 📤 **YouTube Integration**: Direct upload to YouTube with metadata generation
- 🖥️ **Desktop GUI**: Full-featured tkinter application with:
  - Real-time logging and progress monitoring
  - Channel-specific configuration
  - Schedule management
  - System tray integration
  - Windows toast notifications
## Architecture

ShortGen follows a modular architecture with clearly separated concerns:
```
┌─────────────────────────────────────────────────┐
│              User Interface Layer               │
│    (GUI Application, CLI Tools, Schedulers)     │
└─────────────────┬───────────────────────────────┘
                  │
┌─────────────────▼───────────────────────────────┐
│              Content Engine Layer               │
│  • Reddit Content Fetcher                       │
│  • GPT Script Generator                         │
│  • Voice Module Orchestrator                    │
└─────────────────┬───────────────────────────────┘
                  │
┌─────────────────▼───────────────────────────────┐
│             Editing Framework Layer             │
│  • Core Editing Engine                          │
│  • Step-based Processing Pipeline               │
│  • Asset Management System                      │
└─────────────────┬───────────────────────────────┘
                  │
┌─────────────────▼───────────────────────────────┐
│              Output & Distribution              │
│  • Video Renderer                               │
│  • YouTube Publisher                            │
│  • Storage Manager                              │
└─────────────────────────────────────────────────┘
```
### Core Components

1. **Reddit Short Engine** (`reddit_short_engine.py`)
   - Orchestrates the entire video generation pipeline
   - Manages state persistence and checkpoint recovery
   - Handles asset preparation and video rendering

2. **Editing Framework** (`editing_framework/`)
   - JSON-driven editing steps and flows
   - Modular architecture for easy customization
   - Support for complex editing operations

3. **Voice Modules** (`audio/`)
   - Abstract voice interface for multiple TTS providers
   - ElevenLabs and Edge TTS implementations
   - Voice selection based on language and gender

4. **GPT Utilities** (`gpt/`)
   - OpenAI API integration
   - Prompt template system
   - Content generation and filtering
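The JSON-driven design can be sketched in a few lines: a flow (as defined in `flows/`) is a named, ordered list of step names, and the engine dispatches each name to a handler. The handlers below are illustrative stand-ins; in ShortGen the steps themselves are JSON definitions interpreted by the core editing engine.

```python
import json

# Stand-in handlers for two of the documented steps. In ShortGen these
# would perform real video operations via the editing engine.
def add_background_video(state):
    state["layers"].append("background_video")
    return state

def make_caption(state):
    state["layers"].append("caption")
    return state

STEP_HANDLERS = {
    "add_background_video": add_background_video,
    "make_caption": make_caption,
}

def run_flow(flow_json, state):
    """Execute a flow definition: each step name is dispatched in order."""
    flow = json.loads(flow_json)
    for step_name in flow["steps"]:
        state = STEP_HANDLERS[step_name](state)
    return state

flow = '{"flow_name": "demo", "steps": ["add_background_video", "make_caption"]}'
result = run_flow(flow, {"layers": []})
# result["layers"] == ["background_video", "caption"]
```

The registry pattern is what makes the pipeline customizable: adding a step means registering one more handler, with no change to the flow runner.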
## Installation

### Prerequisites

- **Python**: Version 3.8 or higher
- **FFmpeg**: Required for video processing
- **ImageMagick**: Required for caption rendering
- **Operating System**: Windows (tested); Linux/macOS should work with minor adjustments
Clone the repository:

```bash
git clone https://github.com/yourusername/ShortGen.git
cd ShortGen
```

Create and activate a virtual environment:

```bash
python -m venv venv

# Windows
venv\Scripts\activate

# Linux/macOS
source venv/bin/activate
```

Install Python dependencies:

```bash
# Core dependencies
pip install -r requirements.txt

# GUI dependencies (if using the desktop app)
pip install -r gui_requirements.txt
```

Install system dependencies:

**Windows:**

```bash
# Using Chocolatey
choco install ffmpeg imagemagick
# Or download installers from the official websites
```

**Linux:**

```bash
sudo apt-get update
sudo apt-get install ffmpeg imagemagick
```

**macOS:**

```bash
brew install ffmpeg imagemagick
```

## Configuration

Create a `.env` file in the root directory:
```env
# OpenAI Configuration
OPENAI_API_KEY=your_openai_api_key_here

# ElevenLabs Configuration (optional, for premium voices)
ELEVEN_LABS_API_KEY=your_elevenlabs_api_key_here

# PlayHT Configuration (optional)
PLAY_HT_USERID=your_playht_userid_here
PLAY_HT_API_KEY=your_playht_api_key_here
```

### YouTube API Setup

For automated YouTube uploads:
1. **Create a Google Cloud Project**
   - Visit the Google Cloud Console
   - Create a new project
2. **Enable YouTube Data API v3**
   - Navigate to "APIs & Services" > "Library"
   - Search for "YouTube Data API v3"
   - Enable the API
3. **Create OAuth 2.0 Credentials**
   - Go to "APIs & Services" > "Credentials"
   - Create an OAuth 2.0 Client ID (Desktop application)
   - Download the credentials JSON file
4. **Configure ShortGen**
   - Rename the downloaded file to `client_secrets.json`
   - Place it in the ShortGen root directory
   - On first upload, you'll be prompted to authorize the application
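Before a first run, it can help to confirm that the required keys are actually visible to the process. A minimal, hypothetical sanity check (the variable names match the `.env` example above, but `check_env` itself is not part of ShortGen):

```python
import os

# OPENAI_API_KEY is the only strictly required key; the TTS keys are optional.
REQUIRED = ["OPENAI_API_KEY"]
OPTIONAL = ["ELEVEN_LABS_API_KEY", "PLAY_HT_USERID", "PLAY_HT_API_KEY"]

def check_env(required=REQUIRED):
    """Return the names of required environment variables that are missing or empty."""
    return [name for name in required if not os.environ.get(name)]
```

Running `check_env()` before launching a generation job surfaces a missing key immediately, rather than partway through a pipeline run.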
### Asset Setup

Place your assets in the `public/` directory:

```
public/
├── background_videos/   # Background footage for shorts
├── background_music/    # Royalty-free music tracks
├── fonts/               # Custom fonts for captions
└── watermarks/          # Channel watermarks/logos
```

### Configuration Files

- `public.yaml`: Asset configuration and paths
- `schedule.json`: Posting schedule configuration
- `database.json`: Video generation state database
- `abbreviations.json`: Text abbreviation mappings
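The state database is what powers the resume capability: each pipeline stage records a checkpoint, and an interrupted run restarts from the last completed stage. A hypothetical sketch of such a mechanism (the stage names and file layout are invented for illustration and are not ShortGen's actual schema):

```python
import json
from pathlib import Path

# Illustrative pipeline stages, in execution order.
STAGES = ["fetch_story", "generate_script", "synthesize_voice", "edit_video", "render"]

def load_state(path):
    """Load the checkpoint file, or start fresh if it doesn't exist."""
    if path.exists():
        return json.loads(path.read_text())
    return {"completed": []}

def save_checkpoint(path, state, stage):
    """Mark a stage as done and persist the state to disk."""
    state["completed"].append(stage)
    path.write_text(json.dumps(state, indent=2))

def remaining_stages(state):
    """Stages still to run, in pipeline order."""
    return [s for s in STAGES if s not in state["completed"]]
```

Under this scheme, a run that crashed after voice synthesis would resume with `remaining_stages` returning only the editing and rendering stages.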
## Usage

### Desktop GUI

Launch the desktop application:

```bash
# Windows
start_gui.bat

# Or directly
python gui_app.py
```

Features:
- Configure channels and API keys
- Monitor video generation in real-time
- Manage posting schedules
- View logs and system status
- System tray integration for background operation
### Command-Line Generation

```bash
python run.py
```

The CLI will prompt you for:
- Channel number
- Reddit link
- Language selection
- Voice provider (ElevenLabs or Edge TTS)
### Automated Scheduler

```bash
python launch.py
```

This starts the scheduler, which:
- Runs according to the `schedule.json` configuration
- Generates videos for configured channels
- Automatically posts to YouTube
- Logs all operations
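At its core, a scheduler like this has to decide when the next video is due. A small, self-contained sketch of that calculation (the "HH:MM" posting-time format is a guess; ShortGen's real `schedule.json` schema may differ):

```python
from datetime import datetime, timedelta

def next_post_time(now, daily_times):
    """Given 'HH:MM' posting times, return the next occurrence after `now`."""
    candidates = []
    for t in daily_times:
        hour, minute = map(int, t.split(":"))
        slot = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
        if slot <= now:
            slot += timedelta(days=1)  # already passed today; use tomorrow
        candidates.append(slot)
    return min(candidates)

now = datetime(2024, 5, 1, 13, 30)
print(next_post_time(now, ["09:00", "15:00", "21:00"]))  # 2024-05-01 15:00:00
```

A daemon then sleeps until the returned time, generates the video, posts it, and repeats the calculation.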
### Programmatic Usage

```python
from ShortGen.engine.reddit_short_engine import RedditShortEngine
from ShortGen.audio.edge_voice_module import EdgeTTSVoiceModule
from ShortGen.config.languages import Language

# Initialize voice module
voice_module = EdgeTTSVoiceModule(voice_name="en-US-AriaNeural")

# Create engine instance
engine = RedditShortEngine(
    voiceModule=voice_module,
    background_video_name="minecraft_parkour.mp4",
    background_music_name="lofi_beat.mp3",
    reddit_link="https://reddit.com/r/AskReddit/comments/xxxxx",
    language=Language.ENGLISH
)

# Generate video
engine.generate_short()
```

## Project Structure

```
ShortGen/
├── ShortGen/                          # Main package
│   ├── api_utils/                     # External API integrations
│   │   └── image_api.py               # Image generation APIs
│   ├── audio/                         # Voice synthesis modules
│   │   ├── audio_utils.py             # Audio processing utilities
│   │   ├── edge_voice_module.py       # Microsoft Edge TTS
│   │   ├── elevenlabs_voice_module.py # ElevenLabs TTS
│   │   └── voice_module.py            # Abstract voice interface
│   ├── config/                        # Configuration management
│   │   ├── api_key_manager.py         # API key handling
│   │   ├── config.py                  # Global configuration
│   │   ├── languages.py               # Language and voice mappings
│   │   └── path_utils.py              # Path resolution utilities
│   ├── editing_framework/             # Video editing system
│   │   ├── core_editing_engine.py     # Core editing logic
│   │   ├── editing_engine.py          # High-level editing API
│   │   ├── editing_steps/             # JSON editing step definitions
│   │   └── flows/                     # Predefined editing workflows
│   ├── editing_utils/                 # Editing helper functions
│   │   ├── captions.py                # Caption generation and styling
│   │   ├── editing_images.py          # Image processing
│   │   └── handle_videos.py           # Video manipulation
│   ├── engine/                        # Core generation engines
│   │   └── reddit_short_engine.py     # Reddit video generator
│   ├── gpt/                           # OpenAI GPT integrations
│   │   ├── gpt_utils.py               # GPT API utilities
│   │   ├── gpt_voice.py               # Voice selection AI
│   │   └── gpt_yt.py                  # YouTube metadata generation
│   ├── prompt_templates/              # GPT prompt templates (YAML)
│   └── reddit_content/                # Reddit API integration
│       └── reddit_story_api.py        # Reddit content fetcher
├── tools/                             # Utility scripts
│   └── download_and_crop.py           # Video preprocessing tool
├── public/                            # Static assets
│   ├── background_videos/
│   ├── background_music/
│   └── fonts/
├── gui_app.py                         # Desktop GUI application
├── run.py                             # CLI video generator
├── launch.py                          # Scheduler daemon
├── poster.py                          # YouTube upload utility
├── selector_engine.py                 # Video selection logic
├── requirements.txt                   # Python dependencies
├── gui_requirements.txt               # GUI-specific dependencies
└── README.md                          # This file
```
## API Integrations

### OpenAI

Used for:
- Script generation from Reddit content
- Content filtering and quality control
- YouTube title and description generation
- Voice gender identification
- Language translation

**Models supported:** GPT-4, GPT-4-turbo, GPT-3.5-turbo
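Script generation of this kind boils down to composing a chat prompt from the fetched story and sending it to a chat-completion model. A sketch of the prompt-assembly half; the system prompt and function name are illustrative (ShortGen's actual templates live in `prompt_templates/`):

```python
def build_script_messages(story_title, story_body, language="English"):
    """Compose a chat-completion message list that turns a Reddit story into a narration script."""
    system = (
        "You rewrite Reddit stories as engaging 60-second narration scripts "
        f"in {language}. Open with a hook and keep the language clean."
    )
    user = f"Title: {story_title}\n\nStory:\n{story_body}"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]

messages = build_script_messages("AITA for...", "So this happened last week...")
# The list can then be passed to the chat completions endpoint, e.g.:
# client.chat.completions.create(model="gpt-4-turbo", messages=messages)
```

Keeping prompt assembly separate from the API call makes the templates easy to version and test without network access.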
### ElevenLabs

Premium text-to-speech service offering:
- Natural, human-like voices
- Multiple language support
- Voice cloning capabilities
- High-quality audio output

**Required for:** Premium voice quality (optional)
### Microsoft Edge TTS

Free text-to-speech alternative:
- High-quality voices at no cost
- Wide language coverage
- Multiple voice options per language
- Good for testing and development

**Required for:** Free voice synthesis
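Both providers sit behind the abstract voice interface in `audio/voice_module.py`, so the engines never care which TTS backend is active. A simplified sketch of that pattern; the method names are illustrative, not ShortGen's exact signatures:

```python
from abc import ABC, abstractmethod

class VoiceModule(ABC):
    """Common interface so engines don't depend on a specific TTS provider."""

    @abstractmethod
    def generate_voice(self, text, output_path):
        """Synthesize `text` to an audio file and return its path."""

class FakeTTSModule(VoiceModule):
    """Stand-in provider used here for illustration; a real module would call
    the Edge TTS or ElevenLabs API and write MP3/WAV bytes."""

    def generate_voice(self, text, output_path):
        with open(output_path, "w") as f:
            f.write(f"audio for: {text}")
        return output_path

def narrate(module, script, path):
    # The engine depends only on the abstract interface.
    return module.generate_voice(script, path)
```

Swapping premium for free synthesis is then a one-line change at construction time, exactly as in the `EdgeTTSVoiceModule` example under Usage.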
### Reddit

Content sourcing through:
- PRAW (Python Reddit API Wrapper)
- Direct HTML scraping for specific threads
- Story extraction and formatting

**Required for:** Content generation
### YouTube Data API v3

Automated video publishing:
- OAuth2 authentication
- Video upload with metadata
- Multi-channel support
- Shorts-specific categorization

**Required for:** Automatic posting
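An upload through the YouTube Data API v3 sends a request body with `snippet` and `status` parts; Shorts are detected by format (vertical, short duration) and are commonly tagged with `#Shorts` in the title. A small helper sketch (the function is hypothetical and not ShortGen's code, though the body structure matches the `videos.insert` request):

```python
def build_upload_body(title, description, tags, privacy="private"):
    """Build the request body for a YouTube Data API v3 videos.insert call."""
    if "#Shorts" not in title:
        title = f"{title} #Shorts"  # common convention for Shorts discovery
    return {
        "snippet": {
            "title": title[:100],           # YouTube caps titles at 100 characters
            "description": description[:5000],  # description limit is 5000 bytes
            "tags": tags,
            "categoryId": "24",             # 24 = Entertainment
        },
        "status": {"privacyStatus": privacy},
    }

body = build_upload_body("Reddit story", "Generated by ShortGen", ["reddit", "shorts"])
# body["snippet"]["title"] == "Reddit story #Shorts"
```

Defaulting `privacyStatus` to `private` is a deliberately safe choice: a scheduler bug then produces hidden uploads rather than accidental public posts.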
## Contributing

### Adding a Custom Editing Step

Create a new JSON file in `editing_steps/`:

```json
{
  "step_name": "custom_effect",
  "description": "Apply custom effect",
  "operations": [
    {
      "type": "filter",
      "parameters": {
        "effect": "blur",
        "intensity": 0.5
      }
    }
  ]
}
```

### Adding a Custom Flow

Define a new flow in `flows/`:

```json
{
  "flow_name": "custom_reddit_flow",
  "steps": [
    "add_background_video",
    "add_voiceover",
    "make_caption",
    "custom_effect",
    "show_watermark"
  ]
}
```

### Adding a Language

Edit `languages.py`:

```python
class Language(Enum):
    SPANISH = "Spanish"
    # Add your language

EDGE_TTS_VOICENAME_MAPPING = {
    Language.SPANISH: {
        "male": "es-ES-AlvaroNeural",
        "female": "es-ES-ElviraNeural"
    }
}
```

## License

This project is licensed under the MIT License - see the LICENSE file for details.
## Acknowledgments

- **OpenAI** - GPT models for content generation
- **ElevenLabs** - Premium voice synthesis
- **Microsoft** - Edge TTS free voices
- **MoviePy** - Video editing framework
- **Reddit** - Content source platform
## Disclaimer

This tool is provided for educational and automation purposes. Users are responsible for:
- Ensuring compliance with Reddit's Terms of Service
- Respecting copyright and intellectual property rights
- Following YouTube's Community Guidelines and Terms of Service
- Obtaining necessary permissions for background music and videos
- Using AI-generated content ethically and responsibly
The authors and contributors are not responsible for any misuse of this software.
Made with ❤️ by Alexandru Grecu
⭐ Star this repo if you find it useful! ⭐