# 🎬 Automated Short-Form Video Generation Platform
An intelligent content creation system that automatically generates engaging short-form videos from Reddit stories, complete with AI-powered voiceovers, captions, and professional editing.
## Table of Contents

- [Overview](#overview)
- [Features](#features)
- [Architecture](#architecture)
- [Installation](#installation)
- [Configuration](#configuration)
- [Usage](#usage)
- [Project Structure](#project-structure)
- [API Integrations](#api-integrations)
- [Contributing](#contributing)
- [License](#license)
## Overview

ShortGen is a comprehensive automation platform that streamlines the creation of short-form video content for platforms like YouTube Shorts, TikTok, and Instagram Reels. The system sources content from Reddit, processes it through AI-powered workflows, and produces professionally edited videos ready for upload.

- **Automated Content Sourcing**: Fetches and processes Reddit stories and threads
- **AI Script Generation**: Uses OpenAI GPT models to create engaging video scripts
- **High-Quality Voice Synthesis**: Supports both ElevenLabs (premium) and Edge TTS (free)
- **Professional Video Editing**: Automated editing pipeline with captions, background videos, and music
- **Multi-Channel Management**: Manage multiple YouTube channels simultaneously
- **Scheduled Publishing**: Built-in scheduler for automated video posting
- **GUI Interface**: User-friendly desktop application for easy management
## Features

- 🤖 **AI-Powered Script Generation**: Leverages GPT models to transform Reddit content into engaging narratives
- 🎙️ **Dual TTS Systems**:
  - ElevenLabs for premium, natural-sounding voices
  - Microsoft Edge TTS for free, high-quality synthesis
- 🌍 **Multi-Language Support**: Generate content in multiple languages with proper voice mapping
- 📝 **Automatic Captioning**: Synchronized captions with customizable styles
- 🎬 **Modular Editing Framework**: JSON-based editing pipeline for customizable workflows
- 🎨 **Visual Enhancement**:
  - Background video integration
  - Reddit screenshot overlays
  - Custom watermarks and branding
  - Subscribe animations
- 🎵 **Audio Processing**:
  - Background music integration
  - Voice-over synchronization
  - Audio level balancing
- 📅 **Smart Scheduling**: Configure posting schedules for multiple channels
- 📊 **Video Tracking**: Database-driven system to track generated and posted videos
- 🔄 **Resume Capability**: Continue interrupted video generation from the last checkpoint
- 📤 **YouTube Integration**: Direct upload to YouTube with metadata generation
- 🖥️ **Desktop GUI**: Full-featured tkinter application with:
  - Real-time logging and progress monitoring
  - Channel-specific configuration
  - Schedule management
  - System tray integration
  - Windows toast notifications
## Architecture

ShortGen follows a modular architecture with clearly separated concerns:
```
┌─────────────────────────────────────────────────┐
│              User Interface Layer               │
│    (GUI Application, CLI Tools, Schedulers)     │
└─────────────────┬───────────────────────────────┘
                  │
┌─────────────────▼───────────────────────────────┐
│              Content Engine Layer               │
│  • Reddit Content Fetcher                       │
│  • GPT Script Generator                         │
│  • Voice Module Orchestrator                    │
└─────────────────┬───────────────────────────────┘
                  │
┌─────────────────▼───────────────────────────────┐
│             Editing Framework Layer             │
│  • Core Editing Engine                          │
│  • Step-based Processing Pipeline               │
│  • Asset Management System                      │
└─────────────────┬───────────────────────────────┘
                  │
┌─────────────────▼───────────────────────────────┐
│              Output & Distribution              │
│  • Video Renderer                               │
│  • YouTube Publisher                            │
│  • Storage Manager                              │
└─────────────────────────────────────────────────┘
```
### Core Components

1. **Reddit Short Engine** (`reddit_short_engine.py`)
   - Orchestrates the entire video generation pipeline
   - Manages state persistence and checkpoint recovery
   - Handles asset preparation and video rendering

2. **Editing Framework** (`editing_framework/`)
   - JSON-driven editing steps and flows
   - Modular architecture for easy customization
   - Support for complex editing operations

3. **Voice Modules** (`audio/`)
   - Abstract voice interface for multiple TTS providers
   - ElevenLabs and Edge TTS implementations
   - Voice selection based on language and gender

4. **GPT Utilities** (`gpt/`)
   - OpenAI API integration
   - Prompt template system
   - Content generation and filtering
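The JSON-driven design can be sketched in a few lines: a flow (as defined in `flows/`) is a named, ordered list of step names, and the engine dispatches each name to a handler. The handlers below are illustrative stand-ins; in ShortGen the steps themselves are JSON definitions interpreted by the core editing engine.

```python
import json

# Stand-in handlers for two of the documented steps. In ShortGen these
# would perform real video operations via the editing engine.
def add_background_video(state):
    state["layers"].append("background_video")
    return state

def make_caption(state):
    state["layers"].append("caption")
    return state

STEP_HANDLERS = {
    "add_background_video": add_background_video,
    "make_caption": make_caption,
}

def run_flow(flow_json, state):
    """Execute a flow definition: each step name is dispatched in order."""
    flow = json.loads(flow_json)
    for step_name in flow["steps"]:
        state = STEP_HANDLERS[step_name](state)
    return state

flow = '{"flow_name": "demo", "steps": ["add_background_video", "make_caption"]}'
result = run_flow(flow, {"layers": []})
# result["layers"] == ["background_video", "caption"]
```

The registry pattern is what makes the pipeline customizable: adding a step means registering one more handler, with no change to the flow runner.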
## Installation

### Prerequisites

- **Python**: Version 3.8 or higher
- **FFmpeg**: Required for video processing
- **ImageMagick**: Required for caption rendering
- **Operating System**: Windows (tested); Linux/macOS should work with minor adjustments
Clone the repository:

```bash
git clone https://github.com/yourusername/ShortGen.git
cd ShortGen
```

Create and activate a virtual environment:

```bash
python -m venv venv

# Windows
venv\Scripts\activate

# Linux/macOS
source venv/bin/activate
```

Install Python dependencies:

```bash
# Core dependencies
pip install -r requirements.txt

# GUI dependencies (if using the desktop app)
pip install -r gui_requirements.txt
```

Install system dependencies:

**Windows:**

```bash
# Using Chocolatey
choco install ffmpeg imagemagick
# Or download installers from the official websites
```

**Linux:**

```bash
sudo apt-get update
sudo apt-get install ffmpeg imagemagick
```

**macOS:**

```bash
brew install ffmpeg imagemagick
```

## Configuration

Create a `.env` file in the root directory:
```env
# OpenAI Configuration
OPENAI_API_KEY=your_openai_api_key_here

# ElevenLabs Configuration (optional, for premium voices)
ELEVEN_LABS_API_KEY=your_elevenlabs_api_key_here

# PlayHT Configuration (optional)
PLAY_HT_USERID=your_playht_userid_here
PLAY_HT_API_KEY=your_playht_api_key_here
```

### YouTube API Setup

For automated YouTube uploads:
1. **Create a Google Cloud Project**
   - Visit the Google Cloud Console
   - Create a new project
2. **Enable YouTube Data API v3**
   - Navigate to "APIs & Services" > "Library"
   - Search for "YouTube Data API v3"
   - Enable the API
3. **Create OAuth 2.0 Credentials**
   - Go to "APIs & Services" > "Credentials"
   - Create an OAuth 2.0 Client ID (Desktop application)
   - Download the credentials JSON file
4. **Configure ShortGen**
   - Rename the downloaded file to `client_secrets.json`
   - Place it in the ShortGen root directory
   - On first upload, you'll be prompted to authorize the application
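Before a first run, it can help to confirm that the required keys are actually visible to the process. A minimal, hypothetical sanity check (the variable names match the `.env` example above, but `check_env` itself is not part of ShortGen):

```python
import os

# OPENAI_API_KEY is the only strictly required key; the TTS keys are optional.
REQUIRED = ["OPENAI_API_KEY"]
OPTIONAL = ["ELEVEN_LABS_API_KEY", "PLAY_HT_USERID", "PLAY_HT_API_KEY"]

def check_env(required=REQUIRED):
    """Return the names of required environment variables that are missing or empty."""
    return [name for name in required if not os.environ.get(name)]
```

Running `check_env()` before launching a generation job surfaces a missing key immediately, rather than partway through a pipeline run.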
### Asset Setup

Place your assets in the `public/` directory:

```
public/
├── background_videos/   # Background footage for shorts
├── background_music/    # Royalty-free music tracks
├── fonts/               # Custom fonts for captions
└── watermarks/          # Channel watermarks/logos
```

### Configuration Files

- `public.yaml`: Asset configuration and paths
- `schedule.json`: Posting schedule configuration
- `database.json`: Video generation state database
- `abbreviations.json`: Text abbreviation mappings
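The state database is what powers the resume capability: each pipeline stage records a checkpoint, and an interrupted run restarts from the last completed stage. A hypothetical sketch of such a mechanism (the stage names and file layout are invented for illustration and are not ShortGen's actual schema):

```python
import json
from pathlib import Path

# Illustrative pipeline stages, in execution order.
STAGES = ["fetch_story", "generate_script", "synthesize_voice", "edit_video", "render"]

def load_state(path):
    """Load the checkpoint file, or start fresh if it doesn't exist."""
    if path.exists():
        return json.loads(path.read_text())
    return {"completed": []}

def save_checkpoint(path, state, stage):
    """Mark a stage as done and persist the state to disk."""
    state["completed"].append(stage)
    path.write_text(json.dumps(state, indent=2))

def remaining_stages(state):
    """Stages still to run, in pipeline order."""
    return [s for s in STAGES if s not in state["completed"]]
```

Under this scheme, a run that crashed after voice synthesis would resume with `remaining_stages` returning only the editing and rendering stages.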
## Usage

### Desktop GUI

Launch the desktop application:

```bash
# Windows
start_gui.bat

# Or directly
python gui_app.py
```

Features:
- Configure channels and API keys
- Monitor video generation in real-time
- Manage posting schedules
- View logs and system status
- System tray integration for background operation
### Command-Line Generation

```bash
python run.py
```

The CLI will prompt you for:
- Channel number
- Reddit link
- Language selection
- Voice provider (ElevenLabs or Edge TTS)
### Automated Scheduler

```bash
python launch.py
```

This starts the scheduler, which:
- Runs according to the `schedule.json` configuration
- Generates videos for configured channels
- Automatically posts to YouTube
- Logs all operations
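At its core, a scheduler like this has to decide when the next video is due. A small, self-contained sketch of that calculation (the "HH:MM" posting-time format is a guess; ShortGen's real `schedule.json` schema may differ):

```python
from datetime import datetime, timedelta

def next_post_time(now, daily_times):
    """Given 'HH:MM' posting times, return the next occurrence after `now`."""
    candidates = []
    for t in daily_times:
        hour, minute = map(int, t.split(":"))
        slot = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
        if slot <= now:
            slot += timedelta(days=1)  # already passed today; use tomorrow
        candidates.append(slot)
    return min(candidates)

now = datetime(2024, 5, 1, 13, 30)
print(next_post_time(now, ["09:00", "15:00", "21:00"]))  # 2024-05-01 15:00:00
```

A daemon then sleeps until the returned time, generates the video, posts it, and repeats the calculation.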
### Programmatic Usage

```python
from ShortGen.engine.reddit_short_engine import RedditShortEngine
from ShortGen.audio.edge_voice_module import EdgeTTSVoiceModule
from ShortGen.config.languages import Language

# Initialize voice module
voice_module = EdgeTTSVoiceModule(voice_name="en-US-AriaNeural")

# Create engine instance
engine = RedditShortEngine(
    voiceModule=voice_module,
    background_video_name="minecraft_parkour.mp4",
    background_music_name="lofi_beat.mp3",
    reddit_link="https://reddit.com/r/AskReddit/comments/xxxxx",
    language=Language.ENGLISH
)

# Generate video
engine.generate_short()
```

## Project Structure

```
ShortGen/
├── ShortGen/                          # Main package
│   ├── api_utils/                     # External API integrations
│   │   └── image_api.py               # Image generation APIs
│   ├── audio/                         # Voice synthesis modules
│   │   ├── audio_utils.py             # Audio processing utilities
│   │   ├── edge_voice_module.py       # Microsoft Edge TTS
│   │   ├── elevenlabs_voice_module.py # ElevenLabs TTS
│   │   └── voice_module.py            # Abstract voice interface
│   ├── config/                        # Configuration management
│   │   ├── api_key_manager.py         # API key handling
│   │   ├── config.py                  # Global configuration
│   │   ├── languages.py               # Language and voice mappings
│   │   └── path_utils.py              # Path resolution utilities
│   ├── editing_framework/             # Video editing system
│   │   ├── core_editing_engine.py     # Core editing logic
│   │   ├── editing_engine.py          # High-level editing API
│   │   ├── editing_steps/             # JSON editing step definitions
│   │   └── flows/                     # Predefined editing workflows
│   ├── editing_utils/                 # Editing helper functions
│   │   ├── captions.py                # Caption generation and styling
│   │   ├── editing_images.py          # Image processing
│   │   └── handle_videos.py           # Video manipulation
│   ├── engine/                        # Core generation engines
│   │   └── reddit_short_engine.py     # Reddit video generator
│   ├── gpt/                           # OpenAI GPT integrations
│   │   ├── gpt_utils.py               # GPT API utilities
│   │   ├── gpt_voice.py               # Voice selection AI
│   │   └── gpt_yt.py                  # YouTube metadata generation
│   ├── prompt_templates/              # GPT prompt templates (YAML)
│   └── reddit_content/                # Reddit API integration
│       └── reddit_story_api.py        # Reddit content fetcher
├── tools/                             # Utility scripts
│   └── download_and_crop.py           # Video preprocessing tool
├── public/                            # Static assets
│   ├── background_videos/
│   ├── background_music/
│   └── fonts/
├── gui_app.py                         # Desktop GUI application
├── run.py                             # CLI video generator
├── launch.py                          # Scheduler daemon
├── poster.py                          # YouTube upload utility
├── selector_engine.py                 # Video selection logic
├── requirements.txt                   # Python dependencies
├── gui_requirements.txt               # GUI-specific dependencies
└── README.md                          # This file
```
## API Integrations

### OpenAI

Used for:
- Script generation from Reddit content
- Content filtering and quality control
- YouTube title and description generation
- Voice gender identification
- Language translation

**Models supported:** GPT-4, GPT-4-turbo, GPT-3.5-turbo
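Script generation of this kind boils down to composing a chat prompt from the fetched story and sending it to a chat-completion model. A sketch of the prompt-assembly half; the system prompt and function name are illustrative (ShortGen's actual templates live in `prompt_templates/`):

```python
def build_script_messages(story_title, story_body, language="English"):
    """Compose a chat-completion message list that turns a Reddit story into a narration script."""
    system = (
        "You rewrite Reddit stories as engaging 60-second narration scripts "
        f"in {language}. Open with a hook and keep the language clean."
    )
    user = f"Title: {story_title}\n\nStory:\n{story_body}"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]

messages = build_script_messages("AITA for...", "So this happened last week...")
# The list can then be passed to the chat completions endpoint, e.g.:
# client.chat.completions.create(model="gpt-4-turbo", messages=messages)
```

Keeping prompt assembly separate from the API call makes the templates easy to version and test without network access.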
### ElevenLabs

Premium text-to-speech service offering:
- Natural, human-like voices
- Multiple language support
- Voice cloning capabilities
- High-quality audio output

**Required for:** Premium voice quality (optional)
### Microsoft Edge TTS

Free text-to-speech alternative:
- High-quality voices at no cost
- Wide language coverage
- Multiple voice options per language
- Good for testing and development

**Required for:** Free voice synthesis
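Both providers sit behind the abstract voice interface in `audio/voice_module.py`, so the engines never care which TTS backend is active. A simplified sketch of that pattern; the method names are illustrative, not ShortGen's exact signatures:

```python
from abc import ABC, abstractmethod

class VoiceModule(ABC):
    """Common interface so engines don't depend on a specific TTS provider."""

    @abstractmethod
    def generate_voice(self, text, output_path):
        """Synthesize `text` to an audio file and return its path."""

class FakeTTSModule(VoiceModule):
    """Stand-in provider used here for illustration; a real module would call
    the Edge TTS or ElevenLabs API and write MP3/WAV bytes."""

    def generate_voice(self, text, output_path):
        with open(output_path, "w") as f:
            f.write(f"audio for: {text}")
        return output_path

def narrate(module, script, path):
    # The engine depends only on the abstract interface.
    return module.generate_voice(script, path)
```

Swapping premium for free synthesis is then a one-line change at construction time, exactly as in the `EdgeTTSVoiceModule` example under Usage.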
### Reddit

Content sourcing through:
- PRAW (Python Reddit API Wrapper)
- Direct HTML scraping for specific threads
- Story extraction and formatting

**Required for:** Content generation
### YouTube Data API v3

Automated video publishing:
- OAuth2 authentication
- Video upload with metadata
- Multi-channel support
- Shorts-specific categorization

**Required for:** Automatic posting
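An upload through the YouTube Data API v3 sends a request body with `snippet` and `status` parts; Shorts are detected by format (vertical, short duration) and are commonly tagged with `#Shorts` in the title. A small helper sketch (the function is hypothetical and not ShortGen's code, though the body structure matches the `videos.insert` request):

```python
def build_upload_body(title, description, tags, privacy="private"):
    """Build the request body for a YouTube Data API v3 videos.insert call."""
    if "#Shorts" not in title:
        title = f"{title} #Shorts"  # common convention for Shorts discovery
    return {
        "snippet": {
            "title": title[:100],           # YouTube caps titles at 100 characters
            "description": description[:5000],  # description limit is 5000 bytes
            "tags": tags,
            "categoryId": "24",             # 24 = Entertainment
        },
        "status": {"privacyStatus": privacy},
    }

body = build_upload_body("Reddit story", "Generated by ShortGen", ["reddit", "shorts"])
# body["snippet"]["title"] == "Reddit story #Shorts"
```

Defaulting `privacyStatus` to `private` is a deliberately safe choice: a scheduler bug then produces hidden uploads rather than accidental public posts.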
## Contributing

### Adding a Custom Editing Step

Create a new JSON file in `editing_steps/`:

```json
{
  "step_name": "custom_effect",
  "description": "Apply custom effect",
  "operations": [
    {
      "type": "filter",
      "parameters": {
        "effect": "blur",
        "intensity": 0.5
      }
    }
  ]
}
```

### Adding a Custom Flow

Define a new flow in `flows/`:

```json
{
  "flow_name": "custom_reddit_flow",
  "steps": [
    "add_background_video",
    "add_voiceover",
    "make_caption",
    "custom_effect",
    "show_watermark"
  ]
}
```

### Adding a Language

Edit `languages.py`:

```python
class Language(Enum):
    SPANISH = "Spanish"
    # Add your language

EDGE_TTS_VOICENAME_MAPPING = {
    Language.SPANISH: {
        "male": "es-ES-AlvaroNeural",
        "female": "es-ES-ElviraNeural"
    }
}
```

## License

This project is licensed under the MIT License - see the LICENSE file for details.
## Acknowledgments

- **OpenAI** - GPT models for content generation
- **ElevenLabs** - Premium voice synthesis
- **Microsoft** - Edge TTS free voices
- **MoviePy** - Video editing framework
- **Reddit** - Content source platform
## Disclaimer

This tool is provided for educational and automation purposes. Users are responsible for:
- Ensuring compliance with Reddit's Terms of Service
- Respecting copyright and intellectual property rights
- Following YouTube's Community Guidelines and Terms of Service
- Obtaining necessary permissions for background music and videos
- Using AI-generated content ethically and responsibly
The authors and contributors are not responsible for any misuse of this software.
Made with ❤️ by Alexandru Grecu
⭐ Star this repo if you find it useful! ⭐