A locally-powered text-to-video desktop app for Ubuntu. Describe a scene in natural language and generate videos using Stable Diffusion — no cloud APIs, no subscriptions. Built with Electron, React, FastAPI, and PyTorch with CUDA acceleration and 4GB VRAM support.

CinematicAI

A beautiful, locally-powered AI video creation app for Ubuntu. Create stunning videos from text prompts using a chat-first interface with Stable Diffusion.

Features

  • Chat Interface - Describe your vision in natural language and watch it come to life
  • Text-to-Video - Generate videos from text prompts using Stable Diffusion
  • Local Processing - All AI runs locally on your machine, no cloud APIs needed
  • GPU Accelerated - NVIDIA CUDA support for fast generation
  • CPU Fallback - Works on systems without dedicated GPU (slower)
  • Low VRAM Optimized - Runs on GPUs with as little as 4GB VRAM
  • Real-time Progress - WebSocket-based progress updates during generation
  • Dark Cinematic Theme - Beautiful, modern UI designed for creators

Screenshots

Coming soon

System Requirements

Minimum

  • Ubuntu 22.04 or later
  • 16GB RAM
  • 4GB VRAM GPU (NVIDIA) or CPU-only mode
  • 25GB free disk space (for models)
  • Node.js 18+
  • Python 3.10+

Recommended

  • Ubuntu 22.04 or later
  • 32GB RAM
  • 8GB+ VRAM NVIDIA GPU
  • 50GB free disk space
  • Node.js 20+
  • Python 3.11+
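Before installing, you can sanity-check the minimum requirements with a short standalone script. This is an illustrative sketch, not part of the repository, and the tool names it probes (node, npm, ffmpeg) are just the ones the installation steps below expect on PATH:

```python
# check_env.py - quick sanity check for the minimum requirements above.
# Standalone sketch using only the standard library; not part of VideoAI.
import shutil
import sys

def python_ok(required=(3, 10), actual=None):
    """Return True if the running (or given) Python meets the minimum version."""
    actual = actual or sys.version_info[:2]
    return tuple(actual) >= tuple(required)

def tool_on_path(name):
    """Return True if an executable (e.g. node, ffmpeg) is on PATH."""
    return shutil.which(name) is not None

if __name__ == "__main__":
    print("Python >= 3.10:", python_ok())
    for tool in ("node", "npm", "ffmpeg"):
        print(f"{tool} on PATH:", tool_on_path(tool))
```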

Installation

1. Install System Dependencies

# Update package list
sudo apt update

# Install Node.js 20.x
curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
sudo apt install -y nodejs

# Install Python and pip
sudo apt install -y python3 python3-pip python3-venv

# Install FFmpeg (for video encoding)
sudo apt install -y ffmpeg

# Install build tools
sudo apt install -y build-essential

2. Install NVIDIA Drivers (Optional, for GPU acceleration)

# Check if NVIDIA driver is installed
nvidia-smi

# If not installed, install the recommended driver
sudo ubuntu-drivers autoinstall
sudo reboot

3. Clone the Repository

git clone https://github.com/Bhavyyadav25/VideoAI.git
cd VideoAI

4. Install Frontend Dependencies

npm install

5. Set Up Python Backend

# Navigate to backend directory
cd backend

# Create virtual environment
python3 -m venv venv

# Activate virtual environment
source venv/bin/activate

# Install PyTorch with CUDA support (for NVIDIA GPUs)
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu124

# Or install CPU-only version (if no NVIDIA GPU)
# pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu

# Install other dependencies
pip install -r requirements.txt

# Return to project root
cd ..

6. First Run (Downloads AI Models)

The first time you run the app, it will download the Stable Diffusion model (~4GB). This only happens once.

# Start the application
npm run dev

Usage

Starting the App

# From the project root directory
npm run dev

This starts both the Electron app and the Python backend.

Generating Videos

  1. Open the app - it will show the Chat interface
  2. Type a description of the video you want to create
    • Example: "A sunset over calm ocean waves"
    • Example: "A cozy coffee shop on a rainy evening, warm lighting"
  3. Press Enter or click the send button
  4. Watch the progress bar as your video is generated
  5. Once complete, the video will appear in the chat
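During step 4, progress arrives over the WebSocket connection mentioned in Features. The exact message schema is not documented here, so the field names below (progress, status) are assumptions made purely to illustrate how a client might handle such updates defensively:

```python
# Hypothetical progress-message parser. The actual VideoAI WebSocket
# schema may differ; "progress" and "status" are illustrative field names.
import json

def parse_progress(raw):
    """Parse a progress message into (percent, status); tolerate bad input."""
    try:
        msg = json.loads(raw)
        return float(msg.get("progress", 0.0)), str(msg.get("status", "unknown"))
    except (ValueError, TypeError):
        return 0.0, "invalid"
```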

Manual Backend Start (Development)

If you need to run the backend separately:

cd backend
source venv/bin/activate
python -m uvicorn main:app --host 0.0.0.0 --port 8000 --reload
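Once the backend is up, you can verify it responds on the health endpoint used in Troubleshooting below. A minimal stdlib sketch (not part of the repo):

```python
# health_check.py - poll the backend's /api/health endpoint.
# Standalone sketch using only the standard library.
import urllib.error
import urllib.request

def backend_alive(url="http://localhost:8000/api/health", timeout=2.0):
    """Return True if the backend answers the health endpoint with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

if __name__ == "__main__":
    print("backend up:", backend_alive())
```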

Configuration

Video Generation Parameters

When generating videos, you can adjust these parameters in the Settings page:

Parameter   Default   Description
---------   -------   -----------
Width       512       Video width in pixels
Height      512       Video height in pixels
Frames      16        Number of frames
Steps       25        Inference steps (higher = better quality, but slower)
Guidance    7.5       How closely the output follows the prompt
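The defaults in the table can be expressed as a settings dict. This is an illustrative sketch of merging user overrides onto those defaults, not the request format the app actually sends:

```python
# Illustrative defaults matching the Settings table above; the payload
# the app actually sends to the backend is not shown here and may differ.
DEFAULTS = {"width": 512, "height": 512, "frames": 16, "steps": 25, "guidance": 7.5}

def make_settings(**overrides):
    """Merge user overrides onto the defaults, rejecting unknown keys."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    return {**DEFAULTS, **overrides}
```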

GPU Memory Optimization

The app automatically adjusts settings based on available VRAM:

  • < 4GB VRAM: Sequential CPU offload, 384x384 max resolution
  • 4-6GB VRAM: Model CPU offload, 512x512 max resolution
  • 6GB+ VRAM: Full GPU acceleration
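The tiering above can be sketched as a small policy function. The function name and return shape are illustrative, not the app's actual internals; in the Diffusers library the first two modes correspond to pipe.enable_sequential_cpu_offload() and pipe.enable_model_cpu_offload():

```python
# Sketch of the VRAM-tiering policy described above. Names and return
# values are illustrative only.
def offload_strategy(vram_gb):
    """Map available VRAM (GB) to an offload mode and max resolution."""
    if vram_gb < 4:
        return {"mode": "sequential_cpu_offload", "max_res": 384}
    if vram_gb < 6:
        return {"mode": "model_cpu_offload", "max_res": 512}
    return {"mode": "full_gpu", "max_res": None}  # no cap stated for 6GB+
```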

Project Structure

VideoAI/
├── electron/           # Electron main process
│   ├── main.ts        # App entry point
│   └── preload.ts     # IPC bridge
├── src/               # React frontend
│   ├── components/    # UI components
│   ├── pages/         # Page components
│   ├── services/      # API & WebSocket
│   └── stores/        # Zustand state
├── backend/           # Python AI backend
│   ├── api/           # FastAPI routes
│   ├── models/        # AI model wrappers
│   └── main.py        # Server entry
└── output/            # Generated videos

Troubleshooting

"CUDA not available" warning

This means the app is running in CPU mode. To fix:

  1. Ensure NVIDIA drivers are installed: nvidia-smi
  2. Restart your system after driver installation
  3. Reinstall PyTorch with CUDA:
    cd backend
    source venv/bin/activate
    pip uninstall torch torchvision -y
    pip install torch torchvision --index-url https://download.pytorch.org/whl/cu124

"Backend not connected" error

  1. Check if the backend is running:
    curl http://localhost:8000/api/health
  2. Start the backend manually:
    cd backend
    source venv/bin/activate
    python -m uvicorn main:app --host 0.0.0.0 --port 8000

Out of Memory errors

  • Reduce resolution in Settings (try 384x384)
  • Reduce inference steps (try 15-20)
  • Close other GPU-intensive applications
  • The app will automatically use CPU offloading for low VRAM

Slow generation on CPU

CPU-only generation is typically 5-10x slower than GPU generation. For better performance:

  • Use a machine with an NVIDIA GPU (4GB+ VRAM)
  • Or use a cloud GPU instance

Model download fails

If the Stable Diffusion model fails to download:

  1. Check your internet connection
  2. Try downloading manually:
    cd backend
    source venv/bin/activate
    python -c "from diffusers import StableDiffusionPipeline; StableDiffusionPipeline.from_pretrained('runwayml/stable-diffusion-v1-5')"

Building for Production

# Build the Electron app
npm run build

# Create distributable packages
npm run build:linux

The built application will be in the release/ directory.

Tech Stack

Frontend

  • Electron 28
  • React 18
  • TypeScript
  • Tailwind CSS
  • Zustand (state management)
  • Framer Motion (animations)

Backend

  • FastAPI
  • PyTorch
  • Diffusers (Hugging Face)
  • Stable Diffusion v1.5

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.
