AI ad production for DTC agencies.
Paste a product URL. Ship a stack of ad variants. One subscription, unlimited client brands.
What It Does · For Agencies · Run It Yourself · Models · Architecture · Contributing
Your agency runs 10+ Shopify clients. Every new account needs 30–50 ad variants in the first 60 days. UGC creators charge $150–500 per video. With a coordinator and a roster of creators you still ship 5–8 videos per client per week. Meta's algorithm wants 20+.
CutAgent replaces the UGC production pipeline with an ad-creative production layer. Paste a product URL, get a 4-scene storyboard, generate a full stack of variants — each scene routed to the frontier model that fits the shot.
Product URL ──→ 4-scene storyboard ──→ Hook (Kling) → Reveal (Veo) → Proof (MiniMax) → CTA (Seedance)
↑ auto-generated ↑ per-scene model routing
↑ voiceover scripts ↑ Style Engine chains visual continuity across shots
↑ brand kit per client ↑ Hook Lab generates 8 hook variations in one click
CutAgent has two products in one repo:
Hosted (cutagent.com) — the agency SaaS. Google sign-in, Stripe billing, managed fal.ai credits, cloud project storage, per-client brand kits, white-label export, Hook Lab, AI avatar scenes, auto-captions. This is what agencies buy.
Open-source (this repo) — the full editor as a Next.js app you can run locally with your own fal.ai key. No auth, no billing, no cloud save. MIT licensed. Fork it, embed it, ship it.
If you're running an agency with 10+ e-commerce clients and want to see the hosted product in action, book a 15-minute demo and we'll generate ads for one of your actual client brands on the call. Pricing starts at $199/mo for Agency Starter, $499/mo for Agency Pro with unlimited clients and white-label export.
See business/LANDING_AGENCY_VARIANT.md for the full agency pitch.
git clone https://github.com/rishidandu/cutagent.git
cd cutagent
npm install
npm run devOpen localhost:3000, paste your fal.ai API key in the editor (free tier available), and generate your first ad.
The open-source path runs without auth, Stripe, or Supabase. Just a fal.ai key. If you want to self-host the full hosted feature set (auth + cloud saves + billing), see business/SELF_HOSTING.md (coming soon).
Drop any Shopify, Amazon, Etsy, or eBay link. CutAgent scrapes the product data via the Shopify JSON API (with HTML scraper fallback), normalizes images (bypasses CDN restrictions, validates 300×300 minimum, upscales thumbnails, uploads to fal.ai storage), and builds a 4-scene storyboard automatically.
Not all models are equal. Kling excels at motion. Veo nails realism. Seedance handles native audio. CutAgent routes each scene to the model that fits its purpose:
| Scene Role | What It Does | Default Model | Why |
|---|---|---|---|
| Hook | Pattern interrupt, grabs attention in 3s | Kling 2.5 Turbo | Fast motion, dynamic angles |
| Reveal | Product showcase, the "aha" moment | Kling / Veo 2 | Clean product shots, realism |
| Proof | Social proof, testimonial feel | MiniMax Live | Authentic, fast |
| CTA | Hero shot, end-frame for text overlay | Seedance 1.5 | Cinematic, audio support |
One click, 8 distinct hook variants per storyboard. Different angles, emotional triggers, and pacing. Ship all 8, let the ad account sort it out. This is the feature the media buyers ask for, not the clips.
Splice cinematic b-roll with AI spokesperson scenes. Emotion-tagged TTS auto-fits voiceover to scene length. Captions burn in on export with server-side FFmpeg drawtext rendering.
Different models = different visual styles. The Style Engine solves this:
Scene 1 completes → extract last frame → pass as img2vid reference to Scene 2
↓
Scene 2 completes → extract last frame → pass to Scene 3
↓
...visual continuity across models
Plus a style brief prepended to every prompt (lighting, color palette, product description) and product reference images passed to every generation.
Each scene gets a voiceover script auto-written from product data with a hard word budget tied to duration. A 5-second scene gets max 11 words. TTS auto-speeds up if text is borderline too long. Never overflows.
FFmpeg.wasm stitches scenes + voiceover + music into a single MP4 in-browser. Optional server-side caption burn-in via /api/add-captions for export with subtitles. Brand kit logo overlay (hosted only; OSS version shipping soon).
- 8 AI video models via fal.ai (Kling 2.5, Veo 2/3, Seedance 1.5, Luma Ray 2, MiniMax Live, Wan 2.5, HunyuanVideo)
- AI Avatar scenes via avatar-specific model adapters (TTS → avatar endpoint)
- Role-based prompt engine with anti-hallucination rules (negative prompts to prevent invented text/logos)
- Model-specific adapters for correct parameter formatting per model
- Image normalization — server-side download, dimension validation, CDN upscaling, fal storage upload
- Job recovery — resume incomplete generations after page reload
- Per-client brand kits — logo, colors, tagline, watermark position
- Cloud project storage — Supabase Storage for permanent video URLs + project JSON
- Auto-save sidebar — ChatGPT-style project list with auto-naming and switching
- Stripe billing — 3-tier subscription with credit-based metering and atomic RPC deduction
- Managed fal.ai credentials — no API key management for paid tiers
- Shopify JSON API integration with full gallery image extraction
- HTML scraper fallback for Amazon, Etsy, eBay, and any e-commerce site
- Auto-storyboard generation with role assignment, model routing, and voiceover scripts
- 5 narrative angles (convenience, quality, social proof, value, unboxing) randomly selected per import
- 5 templates — UGC Ad, Product Showcase, Explainer, Before/After, Social Proof
- Hook Lab — test 20+ hooks fast (sequential generation with rate-limit staggering), expand winners into full storyboards
- Batch variations — 8 hook styles, generate N versions with one click
- Script import — paste a script, AI splits into scenes and assigns models
- Per-scene voiceover — Kokoro ($0.02/1K chars) or ElevenLabs ($0.05/1K chars)
- Duration-aware word budgets, auto-fit speed control
- Background music — CassetteAI ($0.02/min) or Stable Audio 2.5
- Smart muting — video muted when TTS exists
- Style brief — global prompt prefix for lighting, color palette, character
- Reference images — character, style, product, and last-frame types with per-scene overrides
- Auto-chain — last frame from each scene automatically feeds into the next
- Drag-to-reorder scenes with re-indexing
- Trim controls per scene
- Side-by-side comparison — generate same scene with 2–4 models, pick the best
- Preview player — play full storyboard back-to-back
- Cost tracking — real-time spend display, per-model pricing, database-recorded history
- Project save/load — JSON export/import with field migration for compatibility
- Credit balance badge — live Stripe-synced credits in the UI with upgrade modal on insufficient funds
| Model | Provider | Aspect Ratios | Best For | Cost/sec |
|---|---|---|---|---|
| HunyuanVideo | Tencent | 16:9, 9:16 | Cheapest option | $0.075 |
| Wan 2.5 | Alibaba | 16:9, 9:16, 1:1 | Anime, stylized | $0.05 |
| MiniMax Live | MiniMax | Auto (no control) | Fast, cheap | $0.10 |
| Kling 2.5 Turbo | Kuaishou | 16:9, 9:16, 1:1 | Action, motion | $0.07 |
| Luma Ray 2 | Luma | All ratios | Dreamy, creative | $0.10 |
| Seedance 1.5 | ByteDance | All ratios | Audio, multi-shot | $0.08 |
| Veo 2 | 16:9, 9:16 | Photorealism | $0.25 | |
| Veo 3 | 16:9, 9:16 | Realism + audio | $0.40 |
src/
├── app/
│ ├── page.tsx # Main editor (1,290+ lines)
│ ├── waitlist/page.tsx # Landing page (/ for unauth'd users)
│ ├── auth/signin/page.tsx # Google SSO entry point
│ ├── api/
│ │ ├── auth/[...nextauth]/ # NextAuth v5 route
│ │ ├── billing/ # Stripe: balance, checkout, portal, sync, webhook
│ │ ├── projects/ # Cloud project CRUD
│ │ ├── storage/ # Supabase Storage upload/read
│ │ ├── generations/ # Generation history recording
│ │ ├── scrape/ # Product URL scraper
│ │ ├── normalize-images/ # Image validation + upscaling
│ │ ├── proxy-image/ # CORS bypass for restricted CDNs
│ │ ├── add-captions/ # Server-side caption burn-in
│ │ └── waitlist/ # Email capture to Supabase
│ └── middleware.ts # Session gate, rewrites / → /waitlist for unauth'd
├── components/ # 18 components
│ ├── SceneCard.tsx # Per-scene editor with role badges
│ ├── HookLab.tsx # 8-hook batch variation UI
│ ├── ProjectSidebar.tsx # ChatGPT-style auto-save sidebar
│ ├── CreditsBadge.tsx # Live credit balance display
│ ├── UpgradeModal.tsx # Stripe Checkout entry point
│ ├── BrandKitPanel.tsx # Per-client brand configuration
│ ├── CompareModal.tsx # Multi-model comparison
│ └── ...
├── lib/ # 22 modules
│ ├── fal.ts # fal.ai SDK + image proxy + queue polling
│ ├── auth.ts # NextAuth v5 + Supabase upsert
│ ├── stripe.ts # Tier config + customer ensure
│ ├── credits.ts # Atomic deduct + refill + ledger
│ ├── hook-lab.ts # Sequential hook generation
│ ├── avatar.ts # Avatar model adapters + TTS chain
│ ├── prompt-engine.ts # Role-specific prompt builder
│ ├── model-adapters.ts # Per-model parameter formatting
│ ├── style-engine.ts # Frame chaining + generation ordering
│ ├── storyboard-generator.ts # Product data → storyboard
│ ├── video-export.ts # FFmpeg.wasm stitching + audio mixing
│ ├── video-storage.ts # Supabase Storage persistence
│ ├── job-recovery.ts # Active job tracking for reload resume
│ ├── brand-kit.ts # Brand kit → prompt brief (visual overlay WIP)
│ └── ...
├── supabase/
│ └── migrations/
│ ├── 001_initial_schema.sql # Users, accounts, sessions, projects, history, waitlist
│ └── 002_billing.sql # Credits, transactions, atomic deduct RPC
└── types/
└── index.ts # Model catalog + type definitions
6,000+ lines of TypeScript. Full-stack Next.js 15 with client-side fal.ai generation, server-side billing/auth, and Supabase for persistence.
- Multi-model generation pipeline (8 models)
- Style Engine (last-frame chaining + style brief + reference images)
- Product URL → auto-storyboard (Shopify + HTML)
- Role-based prompt engine with anti-hallucination
- Per-scene voiceover with duration fitting + word budgets
- Templates, batch variations, script import
- FFmpeg.wasm multi-scene export with audio mixing
- Image normalization + CORS proxy
- Cost tracking, project save/load
- Google SSO via NextAuth v5 + Supabase
- Cloud project storage (Supabase) + ChatGPT-style sidebar
- Stripe billing with credit-based metering
- Hook Lab — 20+ hooks fast
- AI Avatar scenes + Auto-Captions
- Landing page + waitlist + Google SSO
- Managed fal.ai key server-side proxy (remove BYOK requirement for paid users)
- Agency pricing tiers ($199 / $499 / $1200) in Stripe + landing
- Brand kit visual watermark overlay on export
- Custom export filenames per client
- Multi-client workspace (client table + per-client brand kit switching)
- Public
/gallerypage with curated outputs
- Shopify App Store listing
- Team workspaces with shared brand kits
- Agency API for programmatic batch generation
- Script-to-storyboard LLM fine-tuned on winning ads
- A/B test recommendations from ad spend data
- Meta/TikTok Ads Manager integration
- Real-time preview during generation
- Custom model fine-tuning per brand
- Marketplace for templates and style presets
- Self-hosted enterprise edition with SSO
| Layer | Technology |
|---|---|
| Framework | Next.js 15, React 19 |
| Styling | Tailwind CSS 4 |
| Auth | NextAuth v5 (Google SSO, JWT sessions) |
| Database | Supabase Postgres + Storage |
| Billing | Stripe Checkout + webhooks + atomic credit RPC |
| AI Models | fal.ai (single SDK, 20+ models) |
| Video Export | FFmpeg.wasm (client) + server-side caption burn |
| TTS | Kokoro, ElevenLabs (via fal.ai) |
| Music | CassetteAI, Stable Audio 2.5 (via fal.ai) |
| Analytics | Vercel Analytics |
| Language | TypeScript throughout |
Contributions are welcome. Here's how to help:
- Report bugs — open an issue with the model name, error message, and scene config
- Add a model — add an entry to
MODEL_CATALOGintypes/index.tsand a builder inmodel-adapters.ts - Add a template — add to the
TEMPLATESarray intemplates.ts - Improve prompts — the prompt engine in
prompt-engine.tsdrives output quality. Better prompts = better ads. - Add a TTS voice or music provider — wire it in
audio.ts
# Dev setup
npm install
npm run dev # localhost:3000
npm run build # production buildSee .env.local.example for the full list. Required for OSS dev: none (falls back to BYOK fal.ai). Required for full hosted feature set:
# Auth
AUTH_GOOGLE_ID=
AUTH_GOOGLE_SECRET=
AUTH_SECRET=
# Supabase
NEXT_PUBLIC_SUPABASE_URL=
NEXT_PUBLIC_SUPABASE_ANON_KEY=
SUPABASE_SERVICE_ROLE_KEY=
# Stripe
STRIPE_SECRET_KEY=
STRIPE_WEBHOOK_SECRET=
STRIPE_PRICE_PREMIUM=
STRIPE_PRICE_PRO=
# (Agency tiers: STRIPE_PRICE_STARTER, STRIPE_PRICE_AGENCY_PRO, STRIPE_PRICE_ENTERPRISE — next release)
# Fal.ai (managed key for hosted users)
FAL_API_KEY=
The AI video space is moving too fast for any one company to own the whole stack. Models improve weekly. New providers launch monthly. An open-source storyboard layer that routes to the best model for each shot — and lets the community improve the prompt engineering — will outpace any closed tool.
CutAgent is the orchestration layer. The models are the instruments. You're the director.
MIT — use it, fork it, sell with it.
Built with fal.ai
Star this repo if you want to see AI video ads done right.