Lexi Cut is a desktop, text-based video editor. It transcribes footage, lets you edit by manipulating text, and keeps the timeline in sync with the underlying media. It also uses AI services to help with narrative ordering and visual descriptions.
- Import video/audio sources and transcribe them into words and sentences
- Edit by rearranging or excluding sentences and words
- Generate an initial assembly cut from AI analysis
- Attach visual descriptions to segments for faster browsing
- Export the resulting timeline
Lexi Cut is a Tauri desktop app with a React UI, a Rust backend, and an AI service layer. Data is persisted to local storage so projects can be re-opened and edited later.
flowchart LR
UI["React UI (Vite)"] --> Backend["Tauri Backend (Rust)"]
Backend --> AI["AI services"]
Backend --> Storage["Local Storage"]
- UI: React + Vite, with Remotion for playback and timeline rendering.
- Backend: Tauri commands for filesystem access, hashing, caching, and exports.
- AI services: ElevenLabs (transcription), Anthropic (assembly cut), Gemini (visual descriptions), Late (upload). See docs/AI_SERVICES.md.
- Persistence: Projects are stored under ~/Documents/Lexi Cut/ as JSON plus media metadata. See docs/project_save.md.
The end-to-end processing flow lives in src/api/processingPipeline.ts and runs when sources are added:
- CID pre-phase
  - Compute a content hash (CID) per source in parallel.
  - Wait for any existing background work tied to the same CID.
- Transcription (per source)
  - Read the file via Tauri (read_file_base64) and convert to a browser File.
  - Call ElevenLabs transcription (cache-aware via CID).
  - Map the transcript into word-level Word[].
  - Track sources with zero words as transcriptless (video-only).
- Grouping
  - Group words by source into SegmentGroup[] using sentence boundaries and duration limits.
  - Create a video-only group for transcriptless sources (0s -> duration).
- Optional visual descriptions
  - If VITE_GEMINI_API_KEY is set, extract frames and ask Gemini for time-ranged descriptions.
  - Attach descriptions to the source; failures are non-fatal and skipped.
- Sentence and timeline prep
  - Split words into sentence groups for timeline entries.
  - Create sentences for transcriptless sources and mark them as B-roll (no-speech).
- Ordering
  - Produce a chronological baseline order for groups.
  - Agentic assembly cut runs later in the UI to refine ordering.
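The grouping step can be pictured with a minimal sketch. The `Word` and `SegmentGroup` shapes below are simplified stand-ins (the real types live in src/types/index.ts), and `groupWords`, `videoOnlyGroup`, the punctuation-based sentence check, and the 30-second cap are all assumptions made for illustration, not the values used in src/api/processingPipeline.ts.

```typescript
// Hypothetical, simplified shapes; see src/types/index.ts for the real ones.
interface Word {
  text: string;
  start: number; // seconds
  end: number; // seconds
  sourceId: string;
}

interface SegmentGroup {
  sourceId: string;
  start: number;
  end: number;
  words: Word[];
}

// Assumed cap; the actual duration limit is not stated in this README.
const MAX_GROUP_SECONDS = 30;

// Assumption: a word ends a sentence when it ends in ., ! or ? (optionally quoted).
const endsSentence = (w: Word): boolean => /[.!?]["')\]]?$/.test(w.text);

function groupWords(words: Word[], maxSeconds: number = MAX_GROUP_SECONDS): SegmentGroup[] {
  const groups: SegmentGroup[] = [];
  let current: Word[] = [];

  const flush = (): void => {
    if (current.length === 0) return;
    groups.push({
      sourceId: current[0].sourceId,
      start: current[0].start,
      end: current[current.length - 1].end,
      words: current,
    });
    current = [];
  };

  for (const word of words) {
    // Never mix two sources in one group.
    if (current.length > 0 && current[0].sourceId !== word.sourceId) flush();
    // Close the group if this word would push it past the duration cap.
    if (current.length > 0 && word.end - current[0].start > maxSeconds) flush();
    current.push(word);
    // A sentence boundary always closes the group.
    if (endsSentence(word)) flush();
  }
  flush();
  return groups;
}

// Transcriptless sources get one video-only group spanning 0s -> duration.
function videoOnlyGroup(sourceId: string, duration: number): SegmentGroup {
  return { sourceId, start: 0, end: duration, words: [] };
}
```

Words are expected in source order with ascending timestamps. A group closes on a sentence boundary, a source change, or the duration cap, whichever comes first.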
A CID (content ID) is a deterministic hash of a source file’s bytes. It changes when the file changes, and stays stable when the file is identical. Lexi Cut generates CIDs locally through the Tauri backend and uses them to:
- Cache AI results like transcriptions and visual descriptions
- De-duplicate work across sessions and projects when the source file is unchanged
- Coordinate background processing and avoid duplicate in-flight work
- Cache-first behavior: transcription and description requests are keyed by CID, so repeated imports reuse cached results.
- Parallel CID computation: missing CIDs are computed concurrently before the pipeline starts.
- In-flight coordination: the pipeline waits for any background processing already running for the same CID before starting.
- Parallel visual descriptions: Gemini descriptions run concurrently per source and are skipped entirely when the API key is missing.
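One common way to get the in-flight coordination described above is to keep a map of pending promises keyed by CID. The sketch below shows the general pattern only; `InFlightRegistry` and its `run` method are hypothetical names, not part of the Lexi Cut codebase.

```typescript
// Hypothetical helper illustrating in-flight de-duplication by CID.
class InFlightRegistry<T> {
  private pending = new Map<string, Promise<T>>();

  // Joins work already running for this CID, or starts it exactly once.
  run(cid: string, work: () => Promise<T>): Promise<T> {
    const existing = this.pending.get(cid);
    if (existing) return existing;

    const started = work().finally(() => {
      // Clear the entry on success and on failure so a later retry can run.
      this.pending.delete(cid);
    });

    this.pending.set(cid, started);
    return started;
  }
}
```

Two callers that ask for the same CID while work is running receive the same promise, so the expensive call happens once. Combined with a persistent cache lookup before `run`, repeated imports of an unchanged file do no new work at all.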
Frame extraction is handled by a Tauri command (extract_frames_base64) and uses a hybrid strategy:
- Keyframes first: uses ffprobe to collect I-frame timestamps (natural scene boundaries).
- Gap filling: if gaps are larger than 5 seconds, fill at ~1 fps.
- Deduping: avoid frames closer than 1 second apart.
- Subsampling: cap to 60 frames by evenly distributing timestamps.
- Extraction: uses ffmpeg to export JPEGs at selected timestamps, returned as base64 strings with timestamps.
This keeps the Gemini payload small while still sampling visually meaningful frames.
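The timestamp selection can be sketched as a pure function. The actual logic lives in the Tauri backend, so this TypeScript version is only an illustration: `selectFrameTimestamps` is a hypothetical name, treating the clip start and end as gap boundaries is an assumption, and picking evenly spaced indices is one possible reading of "evenly distributing timestamps". The thresholds (5 s, ~1 fps, 1 s, 60 frames) come from the list above.

```typescript
const MAX_GAP_SECONDS = 5;
const FILL_INTERVAL_SECONDS = 1; // ~1 fps
const MIN_SPACING_SECONDS = 1;
const MAX_FRAMES = 60;
const EPSILON = 1e-9; // tolerance for floating-point spacing comparisons

function selectFrameTimestamps(keyframes: number[], duration: number): number[] {
  // 1. Keyframes first: keep those inside the clip, in ascending order.
  const sorted = keyframes.filter((t) => t >= 0 && t <= duration).sort((a, b) => a - b);

  // 2. Gap filling: between consecutive anchors, fill large gaps at ~1 fps.
  //    Assumption: the clip start (0) and end (duration) also bound gaps.
  const anchors = [0, ...sorted, duration];
  const candidates: number[] = [];
  for (let i = 0; i < anchors.length - 1; i++) {
    const from = anchors[i];
    const to = anchors[i + 1];
    if (i > 0) candidates.push(from); // anchors[1..] before the end are real keyframes
    if (to - from > MAX_GAP_SECONDS) {
      // Stop early enough that a fill frame never crowds the next keyframe.
      for (let t = from + FILL_INTERVAL_SECONDS; t + MIN_SPACING_SECONDS <= to; t += FILL_INTERVAL_SECONDS) {
        candidates.push(t);
      }
    }
  }

  // 3. Deduping: drop frames closer than 1 second to the last kept frame.
  const deduped: number[] = [];
  for (const t of candidates) {
    const last = deduped[deduped.length - 1];
    if (deduped.length === 0 || t - last >= MIN_SPACING_SECONDS - EPSILON) deduped.push(t);
  }

  // 4. Subsampling: cap at 60 frames by picking evenly spaced indices.
  if (deduped.length <= MAX_FRAMES) return deduped;
  const picked: number[] = [];
  for (let i = 0; i < MAX_FRAMES; i++) {
    const index = Math.round((i * (deduped.length - 1)) / (MAX_FRAMES - 1));
    picked.push(deduped[index]);
  }
  return picked;
}
```

For example, `selectFrameTimestamps([0, 2, 2.5, 10], 12)` drops the keyframe at 2.5 s (too close to 2 s) and fills the gap up to 10 s at one-second steps.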
Core data types are defined in src/types/index.ts:
- Source: media file metadata, optional cid, and optional descriptions.
- Word: word-level transcript timing and confidence.
- Sentence: grouped words used as the primary editable unit.
- SegmentGroup: grouped words for assembly cut analysis.
- Timeline / TimelineEntry: serialized edit state.
- SourceDescription: time-ranged visual descriptions for a source.
- ProcessingProgress: progress updates for the pipeline.
- BrollClassification: flags for transcriptless or non-narrative content.
Lexi Cut caches expensive AI results using a local SQLite database (cache.db) in the Tauri app data directory.
- Key: CID + data_type (CID is a hash of the source file contents)
- Data types: transcription and descriptions
- Purpose: Avoid re-transcribing or re-describing the same media if the file is unchanged
- Implementation: src/api/cache.ts wraps Tauri commands (get_cached, set_cached) backed by src-tauri/src/services/cache_db.rs
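A cache-first lookup over these two commands might look like the sketch below. Only the command names `get_cached` and `set_cached` come from this README; the argument names (`cid`, `dataType`, `data`), the JSON-string payload, and every function name are assumptions for illustration. The `invoke` function is injected so the example runs without Tauri, and `createMemoryInvoke` is an in-memory stand-in for the SQLite-backed commands.

```typescript
type DataType = "transcription" | "descriptions";

// Modeled on the shape of Tauri's invoke(command, args); injected for testability.
type Invoke = (command: string, args: Record<string, unknown>) => Promise<unknown>;

// Assumed argument names; check cache_db.rs for the real command signatures.
async function getCached<T>(invoke: Invoke, cid: string, dataType: DataType): Promise<T | null> {
  const raw = await invoke("get_cached", { cid, dataType });
  return typeof raw === "string" ? (JSON.parse(raw) as T) : null;
}

async function setCached<T>(invoke: Invoke, cid: string, dataType: DataType, value: T): Promise<void> {
  await invoke("set_cached", { cid, dataType, data: JSON.stringify(value) });
}

// Cache-first: return the stored result if present, otherwise compute and store it.
async function cachedOrCompute<T>(
  invoke: Invoke,
  cid: string,
  dataType: DataType,
  compute: () => Promise<T>,
): Promise<T> {
  const hit = await getCached<T>(invoke, cid, dataType);
  if (hit !== null) return hit;
  const fresh = await compute();
  await setCached(invoke, cid, dataType, fresh);
  return fresh;
}

// In-memory stand-in for the backend, keyed by CID + data type.
function createMemoryInvoke(): Invoke {
  const rows = new Map<string, string>();
  return async (command, args) => {
    const key = `${String(args["cid"])}:${String(args["dataType"])}`;
    if (command === "get_cached") return rows.get(key) ?? null;
    if (command === "set_cached") {
      rows.set(key, String(args["data"]));
      return null;
    }
    throw new Error(`unknown command: ${command}`);
  };
}
```

Because the key combines the CID with the data type, a cached transcription never satisfies a request for descriptions of the same file.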
- Node.js (LTS) and a package manager (pnpm recommended, npm supported)
- Rust toolchain (stable) and the Tauri CLI
- React + Vite UI
- Remotion for video composition and playback
- Zustand for app state
- Tauri plugins for HTTP, dialog, and opener
Copy .env.example to .env and fill in keys:
- VITE_ELEVENLABS_API_KEY
- VITE_ANTHROPIC_API_KEY
- VITE_GEMINI_API_KEY
- VITE_LATE_API_KEY
See docs/AI_SERVICES.md for API details and how each service is used.
Install dependencies:
pnpm install
Start the desktop app (Tauri):
pnpm tauri dev
Build the desktop app:
pnpm tauri build
You can also run the UI in a browser (limited, no native file access):
pnpm dev
- Network calls use the Tauri HTTP plugin to avoid CORS issues in the webview.
- Project data and exports are stored locally; see docs/project_save.md for structure.
