Japanese video/audio processing pipeline with transcription, translation, and subtitle generation.
- 🎙️ Transcribe Japanese audio using OpenAI GPT-4o
- 🌏 Translate Japanese to Chinese using X.AI Grok-3
- 📝 Generate aligned bilingual subtitles with stable-ts (Whisper large-v3)
- 🎬 Create videos with embedded subtitles using ffmpeg
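The end product of the pipeline is a bilingual SRT file with each Japanese line stacked above its Chinese translation. As an illustration of that format (a hypothetical sketch, not the project's actual code), here is how one such cue can be composed:

```python
# Hypothetical sketch: composing a single bilingual SRT cue.
# Function names are illustrative, not from the project's scripts.

def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, e.g. 3.5 -> '00:00:03,500'."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def bilingual_cue(index: int, start: float, end: float, ja: str, zh: str) -> str:
    """One SRT entry: index, timing line, Japanese line, Chinese line."""
    return (
        f"{index}\n"
        f"{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
        f"{ja}\n{zh}\n"
    )

print(bilingual_cue(1, 0.0, 2.4, "こんにちは", "你好"))
```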
## Usage

```bash
# Process all media files in input directory
bash main/main.sh ./main/input ./main/output

# Debug mode with verbose output
DEBUG=1 bash main/main.sh ./main/input ./main/output
```

## Requirements

```bash
pip install openai stable-whisper numpy numba
```

- ffmpeg
- stable-ts
- OpenAI API key (configure in `main/scripts/transcribe.py`)
- X.AI API key (configure in `main/scripts/translate_x.py`)
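The scripts above expect the keys to be configured in the files themselves. A common alternative (an assumption here, not the project's documented setup) is to read them from environment variables so keys never land in version control:

```python
import os

# Hypothetical sketch: reading an API key from the environment instead of
# hard-coding it in the script. This is an assumption, not the project's setup.
def get_api_key(var_name: str) -> str:
    key = os.environ.get(var_name, "")
    if not key:
        raise RuntimeError(f"Set {var_name} before running the pipeline")
    return key
```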
## Supported Formats

- Video: mp4, avi, mkv, mov, webm
- Audio: mp3, wav
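A batch runner that walks the input directory would need to filter on these extensions. A minimal sketch of such a check (the helper name and sets are illustrative, mirroring the list above):

```python
from pathlib import Path

# Hypothetical helper mirroring the supported-format list above.
VIDEO_EXTS = {".mp4", ".avi", ".mkv", ".mov", ".webm"}
AUDIO_EXTS = {".mp3", ".wav"}

def is_supported(path: str) -> bool:
    """True if the file has a supported video or audio extension."""
    return Path(path).suffix.lower() in VIDEO_EXTS | AUDIO_EXTS
```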
## Output

For each processed file, outputs are created in `<output_dir>/<filename>/`:

- `audio.wav` - Extracted audio
- `ja_lines.txt` - Transcribed Japanese text
- `translations.json` - Chinese translations
- `aligned_ja.srt` - Time-aligned Japanese subtitles
- `subtitles.srt` - Final bilingual subtitles
- `final_with_subtitles.mp4` - Video with embedded subtitles
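The final `subtitles.srt` is presumably produced by pairing each cue in `aligned_ja.srt` with its entry in `translations.json`. A sketch of that merge step, assuming (this is an assumption, not the project's documented format) that the JSON maps each Japanese line to its Chinese translation:

```python
import re

# Hypothetical sketch of the merge step: pair each cue in aligned_ja.srt
# with its Chinese translation from a {japanese: chinese} mapping.
CUE_RE = re.compile(r"(\d+)\n([\d:,]+ --> [\d:,]+)\n(.+?)\n\n", re.S)

def merge(aligned_srt: str, translations: dict) -> str:
    """Append the Chinese line under each Japanese cue."""
    out = []
    for idx, timing, ja in CUE_RE.findall(aligned_srt):
        zh = translations.get(ja.strip(), "")
        out.append(f"{idx}\n{timing}\n{ja.strip()}\n{zh}\n")
    return "\n".join(out)
```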
## Manual Subtitle Editing

After processing, you can manually edit the subtitles:

- Edit `subtitles_editable.srt` in the output directory
- Copy it back: `cp subtitles_editable.srt subtitles.srt`
- Re-encode the video: `bash ffmpeg_command.txt`
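The re-encode step burns the edited subtitles into the video, which is typically done with ffmpeg's `subtitles` filter. A sketch of what such a command might look like, built here as an argument list (paths and the function name are illustrative; `ffmpeg_command.txt` in the output directory contains the project's actual command):

```python
import subprocess

# Sketch of a subtitle burn-in command using ffmpeg's `subtitles` filter.
# The exact command the pipeline writes to ffmpeg_command.txt may differ.
def burn_subtitles_cmd(video: str, srt: str, out: str) -> list:
    return [
        "ffmpeg", "-y",
        "-i", video,
        "-vf", f"subtitles={srt}",  # render the SRT onto the video frames
        "-c:a", "copy",             # keep the audio stream untouched
        out,
    ]

# To actually run it:
# subprocess.run(burn_subtitles_cmd("input.mp4", "subtitles.srt",
#                                   "final_with_subtitles.mp4"), check=True)
```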
## License

MIT