EchoForge is a Python CLI that turns audio into structured local knowledge artifacts.
The pipeline covers three steps:
- Fetch or stage audio from Feishu Minutes or a local file
- Submit the audio to a supported ASR provider for transcription and meeting understanding
- Render the results into Obsidian Markdown notes
The Python implementation under src/echoforge/ is the current pipeline. It has been verified end-to-end with Tingwu + R2 transit, and it also supports Doubao ASR.
Recent additions:
- Dual ASR provider support: Switch between Tingwu and Doubao via ECHOFORGE_UNDERSTANDING_PROVIDER
- Gemini post-processing: Optional summarization, chapter extraction, Q&A, and action items generated from the transcript markdown
- Long-audio segmentation: Files longer than 119 minutes are automatically split at silence points and processed in parallel
- State persistence: All runs are tracked in outputs/runs.json, with list-runs and inspect-run commands
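The long-audio segmentation mentioned above can be illustrated with a small planning function. This is a minimal sketch, not EchoForge's actual implementation: the name plan_segments, the idea of receiving silence points as a list of timestamps in seconds, and the greedy "latest silence within the limit" strategy are all assumptions made for illustration. Only the 119-minute limit comes from this README.

```python
from typing import List, Tuple

# Limit taken from the README: files longer than 119 minutes are split.
MAX_SEGMENT_SECONDS = 119 * 60


def plan_segments(
    duration: float,
    silences: List[float],
    max_len: float = MAX_SEGMENT_SECONDS,
) -> List[Tuple[float, float]]:
    """Return (start, end) pairs covering [0, duration].

    Hypothetical helper: each cut is placed at the latest silence point
    that keeps the segment within max_len. If no silence point is
    available in that window, the segment is cut exactly at the limit.
    """
    cuts = sorted(s for s in silences if 0 < s < duration)
    segments = []
    start = 0.0
    while duration - start > max_len:
        limit = start + max_len
        candidates = [s for s in cuts if start < s <= limit]
        end = candidates[-1] if candidates else limit
        segments.append((start, end))
        start = end
    segments.append((start, duration))
    return segments
```

Each resulting pair could then be handed to a separate transcription task; how EchoForge schedules those tasks in parallel is not shown here.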
EchoForge also works with Feishu Minutes exports that already contain transcript and summary data. When feishu_minutes_sync exports a minute from the web page, EchoForge can render the standardized artifacts directly without sending the audio to a third-party ASR provider again.
One important integration constraint comes from the current provider APIs:
- Offline transcription tasks require a public FileUrl
- Local file paths are staged into outputs/, and the provider still needs an externally reachable HTTP or HTTPS URL
That means process-file is implemented, but you must provide --media-url unless your source already exposes a downloadable URL.
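A quick shape check on the value passed to --media-url can catch the most common mistake, which is handing the provider a local path. The sketch below is an assumption for illustration (the function name is_public_media_url is hypothetical) and only validates the URL's form. It does not verify that the provider can actually reach the URL.

```python
from urllib.parse import urlparse


def is_public_media_url(url: str) -> bool:
    """Return True if url looks like an absolute HTTP or HTTPS URL.

    Hypothetical helper: this checks the scheme and host only. It does
    not make a network request, so a URL behind a firewall still passes.
    """
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)
```

Local paths such as ./recording.ogg and file:// URLs are rejected by this check, which matches the constraint that the provider needs an externally reachable HTTP or HTTPS URL.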
EchoForge uses the dual-file structure settled on 2026-04-16:
- Summary note: meetings/{date}-{title}.md
- Transcript note: meetings/Transcripts/{date}-{title}-transcript.md
- Vault index: EchoForge Index.md
The summary note links into transcript block anchors, so different upstream providers can share one rendering format.
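The naming scheme and the summary-to-transcript links can be sketched as follows. The helper names note_paths and transcript_link are hypothetical, and the block ID format is an assumption. The link itself uses Obsidian's block reference syntax, [[note#^block-id]].

```python
from pathlib import PurePosixPath
from typing import Tuple


def note_paths(date: str, title: str) -> Tuple[PurePosixPath, PurePosixPath]:
    """Build vault-relative paths for the summary and transcript notes.

    Follows the patterns listed in this README. No sanitizing of the
    title is done here; a real implementation would likely need it.
    """
    summary = PurePosixPath("meetings") / f"{date}-{title}.md"
    transcript = (
        PurePosixPath("meetings") / "Transcripts" / f"{date}-{title}-transcript.md"
    )
    return summary, transcript


def transcript_link(date: str, title: str, block_id: str) -> str:
    """Return an Obsidian wikilink to a block anchor in the transcript note.

    block_id is a hypothetical anchor name such as "seg-12"; the
    transcript note must contain a matching ^seg-12 marker.
    """
    return f"[[{date}-{title}-transcript#^{block_id}]]"
```

Because the link targets a block anchor rather than provider-specific content, the same summary template works regardless of which upstream produced the transcript.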
EchoForge renders from a normalized artifact layer. Current standard files are:
- transcription.json
- chapters.json
- summarization.json
- meeting_assistance.json
Different upstream sources can map into the same shape:
- Tingwu returns all four categories directly
- Doubao returns transcription, chapters, summarization, and information extraction URLs
- Feishu Minutes web export now produces:
  - transcript.vtt
  - transcription.json
  - summarization.json
  - chapters.json
This keeps the final Obsidian output stable across providers.
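Since not every source produces all four standard files (the Feishu Minutes web export, for example, has no meeting_assistance.json), a renderer needs to know which artifacts are present. The following is a minimal sketch under that assumption; missing_artifacts is a hypothetical helper, not part of the EchoForge API.

```python
from pathlib import Path
from typing import List, Union

# The four standard files named in this README.
STANDARD_ARTIFACTS = (
    "transcription.json",
    "chapters.json",
    "summarization.json",
    "meeting_assistance.json",
)


def missing_artifacts(results_dir: Union[str, Path]) -> List[str]:
    """List the standard artifact files that are absent from results_dir.

    Hypothetical helper: it only checks for file presence and does not
    validate the JSON content of the files that exist.
    """
    directory = Path(results_dir)
    return [name for name in STANDARD_ARTIFACTS if not (directory / name).is_file()]
```

A renderer could use the returned list to skip the corresponding note sections instead of failing.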
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
Copy .env.example to .env and fill in at least:
# ASR provider selection: tingwu | doubao
ECHOFORGE_UNDERSTANDING_PROVIDER=tingwu
# Tingwu
TINGWU_ACCESS_KEY_ID=...
TINGWU_ACCESS_KEY_SECRET=...
TINGWU_APP_KEY=...
# Doubao (when provider = doubao)
DOUBAO_APP_KEY=...
DOUBAO_ACCESS_KEY=...
# Obsidian
OBSIDIAN_VAULT_PATH=/path/to/vault
# Optional: Gemini post-processing
GEMINI_API_KEY=...
GEMINI_BASE_URL=https://generativelanguage.googleapis.com
GEMINI_MODEL=gemini-2.0-flash
GEMINI_ENABLE_SUMMARY=true
Optional Feishu settings:
FEISHU_MINUTES_SYNC_BIN=feishu-minutes-sync
FEISHU_MINUTES_SYNC_EXPORTS_DIR=./exports
Feishu Minutes web export workflow:
- Use feishu_minutes_sync export-minute --token <minute_token> --fetch-mode web
- Exported files land under exports/<minute_token>/
- Render the transcript directly with render-transcript, or feed the normalized JSON files into the full EchoForge renderer
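Before running the pipeline, it can help to confirm that the .env settings listed earlier are complete for the selected provider. The sketch below is an illustration only: missing_settings is a hypothetical helper, and treating an unset provider as an error (rather than falling back to a default) is an assumption, since this README does not state a default. The variable names come from the settings above.

```python
from typing import List, Mapping

# Required variables per provider, as listed in this README.
REQUIRED_BY_PROVIDER = {
    "tingwu": (
        "TINGWU_ACCESS_KEY_ID",
        "TINGWU_ACCESS_KEY_SECRET",
        "TINGWU_APP_KEY",
    ),
    "doubao": ("DOUBAO_APP_KEY", "DOUBAO_ACCESS_KEY"),
}


def missing_settings(env: Mapping[str, str]) -> List[str]:
    """Return the names of required settings that are unset or empty.

    Hypothetical helper. Raises ValueError when the provider value is
    not one of the two documented options.
    """
    provider = env.get("ECHOFORGE_UNDERSTANDING_PROVIDER", "").strip().lower()
    if provider not in REQUIRED_BY_PROVIDER:
        raise ValueError(f"unsupported provider: {provider!r}")
    names = REQUIRED_BY_PROVIDER[provider] + ("OBSIDIAN_VAULT_PATH",)
    return [name for name in names if not env.get(name)]
```

Passing os.environ (after the .env file has been loaded by whatever mechanism the project uses) would check the live environment.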
# Process audio
python -m echoforge process-feishu <minute_token>
python -m echoforge process-file ./recording.ogg --media-url https://example.com/recording.ogg
# Re-render an existing run
python -m echoforge render <run_id>
# Render a standalone transcript JSON
python -m echoforge render-transcript ./transcription.json --title "Imported transcript" --output-vault ~/Obsidian/vault
# State management
python -m echoforge list-runs
python -m echoforge list-runs --status failed
python -m echoforge inspect-run <run_id>
Render a transcript from a Feishu Minutes standardized export:
python -m echoforge render-transcript \
../feishu_minutes_sync/exports/<minute_token>/transcription.json \
--title "Meeting title" \
--output-vault ~/Obsidian/vault \
--note-name imported-transcript \
--source-label "Feishu Minutes WEBVTT"
Use python -m echoforge --help for the full CLI.
EchoForge/
├── config/
├── outputs/
│ └── runs.json
├── src/echoforge/
└── tests/
Typical run artifacts:
outputs/runs/run_<timestamp>_<hash>/
├── run.json
├── media.ogg
└── results/
├── transcription.json
├── chapters.json
├── summarization.json
└── meeting_assistance.json
pytest