Merged
114 changes: 97 additions & 17 deletions .claude/commands/yt2pdf.md
@@ -39,22 +39,26 @@ If `hqdefault.jpg` fails, try `mqdefault.jpg`. If both fail, continue without th

## Step 3: Fetch Metadata & Extract Transcript

First, fetch video metadata (title, publish date, channel name) via yt-dlp:
First, fetch video metadata (title, publish date, channel name, language) via yt-dlp:

```bash
yt-dlp --dump-json --skip-download "https://youtube.com/watch?v=VIDEO_ID" 2>/dev/null | python3 -c "
import json,sys; d=json.load(sys.stdin)
print(json.dumps({'title':d.get('title',''),'uploader':d.get('uploader',''),'upload_date':d.get('upload_date',''),'duration':d.get('duration',0)}))"
print(json.dumps({'title':d.get('title',''),'uploader':d.get('uploader',''),'upload_date':d.get('upload_date',''),'duration':d.get('duration',0),'language':d.get('language','en')}))"
```

Parse the JSON to get: title, uploader, upload_date (YYYYMMDD → YYYY-MM-DD).
Parse the JSON to get: title, uploader, upload_date (YYYYMMDD → YYYY-MM-DD), language.
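
The parsing step above could be sketched as follows. The helper names `format_upload_date` and `normalize_metadata` are hypothetical and not part of this PR. The sketch assumes the `language` field may be absent, null, or empty, and falls back to `en` in all three cases.

```python
from datetime import datetime


def format_upload_date(upload_date: str) -> str:
    """Convert a YYYYMMDD string to YYYY-MM-DD; return '' if missing or malformed."""
    try:
        return datetime.strptime(upload_date, "%Y%m%d").strftime("%Y-%m-%d")
    except (ValueError, TypeError):
        return ""


def normalize_metadata(meta: dict) -> dict:
    """Pick the fields the summary needs from the JSON printed by the command above."""
    return {
        "title": meta.get("title", ""),
        "uploader": meta.get("uploader", ""),
        "published": format_upload_date(meta.get("upload_date", "")),
        # `or` also covers a key that is present but null or empty (assumption).
        "language": meta.get("language") or "en",
    }


if __name__ == "__main__":
    sample = {"title": "Demo", "uploader": "Someone", "upload_date": "20260404", "language": None}
    print(normalize_metadata(sample))
```

Using `or "en"` rather than a `.get()` default is a deliberate choice here: a `.get('language', 'en')` default only applies when the key is absent, not when its value is null.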

Then extract transcript:
Then extract the transcript with timestamps, requesting the video's original language:

```bash
python3 scripts/yt/get_transcript.py VIDEO_ID
python3 scripts/yt/get_transcript.py VIDEO_ID --lang LANGUAGE --timestamps
```

- Use the `language` field from metadata (e.g. `zh-tw`, `en`, `ja`). If missing, default to `en`.
- The `--timestamps` flag preserves `[MM:SS]` markers every ~30 seconds for timestamp links in the summary.
- If the transcript starts with `[NO_TIMESTAMPS]`, timestamps are unavailable (Whisper fallback) — skip timestamp links in Step 4.

Capture stdout as the transcript text. If the command fails, reply with the error and stop.

Update progress: "Transcript extracted (N chars). Generating summary..."
@@ -66,9 +70,15 @@ Using the transcript, generate markdown summary file(s) in `output/youtube/YYYY-
**Important**:

- Each summary MUST include the thumbnail image reference (`thumb.jpg` — embedded as base64 in PDF automatically)
- Include metadata: title, YouTube link, published date, uploader/publisher
- ALL metadata fields are **required**: title, YouTube link, published date, uploader/publisher, tags
- Include 3-5 **tags**: lowercase English topic tags covering companies (e.g. nvidia, openai), technologies (e.g. inference, rag), categories (e.g. policy, research, product, open-source)

**Timestamp links**: The transcript contains `[MM:SS]` markers. Convert them to YouTube timestamp links in the summary:

- `[MM:SS]` → `[[MM:SS](https://youtube.com/watch?v=VIDEO_ID&t=TOTAL_SECONDS)]` where TOTAL_SECONDS = minutes × 60 + seconds
- Place timestamps at natural topic boundaries — aim for 5-15 per summary, not every marker
- If the transcript starts with `[NO_TIMESTAMPS]`, omit all timestamp links
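
The conversion rule above can be illustrated with a short sketch. The function names `has_timestamps` and `link_timestamps` are hypothetical. In the actual workflow the model chooses which 5-15 markers to keep; this sketch only shows the arithmetic and the link format, and converts every marker it finds.

```python
import re

# Matches markers such as [02:05]; minutes may exceed two digits on long videos.
MARKER = re.compile(r"\[(\d{2,}):(\d{2})\]")


def has_timestamps(transcript: str) -> bool:
    """False when the transcript carries the [NO_TIMESTAMPS] sentinel."""
    return not transcript.startswith("[NO_TIMESTAMPS]")


def link_timestamps(text: str, video_id: str) -> str:
    """Rewrite each [MM:SS] marker as a YouTube timestamp link."""

    def repl(match: re.Match) -> str:
        minutes, seconds = int(match.group(1)), int(match.group(2))
        total = minutes * 60 + seconds  # TOTAL_SECONDS = minutes x 60 + seconds
        label = f"{match.group(1)}:{match.group(2)}"
        return f"[[{label}](https://youtube.com/watch?v={video_id}&t={total})]"

    return MARKER.sub(repl, text)


if __name__ == "__main__":
    print(link_timestamps("[02:05] intro", "VIDEO_ID"))
```

For example, `[02:05]` maps to `t=125`, since 2 × 60 + 5 = 125.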

### If lang includes `en` → write `output/youtube/YYYY-MM-DD/VIDEO_ID/summary_en.md`

Each metadata field MUST be on its own line. Use HTML `<br>` line breaks to ensure they render separately in the PDF:
@@ -80,17 +90,52 @@ Each metadata field MUST be on its own line. Use HTML `<br>` line breaks to ensu

**Publisher**: Uploader Name<br>
**Source**: [Watch on YouTube](https://youtube.com/watch?v=VIDEO_ID)<br>
**Published**: 2026-04-04<br>
**Published**: YYYY-MM-DD<br>
**Tags**: tag1, tag2, tag3, tag4, tag5

## Summary

3-5 paragraph English summary covering:
- Key points and main arguments
- Notable quotes or data points
- Conclusions and takeaways
### 🔍 Section Title [[MM:SS](url&t=N)]

- Key concept or argument
- Supporting detail, quote, or data point

### 📌 Another Section [[MM:SS](url&t=N)]

#### Sub-topic (when section has multiple distinct points)

1. Enumerated item one
2. Enumerated item two

- Practical takeaway or example

### 💡 Conclusion / Key Takeaways

- Takeaway 1
- Takeaway 2

🔗 [Watch the full video](https://youtube.com/watch?v=VIDEO_ID)
```

**Structure requirements:**

- Generate a detailed, structured summary (**800-1200 words**)
- Use `##` for the main "Summary" heading
- Use `###` for each major topic/section, prefixed with a relevant emoji
- Use `####` for sub-topics when a section has multiple distinct points
- Use bullet points (`-`) for key concepts, practical takeaways, notable quotes
- Use numbered lists (`1.`) for sequential or enumerated content (e.g. "3 steps", "4 failure modes")
- Include YouTube timestamp links at key section headings and significant points
- End with a `### 💡 Conclusion` or `### 💡 Key Takeaways` section
- Add a video link at the bottom

**Content requirements:**

- Cover ALL major topics discussed, not just the first few
- Include specific examples, data points, and direct quotes when present
- Preserve technical terms and proper nouns accurately
- Each `###` section should have 2-4 bullet points minimum

### If lang includes `zh-tw` → write `output/youtube/YYYY-MM-DD/VIDEO_ID/summary_zh-tw.md`

```markdown
@@ -100,18 +145,53 @@ Each metadata field MUST be on its own line. Use HTML `<br>` line breaks to ensu

**頻道**: Uploader Name<br>
**來源**: [在 YouTube 上觀看](https://youtube.com/watch?v=VIDEO_ID)<br>
**發布日期**: 2026-04-04<br>
**發布日期**: YYYY-MM-DD<br>
**標籤**: tag1, tag2, tag3, tag4, tag5

## 摘要

3-5 段繁體中文摘要,涵蓋:
- 重點與主要論點
- 值得注意的引用或數據
- 結論與要點
### 🔍 段落標題 [[MM:SS](url&t=N)]

- 核心概念或論點
- 具體範例、數據或引述

### 📌 另一段落 [[MM:SS](url&t=N)]

#### 子主題(當段落有多個重點時)

1. 列舉項目一
2. 列舉項目二

- 實踐方式或關鍵心得

### 💡 總結 / 重點整理

- 要點一
- 要點二

🔗 [觀看完整影片](https://youtube.com/watch?v=VIDEO_ID)
```

**Important**: Use Traditional Chinese (繁體中文) only. Never use Simplified Chinese.
**結構要求:**

- 生成詳細的結構化摘要(**800-1200 字**)
- 使用 `##` 作為「摘要」主標題
- 使用 `###` 標示每個主要主題,前方加上相關 emoji
- 使用 `####` 標示子主題(當一個段落有多個重點時)
- 使用項目符號(`-`)列出核心概念、實踐方式、關鍵引述
- 使用編號列表(`1.`)呈現有先後順序或列舉性質的內容
- 在關鍵段落標題與重要論點旁加入 YouTube 時間戳連結
- 以 `### 💡 總結` 或 `### 💡 重點整理` 作為結尾段落
- 底部加上影片連結

**內容要求:**

- 涵蓋影片中討論的**所有**主要主題,不只前幾個
- 保留具體範例、數據與直接引述
- 專有名詞保持原文(可加中文說明)
- 每個 `###` 段落至少包含 2-4 個項目符號

**務必使用繁體中文,嚴禁簡體中文。所有元資料欄位皆為必填。**

Write file(s) using the Write tool.

Binary file added docs/screenshots/telegram/yt2pdf_reply_zh-tw.png
8 changes: 6 additions & 2 deletions external_plugins/telegram-channel/server.ts
@@ -15,6 +15,9 @@
* Telegram's Bot API has no history or search. Reply-only tools.
*/

// Fork identifier — shown in startup log to distinguish from official plugin.
const FORK_TAG = 'claude-code-channels'

import { Server } from '@modelcontextprotocol/sdk/server/index.js'
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js'
import {
@@ -625,6 +628,7 @@ mcp.setRequestHandler(CallToolRequestSchema, async req => {
sentIds.length === 1
? `sent (id: ${sentIds[0]})`
: `sent ${sentIds.length} parts (ids: ${sentIds.join(', ')})`

return { content: [{ type: 'text', text: result }] }
}
case 'react': {
@@ -740,7 +744,7 @@ bot.command('status', async ctx => {

if (access.allowFrom.includes(senderId)) {
const name = from.username ? `@${from.username}` : senderId
await ctx.reply(`Paired as ${name}.`)
await ctx.reply(`Paired as ${name}.\nFork: ${FORK_TAG}`)
return
}

@@ -1121,7 +1125,7 @@ void (async () => {
await bot.start({
onStart: info => {
botUsername = info.username
process.stderr.write(`telegram channel: polling as @${info.username}\n`)
process.stderr.write(`telegram channel: polling as @${info.username} [${FORK_TAG}]\n`)
void bot.api.setMyCommands(
[
{ command: 'start', description: 'Welcome and setup guide' },
27 changes: 15 additions & 12 deletions lib/sessions/commands.ts
@@ -1,7 +1,7 @@
/**
* Session bot commands — parsed and executed at the broker layer.
*
* Commands are prefixed with `/session` and handled without LLM invocation.
* Commands are prefixed with `/memory` (legacy `/session` also accepted) and handled without LLM invocation.
*/

import { existsSync } from 'fs'
@@ -19,12 +19,15 @@ export interface SessionCommand {
args: string[]
}

/** Parse a /session command from message text. Returns null if not a session command. */
/** Parse a /memory (or legacy /session) command from message text. Returns null if not a match. */
export function parseSessionCommand(text: string): SessionCommand | null {
const trimmed = text.trim()
if (!trimmed.startsWith('/session')) return null
let prefix: string
if (trimmed.startsWith('/memory')) prefix = '/memory'
else if (trimmed.startsWith('/session')) prefix = '/session'
else return null

const parts = trimmed.slice('/session'.length).trim().split(/\s+/)
const parts = trimmed.slice(prefix.length).trim().split(/\s+/)
const action = parts[0] ?? 'help'
const args = parts.slice(1)

@@ -97,7 +100,7 @@ function handleProfile(stateDir: string, userId: string): string {

function handleForget(stateDir: string, args: string[]): string {
const slug = args[0]
if (!slug) return 'Usage: /session forget <topic-slug>'
if (!slug) return 'Usage: /memory forget <topic-slug>'
deleteTopic(stateDir, slug)
return `Topic "${slug}" has been deleted.`
}
@@ -116,12 +119,12 @@ function handleHelp(): string {
return [
'Session Commands',
'────────────────',
'/session status — Show session stats',
'/session clear — Clear your short-term memory',
'/session clear all — Clear all your data (STM + LTM)',
'/session profile — Show your stored profile',
'/session forget <topic> — Delete a topic note',
'/session export — Export your data',
'/session help — Show this help',
'/memory status — Show session stats',
'/memory clear — Clear your short-term memory',
'/memory clear all — Clear all your data (STM + LTM)',
'/memory profile — Show your stored profile',
'/memory forget <topic> — Delete a topic note',
'/memory export — Export your data',
'/memory help — Show this help',
].join('\n')
}
82 changes: 70 additions & 12 deletions scripts/yt/get_transcript.py
@@ -52,6 +52,10 @@ def download_subtitles(video_url: str, out_dir: Path, lang: str = "en") -> Path
)
if srt_path.exists() and srt_path.stat().st_size > 0:
return srt_path
# Glob fallback — yt-dlp may normalize lang codes differently
for candidate in out_dir.glob("video.*.srt"):
if candidate.stat().st_size > 0:
return candidate

# Try auto-generated subs
subprocess.run(
@@ -73,6 +77,9 @@ def download_subtitles(video_url: str, out_dir: Path, lang: str = "en") -> Path
)
if srt_path.exists() and srt_path.stat().st_size > 0:
return srt_path
for candidate in out_dir.glob("video.*.srt"):
if candidate.stat().st_size > 0:
return candidate

return None

@@ -93,6 +100,34 @@ def srt_to_text(srt_path: Path) -> str:
return "\n".join(lines)


def srt_to_timestamped_text(srt_path: Path, interval: int = 30) -> str:
"""Convert SRT to text with [MM:SS] markers every ~interval seconds."""
content = srt_path.read_text(encoding="utf-8")
result = []
last_emitted = -interval # emit on first entry
current_ts_seconds = 0

for line in content.splitlines():
line = line.strip()
if not line or re.match(r"^\d+$", line):
continue
ts_match = re.match(r"(\d{2}):(\d{2}):(\d{2}),\d{3}\s*-->", line)
if ts_match:
h, m, s = int(ts_match.group(1)), int(ts_match.group(2)), int(ts_match.group(3))
current_ts_seconds = h * 3600 + m * 60 + s
continue
# Text line — prepend timestamp marker if interval elapsed
if current_ts_seconds - last_emitted >= interval:
mm = current_ts_seconds // 60
ss = current_ts_seconds % 60
result.append(f"[{mm:02d}:{ss:02d}] {line}")
last_emitted = current_ts_seconds
else:
result.append(line)

return "\n".join(result)
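
To see what the function above produces, here is a condensed string-based variant run on a made-up three-cue SRT sample. The name `timestamped` and the sample text are assumptions for illustration only. The logic is intended to mirror `srt_to_timestamped_text`, except that it takes a string instead of a `Path`.

```python
import re

# Start of an SRT cue timing line, e.g. "00:00:12,500 --> 00:00:15,000".
TS_LINE = re.compile(r"(\d{2}):(\d{2}):(\d{2}),\d{3}\s*-->")


def timestamped(srt: str, interval: int = 30) -> str:
    """Return caption text with a [MM:SS] marker roughly every `interval` seconds."""
    out, last_emitted, now = [], -interval, 0
    for line in srt.splitlines():
        line = line.strip()
        if not line or line.isdigit():  # blank lines and cue index lines
            continue
        m = TS_LINE.match(line)
        if m:
            h, mi, s = (int(g) for g in m.groups())
            now = h * 3600 + mi * 60 + s
            continue
        if now - last_emitted >= interval:
            out.append(f"[{now // 60:02d}:{now % 60:02d}] {line}")
            last_emitted = now
        else:
            out.append(line)
    return "\n".join(out)


SAMPLE = """1
00:00:01,000 --> 00:00:04,000
Welcome to the show.

2
00:00:12,500 --> 00:00:15,000
Today we cover caching.

3
00:00:35,000 --> 00:00:38,000
First, the basics.
"""

if __name__ == "__main__":
    print(timestamped(SAMPLE))
    # [00:01] Welcome to the show.
    # Today we cover caching.
    # [00:35] First, the basics.
```

The second cue gets no marker because only 11 seconds have passed since the last one. Note that minutes are not wrapped at 60, so a cue at 1:02:05 is rendered as `[62:05]`, and a caption line consisting only of digits is skipped along with the cue index lines.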


def whisper_transcribe_hf(video_url: str, out_dir: Path) -> str | None:
"""Download audio and transcribe via HuggingFace Inference API."""
hf_token = os.environ.get("HF_TOKEN")
@@ -163,28 +198,43 @@ def whisper_transcribe_hf(video_url: str, out_dir: Path) -> str | None:
return None


def get_transcript(video_id: str) -> str | None:
LANG_FALLBACKS = {
"zh-tw": ["zh-Hant", "zh-TW", "zh", "en"],
"zh": ["zh-Hans", "zh-CN", "zh", "en"],
"ja": ["ja", "en"],
"ko": ["ko", "en"],
}
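
How this table resolves to an ordered list of subtitle languages can be sketched in isolation. The helper name `langs_to_try` is hypothetical. It wraps the same expression that `get_transcript` uses inline in this diff.

```python
LANG_FALLBACKS = {
    "zh-tw": ["zh-Hant", "zh-TW", "zh", "en"],
    "zh": ["zh-Hans", "zh-CN", "zh", "en"],
    "ja": ["ja", "en"],
    "ko": ["ko", "en"],
}


def langs_to_try(lang: str) -> list[str]:
    """Ordered subtitle languages to attempt for a requested language code."""
    if lang == "en":
        return ["en"]
    # Unknown codes fall back to the code itself, then English.
    return LANG_FALLBACKS.get(lang, [lang, "en"])


if __name__ == "__main__":
    for code in ("en", "zh-tw", "fr", "zh-TW"):
        print(code, "->", langs_to_try(code))
```

The lookup is case-sensitive: `zh-tw` expands to the four-entry list, while `zh-TW` is treated as an unknown code and yields `["zh-TW", "en"]`. Callers that pass the metadata `language` field through unchanged may want to lowercase it first, depending on how yt-dlp reports it.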


def get_transcript(
video_id: str, lang: str = "en", timestamps: bool = False
) -> str | None:
"""Get transcript for a video using subtitle download with Whisper fallback."""
video_url = f"https://www.youtube.com/watch?v={video_id}"
convert = srt_to_timestamped_text if timestamps else srt_to_text
langs_to_try = LANG_FALLBACKS.get(lang, [lang, "en"]) if lang != "en" else ["en"]

with tempfile.TemporaryDirectory(prefix="yt_transcript_") as tmpdir:
tmp = Path(tmpdir)

# Try subtitles first (English)
print(f"Trying EN subtitles for {video_id}...", file=sys.stderr)
srt = download_subtitles(video_url, tmp, lang="en")
if srt:
text = srt_to_text(srt)
if len(text) > 100:
print(f"Got subtitle transcript ({len(text)} chars)", file=sys.stderr)
return text
for try_lang in langs_to_try:
print(f"Trying {try_lang} subtitles for {video_id}...", file=sys.stderr)
srt = download_subtitles(video_url, tmp, lang=try_lang)
if srt:
text = convert(srt)
if len(text) > 100:
print(
f"Got subtitle transcript in {try_lang} ({len(text)} chars)",
file=sys.stderr,
)
return text

# Whisper fallback
# Whisper fallback (no timestamps available)
print(f"Falling back to Whisper for {video_id}...", file=sys.stderr)
text = whisper_transcribe_hf(video_url, tmp)
if text and len(text) > 100:
print(f"Got Whisper transcript ({len(text)} chars)", file=sys.stderr)
return text
return "[NO_TIMESTAMPS]\n" + text if timestamps else text

print(f"No transcript available for {video_id}", file=sys.stderr)
return None
@@ -193,9 +243,17 @@ def get_transcript(video_id: str) -> str | None:
def main():
parser = argparse.ArgumentParser(description="Extract YouTube video transcript")
parser.add_argument("video_id", help="YouTube video ID")
parser.add_argument(
"--lang", default="en", help="Subtitle language (default: en)"
)
parser.add_argument(
"--timestamps",
action="store_true",
help="Preserve [MM:SS] timestamp markers (~30s intervals)",
)
args = parser.parse_args()

transcript = get_transcript(args.video_id)
transcript = get_transcript(args.video_id, lang=args.lang, timestamps=args.timestamps)
if transcript:
print(transcript)
else: