1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -21,6 +21,7 @@
- Transcripts: `--timestamps` adds segment-level timings (`transcriptSegments` + `transcriptTimedText`) for YouTube, podcasts, and embedded captions.
- Media-aware summarization in the Side Panel: Page vs Video/Audio dropdown, automatic media preference on video sites, plus visible word count/duration.
- CLI: transcribe local audio/video files with mtime-aware transcript cache invalidation (thanks @mvance!).
- CLI: add Cursor Agent CLI provider (`cli/agent`, `--cli agent`).
- Browser extension: add Firefox sidebar build + multi-browser config (#31, thanks @vlnd0).
- Chrome automation: add artifacts tool + REPL helpers for persistent session files (notes/JSON/CSV) and downloads.
- Chrome automation: expand navigate tool with list/switch tab support and return matching skills after navigation.
4 changes: 2 additions & 2 deletions README.md
@@ -243,7 +243,7 @@ Use `summarize --help` or `summarize help` for the full help text.
- `--length short|medium|long|xl|xxl|s|m|l|<chars>`
- `--language, --lang <language>`: output language (`auto` = match source)
- `--max-output-tokens <count>`: hard cap for LLM output tokens
- `--cli [provider]`: use a CLI provider (`--model cli/<provider>`). If omitted, uses auto selection with CLI enabled.
- `--cli [provider]`: use a CLI provider (`--model cli/<provider>`). Supports `claude`, `gemini`, `codex`, `agent`. If omitted, uses auto selection with CLI enabled.
- `--stream auto|on|off`: stream LLM output (`auto` = TTY only; disabled in `--json` mode)
- `--plain`: keep raw output (no ANSI/OSC Markdown rendering)
- `--no-color`: disable ANSI colors
@@ -272,7 +272,7 @@ Why: CLI adds ~4s latency per attempt and higher variance.
Shortcut: `--cli` (with no provider) uses auto selection with CLI enabled.

When enabled, auto prepends CLI attempts in the order listed in `cli.enabled`
(recommended: `["gemini"]`), then tries the native provider candidates
(recommended: `["gemini"]` or `["agent"]`), then tries the native provider candidates
(with OpenRouter fallbacks when configured).

Enable CLI attempts:
16 changes: 12 additions & 4 deletions docs/cli.md
@@ -6,13 +6,14 @@ read_when:

# CLI models

Summarize can use installed CLIs (Claude, Codex, Gemini) as local model backends.
Summarize can use installed CLIs (Claude, Codex, Gemini, Cursor Agent) as local model backends.

## Model ids

- `cli/claude/<model>` (e.g. `cli/claude/sonnet`)
- `cli/codex/<model>` (e.g. `cli/codex/gpt-5.2`)
- `cli/gemini/<model>` (e.g. `cli/gemini/gemini-3-flash-preview`)
- `cli/agent/<model>` (e.g. `cli/agent/gpt-5.2`)

Use `--cli [provider]` (case-insensitive) for the provider default, or `--model cli/<provider>/<model>` to pin a model.
If `--cli` is provided without a provider, auto selection is used with CLI enabled.
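
For reference, a minimal TypeScript sketch of how a `cli/<provider>/<model>` id resolves to a provider and model, with the provider default filling in when the model segment is omitted. This is a simplified view; the authoritative logic is `parseRequestedModelId` in `src/model-spec.ts`, updated later in this PR.

```ts
type CliProvider = 'claude' | 'codex' | 'gemini' | 'agent'

const DEFAULT_CLI_MODELS: Record<CliProvider, string> = {
  claude: 'sonnet',
  codex: 'gpt-5.2',
  gemini: 'gemini-3-flash-preview',
  agent: 'gpt-5.2',
}

// Simplified: split "cli/<provider>/<model>" and fall back to the provider default model.
function parseCliModelId(raw: string): { provider: CliProvider; model: string } {
  const parts = raw
    .trim()
    .split('/')
    .map((part) => part.trim())
    .filter((part) => part.length > 0)
  const providerRaw = parts[1]?.toLowerCase() ?? ''
  if (
    providerRaw !== 'claude' &&
    providerRaw !== 'codex' &&
    providerRaw !== 'gemini' &&
    providerRaw !== 'agent'
  ) {
    throw new Error(`Invalid CLI model id "${raw}". Expected cli/<provider>/<model>.`)
  }
  const provider = providerRaw as CliProvider
  const model = parts.slice(2).join('/').trim()
  return { provider, model: model.length > 0 ? model : DEFAULT_CLI_MODELS[provider] }
}

// parseCliModelId('cli/agent/gpt-5.2') -> { provider: 'agent', model: 'gpt-5.2' }
// parseCliModelId('cli/agent')         -> { provider: 'agent', model: 'gpt-5.2' } (provider default)
```
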
@@ -22,12 +23,12 @@ If `--cli` is provided without a provider, auto selection is used with CLI enabled.
Auto mode does **not** use CLIs unless you set `cli.enabled` in config.

Why: CLI adds ~4s latency per attempt and higher variance.
Recommendation: enable only Gemini unless you have a reason to add others.
Recommendation: enable only Gemini or Agent unless you have a reason to add others.

Gemini CLI performance: summarize sets `GEMINI_CLI_NO_RELAUNCH=true` for Gemini CLI runs to avoid a costly self-relaunch (can be overridden by setting it yourself).

When enabled, auto prepends CLI attempts in the order listed in `cli.enabled`
(recommended: `["gemini"]`).
(recommended: `["gemini"]` or `["agent"]`).
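
As a rough illustration of the resulting attempt order (the native candidates below are placeholders; the real assembly is `prependCliCandidates` in `src/model-auto.ts`, with provider default models filled in):

```ts
// Illustration only: CLI attempts come first, in cli.enabled order, then native candidates.
const cliEnabled = ['gemini', 'agent'] // from cli.enabled
const nativeCandidates = ['google/gemini-3-flash-preview', 'openai/gpt-5-mini'] // placeholder natives
const attemptOrder = [
  ...cliEnabled.map((provider) => `cli/${provider}`), // provider default model is used unless pinned
  ...nativeCandidates,
]
// -> ['cli/gemini', 'cli/agent', 'google/gemini-3-flash-preview', 'openai/gpt-5-mini']
```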

Enable CLI attempts:

@@ -52,6 +53,7 @@ Note: when `cli.enabled` is set, it also acts as an allowlist for explicit `--cli` / `--model cli/...`.
Binary lookup:

- `CLAUDE_PATH`, `CODEX_PATH`, `GEMINI_PATH` (optional overrides)
- `AGENT_PATH` (optional override)
- Otherwise uses `PATH` (see the lookup sketch below)
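
A condensed TypeScript sketch of that lookup order for the `agent` provider, following `resolveCliBinary` in `src/llm/cli.ts` as changed in this PR (which also honors a `SUMMARIZE_CLI_AGENT` override):

```ts
// Lookup order for the agent binary, condensed from resolveCliBinary in src/llm/cli.ts.
function resolveAgentBinary(
  cliConfig: { agent?: { binary?: string } } | undefined,
  env: Record<string, string | undefined>
): string {
  const configured = cliConfig?.agent?.binary
  if (configured && configured.trim().length > 0) return configured.trim() // 1. cli.agent.binary from config
  const pathOverride = env.AGENT_PATH
  if (pathOverride && pathOverride.trim().length > 0) return pathOverride.trim() // 2. AGENT_PATH
  const summarizeOverride = env.SUMMARIZE_CLI_AGENT
  if (summarizeOverride && summarizeOverride.trim().length > 0) return summarizeOverride.trim() // 3. SUMMARIZE_CLI_AGENT
  return 'agent' // 4. bare binary name, resolved through PATH
}
```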

## Attachments (images/files)
@@ -62,19 +64,24 @@ path-based prompt and enables the required tool flags:
- Claude: `--tools Read --dangerously-skip-permissions`
- Gemini: `--yolo` and `--include-directories <dir>`
- Codex: `codex exec --output-last-message ...` and `-i <image>` for images
- Agent: uses built-in file tools in `agent --print` mode (no extra flags)

## Config

```json
{
  "cli": {
    "enabled": ["claude", "gemini", "codex"],
    "enabled": ["claude", "gemini", "codex", "agent"],
    "codex": { "model": "gpt-5.2" },
    "gemini": { "model": "gemini-3-flash-preview", "extraArgs": ["--verbose"] },
    "claude": {
      "model": "sonnet",
      "binary": "/usr/local/bin/claude",
      "extraArgs": ["--verbose"]
    },
    "agent": {
      "model": "gpt-5.2",
      "binary": "/usr/local/bin/agent"
    }
  }
}
```
@@ -84,6 +91,7 @@ Notes:

- CLI output is treated as text only (no token accounting); the Agent CLI's JSON output handling is sketched below.
- If a CLI call fails, auto mode falls back to the next candidate.
- Cursor Agent CLI uses the `agent` binary and relies on Cursor CLI auth (login or `CURSOR_API_KEY`).
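
A sketch of how the Agent CLI's `--print --output-format json` stdout is reduced to plain text, condensed from the `agent` branch added to `runCliModel` in `src/llm/cli.ts` (the real code uses `parseJsonFromOutput` and checks several likely field names, since the payload shape is not assumed to be stable):

```ts
// Condensed sketch: reduce agent --print --output-format json stdout to plain text.
function extractAgentText(stdout: string): string {
  const trimmed = stdout.trim()
  if (!trimmed) throw new Error('CLI returned empty output')
  try {
    const payload = JSON.parse(trimmed) as Record<string, unknown>
    // Try several likely field names for the result text.
    const candidate =
      payload.result ?? payload.response ?? payload.output ?? payload.message ?? payload.text
    if (typeof candidate === 'string' && candidate.trim().length > 0) return candidate.trim()
  } catch {
    // Not valid JSON: fall through to returning the raw output.
  }
  return trimmed // treated as text only; no usage/cost accounting for CLI providers
}
```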

## Generate free preset (OpenRouter)

7 changes: 4 additions & 3 deletions docs/config.md
@@ -272,17 +272,18 @@ Examples:
```json
{
  "cli": {
    "enabled": ["gemini"],
    "enabled": ["gemini", "agent"],
    "codex": { "model": "gpt-5.2" },
    "claude": { "binary": "/usr/local/bin/claude", "extraArgs": ["--verbose"] }
    "claude": { "binary": "/usr/local/bin/claude", "extraArgs": ["--verbose"] },
    "agent": { "binary": "/usr/local/bin/agent", "model": "gpt-5.2" }
  }
}
```

Notes:

- `cli.enabled` is an allowlist (auto uses CLIs only when set; explicit `--cli` / `--model cli/...` must be included); see the sketch below.
- Recommendation: keep `cli.enabled` to `["gemini"]` unless you have a reason to add others (extra latency/variance).
- Recommendation: keep `cli.enabled` to `["gemini"]` or `["agent"]` unless you have a reason to add others (extra latency/variance).
- `cli.<provider>.binary` overrides CLI binary discovery.
- `cli.<provider>.extraArgs` appends extra CLI args.
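
As an illustration of the allowlist behavior in auto mode (the helper name here is hypothetical; the real check is `isCliProviderEnabled` in `src/model-auto.ts`):

```ts
type CliProvider = 'claude' | 'codex' | 'gemini' | 'agent'

// Hypothetical helper: whether auto mode will attempt a given CLI provider at all.
function autoMayUseCli(provider: CliProvider, enabled: CliProvider[] | undefined): boolean {
  if (!enabled || enabled.length === 0) return false // no cli.enabled -> auto never uses CLIs
  return enabled.includes(provider) // listed providers are attempted in the configured order
}

// autoMayUseCli('agent', ['gemini', 'agent']) -> true  (agent is tried after gemini)
// autoMayUseCli('agent', undefined)           -> false (CLI attempts stay disabled in auto mode)
```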

5 changes: 3 additions & 2 deletions docs/llm.md
@@ -30,7 +30,7 @@ installed, auto mode can use local CLI models when `cli.enabled` is set (see `docs/cli.md`).
- `ANTHROPIC_API_KEY` (required for `anthropic/...` models)
- `ANTHROPIC_BASE_URL` (optional; override Anthropic API endpoint)
- `SUMMARIZE_MODEL` (optional; overrides default model selection)
- `CLAUDE_PATH` / `CODEX_PATH` / `GEMINI_PATH` (optional; override CLI binary paths)
- `CLAUDE_PATH` / `CODEX_PATH` / `GEMINI_PATH` / `AGENT_PATH` (optional; override CLI binary paths)

## Flags

@@ -39,6 +39,7 @@ installed, auto mode can use local CLI models when `cli.enabled` is set (see `docs/cli.md`).
- `cli/codex/gpt-5.2`
- `cli/claude/sonnet`
- `cli/gemini/gemini-3-flash-preview`
- `cli/agent/gpt-5.2`
- `google/gemini-3-flash-preview`
- `openai/gpt-5-mini`
- `zai/glm-4.7`
@@ -47,7 +48,7 @@ installed, auto mode can use local CLI models when `cli.enabled` is set (see `docs/cli.md`).
- `anthropic/claude-sonnet-4-5`
- `openrouter/meta-llama/llama-3.3-70b-instruct:free` (force OpenRouter)
- `--cli [provider]`
- Examples: `--cli claude`, `--cli Gemini`, `--cli codex` (equivalent to `--model cli/<provider>`); `--cli` alone uses auto selection with CLI enabled.
- Examples: `--cli claude`, `--cli Gemini`, `--cli codex`, `--cli agent` (equivalent to `--model cli/<provider>`); `--cli` alone uses auto selection with CLI enabled.
- `--model auto`
- See `docs/model-auto.md`
- `--model <preset>`
8 changes: 6 additions & 2 deletions src/config.ts
@@ -6,7 +6,7 @@ import { isCliThemeName, listCliThemes } from './tty/theme.js'

export type AutoRuleKind = 'text' | 'website' | 'youtube' | 'image' | 'video' | 'file'
export type VideoMode = 'auto' | 'transcript' | 'understand'
export type CliProvider = 'claude' | 'codex' | 'gemini'
export type CliProvider = 'claude' | 'codex' | 'gemini' | 'agent'
export type CliProviderConfig = {
binary?: string
extraArgs?: string[]
Expand All @@ -17,6 +17,7 @@ export type CliConfig = {
claude?: CliProviderConfig
codex?: CliProviderConfig
gemini?: CliProviderConfig
agent?: CliProviderConfig
}

export type OpenAiConfig = {
@@ -215,7 +216,7 @@ function parseAutoRuleKind(value: unknown): AutoRuleKind | null {

function parseCliProvider(value: unknown, path: string): CliProvider {
const trimmed = typeof value === 'string' ? value.trim().toLowerCase() : ''
if (trimmed === 'claude' || trimmed === 'codex' || trimmed === 'gemini') {
if (trimmed === 'claude' || trimmed === 'codex' || trimmed === 'gemini' || trimmed === 'agent') {
return trimmed as CliProvider
}
throw new Error(`Invalid config file ${path}: unknown CLI provider "${String(value)}".`)
@@ -852,6 +853,7 @@ export function loadSummarizeConfig({ env }: { env: Record<string, string | unde
const claude = value.claude ? parseCliProviderConfig(value.claude, path, 'claude') : undefined
const codex = value.codex ? parseCliProviderConfig(value.codex, path, 'codex') : undefined
const gemini = value.gemini ? parseCliProviderConfig(value.gemini, path, 'gemini') : undefined
const agent = value.agent ? parseCliProviderConfig(value.agent, path, 'agent') : undefined
const promptOverride =
typeof value.promptOverride === 'string' && value.promptOverride.trim().length > 0
? value.promptOverride.trim()
@@ -868,6 +870,7 @@
claude ||
codex ||
gemini ||
agent ||
promptOverride ||
typeof allowTools === 'boolean' ||
cwd ||
@@ -877,6 +880,7 @@
...(claude ? { claude } : {}),
...(codex ? { codex } : {}),
...(gemini ? { gemini } : {}),
...(agent ? { agent } : {}),
...(promptOverride ? { promptOverride } : {}),
...(typeof allowTools === 'boolean' ? { allowTools } : {}),
...(cwd ? { cwd } : {}),
1 change: 1 addition & 0 deletions src/daemon/env-snapshot.ts
@@ -28,6 +28,7 @@ const ENV_KEYS = [
'CLAUDE_PATH',
'CODEX_PATH',
'GEMINI_PATH',
'AGENT_PATH',
'UVX_PATH',
] as const

57 changes: 56 additions & 1 deletion src/llm/cli.ts
@@ -11,6 +11,14 @@ const DEFAULT_BINARIES: Record<CliProvider, string> = {
claude: 'claude',
codex: 'codex',
gemini: 'gemini',
agent: 'agent',
}

const PROVIDER_PATH_ENV: Record<CliProvider, string> = {
claude: 'CLAUDE_PATH',
codex: 'CODEX_PATH',
gemini: 'GEMINI_PATH',
agent: 'AGENT_PATH',
}

type RunCliModelOptions = {
@@ -50,8 +58,16 @@ export function resolveCliBinary(
env: Record<string, string | undefined>
): string {
const providerConfig =
provider === 'claude' ? config?.claude : provider === 'codex' ? config?.codex : config?.gemini
provider === 'claude'
? config?.claude
: provider === 'codex'
? config?.codex
: provider === 'gemini'
? config?.gemini
: config?.agent
if (isNonEmptyString(providerConfig?.binary)) return providerConfig.binary.trim()
const pathKey = PROVIDER_PATH_ENV[provider]
if (isNonEmptyString(env[pathKey])) return env[pathKey].trim()
const envKey = `SUMMARIZE_CLI_${provider.toUpperCase()}`
if (isNonEmptyString(env[envKey])) return env[envKey].trim()
return DEFAULT_BINARIES[provider]
@@ -319,6 +335,45 @@ export async function runCliModel({
throw new Error('CLI returned empty output')
}

if (provider === 'agent') {
args.push('--print', '--output-format', 'json')
if (!allowTools) {
args.push('--mode', 'ask')
}
if (model && model.trim().length > 0) {
args.push('--model', model.trim())
}
args.push(prompt)
const { stdout } = await execCliWithInput({
execFileImpl: execFileFn,
cmd: binary,
args,
input: '',
timeoutMs,
env: effectiveEnv,
cwd,
})
const trimmed = stdout.trim()
if (!trimmed) {
throw new Error('CLI returned empty output')
}
const parsed = parseJsonFromOutput(trimmed)
if (parsed && typeof parsed === 'object') {
const payload = parsed as Record<string, unknown>
const resultText =
payload.result ??
payload.response ??
payload.output ??
payload.message ??
payload.text ??
null
if (typeof resultText === 'string' && resultText.trim().length > 0) {
return { text: resultText.trim(), usage: null, costUsd: null }
}
}
return { text: trimmed, usage: null, costUsd: null }
}

if (model && model.trim().length > 0) {
args.push('--model', model.trim())
}
13 changes: 10 additions & 3 deletions src/model-auto.ts
@@ -36,6 +36,7 @@ export type AutoModelAttempt = {
| 'CLI_CLAUDE'
| 'CLI_CODEX'
| 'CLI_GEMINI'
| 'CLI_AGENT'
debug: string
}

@@ -197,6 +198,7 @@ const DEFAULT_CLI_MODELS: Record<CliProvider, string> = {
claude: 'sonnet',
codex: 'gpt-5.2',
gemini: 'gemini-3-flash-preview',
agent: 'gpt-5.2',
}

function isCliProviderEnabled(provider: CliProvider, config: SummarizeConfig | null): boolean {
@@ -223,7 +225,8 @@ function parseCliCandidate(
.map((entry) => entry.trim())
if (parts.length < 2) return null
const provider = parts[1]?.toLowerCase()
if (provider !== 'claude' && provider !== 'codex' && provider !== 'gemini') return null
if (provider !== 'claude' && provider !== 'codex' && provider !== 'gemini' && provider !== 'agent')
return null
const model = parts.slice(2).join('/').trim()
return { provider, model: model.length > 0 ? model : null }
}
@@ -243,7 +246,9 @@ function requiredEnvForCandidate(modelId: string): AutoModelAttempt['requiredEnv
? 'CLI_CODEX'
: parsed.provider === 'gemini'
? 'CLI_GEMINI'
: 'CLI_CLAUDE'
: parsed.provider === 'agent'
? 'CLI_AGENT'
: 'CLI_CLAUDE'
}
if (isCandidateOpenRouter(modelId)) return 'OPENROUTER_API_KEY'
const parsed = parseGatewayStyleModelId(normalizeGatewayStyleModelId(modelId))
@@ -365,7 +370,9 @@
? cli?.gemini?.model
: provider === 'codex'
? cli?.codex?.model
: cli?.claude?.model
: provider === 'agent'
? cli?.agent?.model
: cli?.claude?.model
add(provider, modelOverride)
}
if (cliCandidates.length === 0) return candidates
18 changes: 15 additions & 3 deletions src/model-spec.ts
@@ -5,6 +5,7 @@ const DEFAULT_CLI_MODELS: Record<CliProvider, string> = {
claude: 'sonnet',
codex: 'gpt-5.2',
gemini: 'gemini-3-flash-preview',
agent: 'gpt-5.2',
}

export type FixedModelSpec =
@@ -39,7 +40,7 @@
llmModelId: null
openrouterProviders: null
forceOpenRouter: false
requiredEnv: 'CLI_CLAUDE' | 'CLI_CODEX' | 'CLI_GEMINI'
requiredEnv: 'CLI_CLAUDE' | 'CLI_CODEX' | 'CLI_GEMINI' | 'CLI_AGENT'
cliProvider: CliProvider
cliModel: string | null
}
@@ -100,14 +101,25 @@ export function parseRequestedModelId(raw: string): RequestedModel {
.map((part) => part.trim())
.filter((part) => part.length > 0)
const providerRaw = parts[1]?.toLowerCase() ?? ''
if (providerRaw !== 'claude' && providerRaw !== 'codex' && providerRaw !== 'gemini') {
if (
providerRaw !== 'claude' &&
providerRaw !== 'codex' &&
providerRaw !== 'gemini' &&
providerRaw !== 'agent'
) {
throw new Error(`Invalid CLI model id "${trimmed}". Expected cli/<provider>/<model>.`)
}
const cliProvider = providerRaw as CliProvider
const requestedModel = parts.slice(2).join('/').trim()
const cliModel = requestedModel.length > 0 ? requestedModel : DEFAULT_CLI_MODELS[cliProvider]
const requiredEnv =
cliProvider === 'claude' ? 'CLI_CLAUDE' : cliProvider === 'codex' ? 'CLI_CODEX' : 'CLI_GEMINI'
cliProvider === 'claude'
? 'CLI_CLAUDE'
: cliProvider === 'codex'
? 'CLI_CODEX'
: cliProvider === 'gemini'
? 'CLI_GEMINI'
: 'CLI_AGENT'
const userModelId = `cli/${cliProvider}/${cliModel}`
return {
kind: 'fixed',