feat: AnalysisRunner interface + ClaudeNativeRunner + ProviderRunner by melagiri · Pull Request #246 · melagiri/code-insights

melagiri · 2026-03-29T04:35:54Z

What

Implements the runner abstraction layer for Issue #239 (Phase 12, v4.8.0).

Why

The upcoming insights CLI command needs to run analysis via two backends (claude -p native and configured LLM provider) without the command code caring which is used. This PR delivers the interface and both implementations.

How

runner-types.ts — AnalysisRunner interface with RunAnalysisParams / RunAnalysisResult. Adding a new runner (e.g. CursorNativeRunner) means implementing this interface only.
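
A minimal sketch of what such an interface could look like. Only the names AnalysisRunner, RunAnalysisParams, RunAnalysisResult, rawJson, jsonSchema, and the input/output token counts appear in this PR; every other field name, the EchoRunner class, and the analyze helper are assumptions made for illustration and are not copied from runner-types.ts.

```typescript
// Hypothetical shapes: fields other than rawJson, jsonSchema, inputTokens,
// and outputTokens are assumptions for this sketch.
interface RunAnalysisParams {
  systemPrompt: string;
  userPrompt: string;
  jsonSchema?: object; // relevant to the native runner; providers may ignore it
}

interface RunAnalysisResult {
  rawJson: string; // raw analysis JSON text, parsed by the caller
  inputTokens: number;
  outputTokens: number;
  durationMs: number;
}

interface AnalysisRunner {
  readonly name: string;
  runAnalysis(params: RunAnalysisParams): Promise<RunAnalysisResult>;
}

// Stand-in runner used only to show that callers depend on the interface.
class EchoRunner implements AnalysisRunner {
  readonly name = "echo";

  async runAnalysis(params: RunAnalysisParams): Promise<RunAnalysisResult> {
    const start = Date.now();
    const rawJson = JSON.stringify({ echoed: params.userPrompt });
    return { rawJson, inputTokens: 0, outputTokens: 0, durationMs: Date.now() - start };
  }
}

// Calling code never names a concrete runner class.
async function analyze(runner: AnalysisRunner, conversation: string): Promise<unknown> {
  const result = await runner.runAnalysis({
    systemPrompt: "You are an analyst.",
    userPrompt: conversation,
  });
  return JSON.parse(result.rawJson);
}
```

Swapping in a different backend would then mean passing another AnalysisRunner to analyze, with no change to analyze itself.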

native-runner.ts — ClaudeNativeRunner using execFileSync (not exec — shell injection prevention). Writes system prompt and JSON schema to temp files, pipes conversation via stdin, cleans up in finally. Tokens = 0 (counted in the Claude Code session).
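
The temp-file and cleanup pattern described above can be sketched as follows. This is not the code in native-runner.ts: the helper names withTempFiles and runClaude are hypothetical, and the sketch uses a per-call temp directory (mkdtempSync) instead of the individually named temp files the PR writes. The claude arguments are left to the caller so that no flags are assumed here.

```typescript
import { execFileSync } from "node:child_process";
import { existsSync, mkdtempSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Writes each named file into a fresh temp directory, runs `fn` with the
// resulting paths, and removes the directory even when `fn` throws.
function withTempFiles<T>(
  files: Record<string, string>,
  fn: (paths: Record<string, string>) => T,
): T {
  const dir = mkdtempSync(join(tmpdir(), "ci-runner-"));
  try {
    const paths: Record<string, string> = {};
    for (const [name, content] of Object.entries(files)) {
      paths[name] = join(dir, name);
      writeFileSync(paths[name], content, "utf8");
    }
    return fn(paths);
  } finally {
    rmSync(dir, { recursive: true, force: true });
  }
}

// execFileSync takes the arguments as an array, so no shell parses them.
// The conversation is passed on stdin through the `input` option.
function runClaude(args: string[], conversation: string): string {
  return execFileSync("claude", args, { input: conversation, encoding: "utf8" });
}
```

A runner built this way would call runClaude inside the withTempFiles callback, passing the temp paths in whatever arguments the CLI expects for the system prompt and schema.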

provider-runner.ts — ProviderRunner supporting all four configured providers (OpenAI, Anthropic, Gemini, Ollama). Inlines provider dispatch in CLI to avoid circular dependency (@code-insights/server depends on @code-insights/cli). All providers use only Node.js built-in fetch — no external SDK dependencies.

schemas/session-analysis.json — Flat JSON schema for claude -p --json-schema, derived from AnalysisResponse in prompt-types.ts.

schemas/prompt-quality.json — Flat JSON schema for claude -p --json-schema, derived from PromptQualityResponse in prompt-types.ts.

schemas/__tests__/schema-sync.test.ts — Validates JSON schema required properties match TypeScript interface fields. Fails in CI if they diverge.

Schema Impact

SQLite schema changed: no
Types changed: no (new files only)
Server API changed: no
Backward compatible: yes

Testing

pnpm build  # PASS — zero errors
pnpm test   # PASS — 966 tests, 46 test files

New tests added:

native-runner.test.ts — 10 tests (validate, execFileSync args, cleanup in finally, schema flag, result shape)
provider-runner.test.ts — 9 tests (fromConfig validation, OpenAI/Anthropic dispatch, error handling, jsonSchema ignored)
schemas/__tests__/schema-sync.test.ts — 8 tests (required field coverage for both schemas)

Closes #239

Note on ProviderRunner Design

The plan noted "if circular dep is an issue, document it in the PR." The circular dep was confirmed: server depends on cli, so cli cannot import @code-insights/server. Resolution: ProviderRunner inlines the provider dispatch (mirrors server/src/llm/client.ts). All providers use only fetch with no SDK dependencies, so the inline is 4 small functions totaling ~100 lines. Server LLM client (server/src/llm/client.ts) continues to be the source used by the dashboard API. Issue #240 can evaluate consolidating if this becomes a maintenance concern.

Defines the abstraction layer between the insights command and LLM backends. Adding a future runner (e.g. CursorNativeRunner) requires only implementing this interface — no changes to the calling code. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Executes analysis via `claude -p` non-interactive mode. Uses execFileSync (not exec) to prevent shell injection. Temp files cleaned up in finally block. Tokens are 0 — counted as part of the overall Claude Code session. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Delegates analysis to the configured provider (OpenAI, Anthropic, Gemini, Ollama). Inlines provider dispatch in CLI to avoid a circular dependency with the server package (server -> cli). All providers use only Node.js built-in fetch. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…prompt-quality Flat JSON schemas for claude -p --json-schema structured output. Derived from AnalysisResponse and PromptQualityResponse in prompt-types.ts. Schema sync test validates required properties match TypeScript types. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Adds ./analysis/runner-types, ./analysis/native-runner, ./analysis/provider-runner, and schema exports. Updates build script to copy JSON schema files to dist/. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

melagiri · 2026-03-29T04:43:08Z

TA Synthesis (Phase 2): Runner Interface — Round 1

Review of Domain Specialist Comments

Specialist FIX NOW: claude -p --output-format json returns event stream envelope

Verdict: AGREE — CONFIRMED VIA RUNTIME VERIFICATION.

I ran claude -p --output-format json --bare against the actual Claude CLI and confirmed the output is a JSON array of event objects:

[
  {"type": "system", "subtype": "init", "session_id": "...", "tools": [...], ...},
  {"type": "assistant", "message": {"content": [{"type": "text", "text": "...actual LLM output..."}]}, ...},
  {"type": "result", "subtype": "success", "result": "...actual LLM output...", "is_error": false, ...}
]

The native runner currently returns this entire array as rawJson. Downstream consumers expect the raw analysis JSON (e.g., {"summary": {...}, "character": {...}, ...}). This is a breaking bug — the runner will produce unparseable results.

Recommended fix: Parse the JSON envelope and extract the result text. Two options:

Approach	Pros	Cons
A: Switch to `--output-format text`	Simplest change (1 line)	Loses structured error detection (`is_error`, `billing_error`, `stop_reason`)
B: Keep `json`, parse envelope	Enables proper error handling (detect billing errors, timeouts, stop reasons); future-proof	~15 lines of parsing logic

I recommend Option B. The envelope contains is_error, stop_reason, and error details that would let the runner throw specific errors (e.g., "Claude billing error" vs "analysis timeout" vs "unexpected stop"). This is especially valuable since execFileSync won't throw on these — Claude exits 0 even on billing errors.

Concrete extraction path: events.find(e => e.type === 'result')?.result for the text, with is_error check first.

Consolidated Review (For Dev Agent)

FIX NOW:

[CRITICAL] Native runner must parse --output-format json event envelope. The claude -p --output-format json output is a JSON array of event objects, not raw LLM text. The runner must:
- Parse the JSON array
- Find the result event (type === 'result')
- Check is_error — if true, throw with the error message from result.result
- If success, return result.result as rawJson
- Update tests to mock the envelope format (current tests mock raw JSON strings — they should mock the full envelope array)
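
The steps above can be sketched roughly as follows. The function name matches the extractResultFromEnvelope() helper mentioned later in this thread, but the body, the ClaudeEvent type, and the exact error wording are assumptions based only on the envelope shown in this review.

```typescript
// Minimal event shape, assumed from the envelope example in this review.
interface ClaudeEvent {
  type: string;
  subtype?: string;
  result?: string;
  is_error?: boolean;
}

// Type guard that picks out the final `result` event.
function isResultEvent(e: unknown): e is ClaudeEvent {
  return typeof e === "object" && e !== null && (e as ClaudeEvent).type === "result";
}

function extractResultFromEnvelope(stdout: string): string {
  let events: unknown;
  try {
    events = JSON.parse(stdout);
  } catch {
    throw new Error("claude -p returned non-JSON output");
  }
  if (!Array.isArray(events)) {
    throw new Error("claude -p output is not an array of events");
  }
  const resultEvent = events.find(isResultEvent);
  if (!resultEvent) {
    throw new Error("claude -p output has no result event");
  }
  // Check is_error before trusting the text: the process can exit 0 on failure.
  if (resultEvent.is_error) {
    const detail = resultEvent.result ?? resultEvent.subtype ?? "unknown";
    throw new Error(`claude -p reported an error: ${detail}`);
  }
  if (typeof resultEvent.result !== "string") {
    throw new Error("claude -p result event has no result text");
  }
  return resultEvent.result;
}
```

The returned string is what the runner would hand back as rawJson, leaving parsing of the analysis JSON to downstream code.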

NOT APPLICABLE:

TA Phase 1: ts/start variables redundant — ts is used for temp file naming, start for duration calculation. Different semantic purposes even though same value. Style preference, not a fix.
TA Phase 1: outcome_satisfaction lacks enum constraint in JSON schema — Constraint is enforced at parse time in response-parsers.ts, which is the correct layer. Enum in schema would cause validation failures for edge-case model responses.

SUGGESTIONS (non-blocking):

Temp file name collision — Add random suffix: ci-prompt-${Date.now()}-${randomSuffix}.txt
Build script require() fragility — Consider cp command instead of node -e with require()
Missing Gemini/Ollama dispatch tests — Different response shapes warrant coverage
Comment on narrower LLMMessage.content type — Explain string-only is intentional (no cache support in CLI)
Schema sync test hardcodes field lists — Pragmatic, documented. No change needed.
Gemini API key in URL — Matches Google's design. Informational only.

Final Verdict

CHANGES REQUIRED — Round 2 needed.

The FIX NOW item is a confirmed runtime bug (verified against actual Claude CLI output). The native runner will silently produce the entire event envelope instead of the analysis JSON. Must be fixed before merge.

claude -p --output-format json returns a JSON array of typed events. The actual LLM text lives in the result event's result field, not in the raw output. Extract it via extractResultFromEnvelope(), check is_error, and throw with a clear message on claude-level failures. Also adds random suffix to temp file names to prevent concurrent collisions. Tests updated to mock the envelope format throughout. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Adds coverage for all four configured providers. Also adds a comment explaining why LLMMessage.content is intentionally narrower (string only) than the server type that allows ContentBlock[]. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

melagiri · 2026-03-29T04:46:10Z

Review Addressal

FIX NOW items addressed:

Native runner event envelope parsing → Fixed: extractResultFromEnvelope() parses the claude -p --output-format json array, finds the result event, checks is_error, and returns result.result as rawJson. Throws with clear messages for non-JSON output, missing result event, and is_error: true. All native-runner tests updated to mock the envelope format.

Non-blocking suggestions addressed:

Temp file collision → Fixed: file ID is now ${Date.now()}-${Math.random().toString(36).slice(2, 8)}. Test asserts two concurrent calls get different paths.
Gemini + Ollama tests → Added: ProviderRunner.runAnalysis() — Gemini (2 tests) and ProviderRunner.runAnalysis() — Ollama (2 tests). All four providers now have dispatch coverage.
Content type comment → Added: explains content: string is intentionally narrower than server/src/llm/types.ts LLMMessage (which allows ContentBlock[]).
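
For reference, the temp file ID scheme from the first item can be sketched like this. The ID expression is the one quoted above and the ci-prompt- prefix comes from the Round 1 suggestion; the function names makeFileId and tempPromptPath are hypothetical.

```typescript
import { tmpdir } from "node:os";
import { join } from "node:path";

// Millisecond timestamp plus a short base-36 suffix, so two calls within the
// same millisecond are very unlikely to produce the same ID.
function makeFileId(): string {
  return `${Date.now()}-${Math.random().toString(36).slice(2, 8)}`;
}

// Hypothetical helper: builds a prompt file path in the OS temp directory.
function tempPromptPath(): string {
  return join(tmpdir(), `ci-prompt-${makeFileId()}.txt`);
}
```

Math.random() is not cryptographically secure, so the suffix guards against accidental collisions between concurrent runs rather than against a deliberate guess of the path.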

Pre-PR gate: pnpm build passing, pnpm test passing — 975 tests (up from 966), 46 files, zero failures.

All review items addressed. Ready for re-review or merge.

melagiri · 2026-03-29T04:46:47Z

PM Review Summary — PR #246 Ready for Merge

What Shipped

AnalysisRunner abstraction with two implementations, JSON schemas, and full test coverage:

File	Purpose
`cli/src/analysis/runner-types.ts`	`AnalysisRunner` interface + `RunAnalysisParams` / `RunAnalysisResult`
`cli/src/analysis/native-runner.ts`	`ClaudeNativeRunner` — `execFileSync`, envelope parsing, temp file cleanup
`cli/src/analysis/provider-runner.ts`	`ProviderRunner` — all 4 providers (OpenAI, Anthropic, Gemini, Ollama)
`cli/src/analysis/schemas/session-analysis.json`	Flat JSON Schema from `AnalysisResponse`
`cli/src/analysis/schemas/prompt-quality.json`	Flat JSON Schema from `PromptQualityResponse`
`cli/src/analysis/schemas/__tests__/schema-sync.test.ts`	Schema ↔ TypeScript property sync (8 tests)
`cli/src/analysis/native-runner.test.ts`	Unit tests for `ClaudeNativeRunner`
`cli/src/analysis/provider-runner.test.ts`	Unit tests for `ProviderRunner` (all 4 providers)

Review Findings Addressed

Critical fix caught by review: ClaudeNativeRunner was returning the raw claude -p --output-format json event envelope instead of the LLM text. Confirmed against actual Claude CLI output — the response is a JSON array of event objects. Fixed via extractResultFromEnvelope() which finds the result event, checks is_error, and returns result.result.

Non-blocking suggestions addressed: random temp file suffix, Gemini+Ollama dispatch tests (4 new tests), content type comment on ProviderRunner.

Design Note on Record

ProviderRunner inlines provider dispatch in CLI (does not import from @code-insights/server) due to confirmed circular dependency (server → cli). Issue #240 tracks potential future consolidation. This was reviewed and accepted.

Gate Status

Build: PASS (zero errors)
Tests: PASS (975 tests, 46 files, zero failures — up from 966 before this PR)
Triple-layer review: complete, all FIX NOW items resolved

This PR is ready for founder merge.

melagiri · 2026-03-29T04:59:12Z

Triple-Layer Code Review — Round 2 of 2

Reviewers

Role	Domain	Round 1	Round 2
TA (Insider)	Architecture, types, schema	Active — APPROVED	Skipped (no blocking items)
Node/CLI Specialist	CLI, providers, exec safety	Active — REQUEST CHANGES	Active — PASS
LLM Expert	Prompt quality, token efficiency	Skipped (no LLM changes)	Skipped

Pre-Review Gates

New dependency audit: N/A (no new deps — uses built-in fetch)
Functional verification evidence: Build PASS, 966 tests PASS
Visual output attached: N/A

Round 1 Issues & Resolution

🔴 FIX NOW (Round 1)

Native runner must parse claude -p --output-format json event envelope — claude -p returns a JSON array of events, not raw LLM text. Runner was returning the entire envelope as rawJson.

Resolution (Round 2): ✅ FIXED. extractResultFromEnvelope() function added. Parses JSON array, finds result event via type guard, checks is_error, returns result.result. Full test coverage for all error paths (malformed JSON, missing result event, is_error:true).

🟡 SUGGESTIONS (Round 1)

Temp file collision risk → ✅ ADDRESSED (random suffix added)
Missing Gemini/Ollama tests → ✅ ADDRESSED (4 new tests)
Comment on narrower LLMMessage.content → ✅ ADDRESSED

🔵 NOTES

ProviderRunner mirrors server LLM client (intentional, documented; Issue #240, "feat: insights CLI command with --native and --hook modes", for consolidation)
temperature: 0.7 hardcoded (future tuning concern, not blocking)
Schema sync test hardcodes field lists (pragmatic, documented)

Round 2 Verdict

PASS — Ready for merge. All Round 1 blocking items resolved. All suggestions addressed. No new issues found in Round 2. Envelope parsing is correct with comprehensive error handling.

🤖 Generated with Claude Code

… schema-sync
native-runner: JSON-object-not-array, empty array, error_max_turns subtype
provider-runner: Anthropic system message extraction, missing usage defaults to 0, unknown provider throws
schema-sync: friction_points item required fields, effective_patterns item required fields, findings item required fields
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

melagiri · 2026-03-29T05:11:54Z

Coverage Gap Addressal

9 new tests added across 3 files (975 → 984 passing):

native-runner.test.ts (+3):

JSON object (not array) from claude -p → throws /not an array/
Empty event array [] → throws /no result event/
error_max_turns subtype with is_error: true → throws /claude -p reported an error.*Max turns/

provider-runner.test.ts (+3):

Anthropic system message extracted to body.system, not in messages array
Missing usage field in OpenAI response → inputTokens: 0, outputTokens: 0
Unknown provider string → throws /Unknown LLM provider/
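
The three behaviors covered by these tests can be illustrated with a short sketch. It assumes the public request and response shapes of the Anthropic Messages API (top-level system field) and the OpenAI chat completions API (usage.prompt_tokens and usage.completion_tokens). The helper names and the provider ID strings are hypothetical and are not taken from provider-runner.ts.

```typescript
interface LLMMessage {
  role: "system" | "user" | "assistant";
  content: string; // string only, as in the CLI runner
}

interface AnthropicBody {
  model: string;
  max_tokens: number;
  system?: string;
  messages: { role: string; content: string }[];
}

// Anthropic takes the system prompt as a top-level field, not as a message.
function buildAnthropicBody(model: string, messages: LLMMessage[], maxTokens = 4096): AnthropicBody {
  const system = messages
    .filter((m) => m.role === "system")
    .map((m) => m.content)
    .join("\n\n");
  const body: AnthropicBody = {
    model,
    max_tokens: maxTokens,
    messages: messages
      .filter((m) => m.role !== "system")
      .map((m) => ({ role: m.role, content: m.content })),
  };
  if (system) body.system = system;
  return body;
}

// A response without a `usage` object falls back to zero tokens.
function readOpenAIUsage(response: {
  usage?: { prompt_tokens?: number; completion_tokens?: number };
}): { inputTokens: number; outputTokens: number } {
  return {
    inputTokens: response.usage?.prompt_tokens ?? 0,
    outputTokens: response.usage?.completion_tokens ?? 0,
  };
}

// Assumed provider IDs; the real config values may differ.
const KNOWN_PROVIDERS: readonly string[] = ["openai", "anthropic", "gemini", "ollama"];

function assertKnownProvider(provider: string): string {
  if (!KNOWN_PROVIDERS.includes(provider)) {
    throw new Error(`Unknown LLM provider: ${provider}`);
  }
  return provider;
}
```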

schema-sync.test.ts (+3):

friction_points items: required contains category, description, severity, resolution
effective_patterns items: required contains category, description, confidence
findings items: required contains category, type, description, message_ref, impact, confidence
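
A schema sync check of this kind can be sketched as below. The diffRequired helper and the schema fragment are hypothetical. Only the friction_points field names come from the list above, and the fragment is not the actual contents of session-analysis.json.

```typescript
interface ArrayOfObjectsSchema {
  type: "array";
  items: {
    type: "object";
    required?: string[];
    properties: Record<string, unknown>;
  };
}

// Compares the schema's `required` list for array items against the field
// list expected from the TypeScript interface.
function diffRequired(
  schema: ArrayOfObjectsSchema,
  expected: string[],
): { missing: string[]; extra: string[] } {
  const required = new Set(schema.items.required ?? []);
  const wanted = new Set(expected);
  return {
    missing: expected.filter((f) => !required.has(f)),
    extra: Array.from(required).filter((f) => !wanted.has(f)),
  };
}

// Illustrative fragment for friction_points items.
const frictionPointsSchema: ArrayOfObjectsSchema = {
  type: "array",
  items: {
    type: "object",
    required: ["category", "description", "severity", "resolution"],
    properties: {
      category: { type: "string" },
      description: { type: "string" },
      severity: { type: "string" },
      resolution: { type: "string" },
    },
  },
};

// Hardcoded field list, mirroring the approach noted in the review.
const expectedFrictionFields = ["category", "description", "severity", "resolution"];
```

A test would then assert that both missing and extra are empty, so adding a field to the TypeScript type without updating the schema (or the reverse) fails in CI.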

Pre-PR gate: pnpm build PASS, pnpm test PASS — 984 tests, 46 files, zero failures.

melagiri and others added 5 commits March 29, 2026 10:04

melagiri and others added 2 commits March 29, 2026 10:15

melagiri merged commit c6d59e0 into master Mar 29, 2026
2 checks passed

melagiri deleted the feature/runner-interface branch March 29, 2026 05:26

melagiri merged 8 commits into master from feature/runner-interface
