
feat: AnalysisRunner interface + ClaudeNativeRunner + ProviderRunner #246

Merged
melagiri merged 8 commits into master from feature/runner-interface on Mar 29, 2026

Conversation

@melagiri
Owner

What

Implements the runner abstraction layer for Issue #239 (Phase 12, v4.8.0).

Why

The upcoming insights CLI command needs to run analysis via two backends (claude -p native and configured LLM provider) without the command code caring which is used. This PR delivers the interface and both implementations.

How

runner-types.ts: AnalysisRunner interface with RunAnalysisParams / RunAnalysisResult. Adding a new runner (e.g. CursorNativeRunner) means implementing this interface only.

native-runner.ts: ClaudeNativeRunner using execFileSync (not exec, to prevent shell injection). Writes the system prompt and JSON schema to temp files, pipes the conversation via stdin, and cleans up in a finally block. Token counts are reported as 0 because usage is counted in the Claude Code session.

provider-runner.ts: ProviderRunner supporting all four configured providers (OpenAI, Anthropic, Gemini, Ollama). Inlines provider dispatch in the CLI to avoid a circular dependency (@code-insights/server depends on @code-insights/cli). All providers use only Node.js built-in fetch, with no external SDK dependencies.

schemas/session-analysis.json: Flat JSON schema for claude -p --json-schema, derived from AnalysisResponse in prompt-types.ts.

schemas/prompt-quality.json: Flat JSON schema for claude -p --json-schema, derived from PromptQualityResponse in prompt-types.ts.

schemas/__tests__/schema-sync.test.ts: Validates that the JSON schema required properties match the TypeScript interface fields. Fails in CI if they diverge.
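
For readers who have not opened runner-types.ts, the abstraction described above might look roughly like the sketch below. The PR names the three types but does not list their fields, so every field name here (systemPrompt, conversation, jsonSchema, durationMs, and so on) is an assumption, and EchoRunner is a hypothetical stand-in rather than one of the shipped runners.

```typescript
// Hypothetical shapes: the PR names these types but does not show their fields.
interface RunAnalysisParams {
  systemPrompt: string;   // assumed field name
  conversation: string;   // assumed: the session transcript sent to the model
  jsonSchema?: object;    // assumed: used by the native runner, ignored by ProviderRunner
}

interface RunAnalysisResult {
  rawJson: string;        // the model's JSON text, parsed by downstream code
  inputTokens: number;    // 0 for the native runner, per the PR description
  outputTokens: number;
  durationMs: number;     // assumed field name
}

interface AnalysisRunner {
  readonly name: string;
  runAnalysis(params: RunAnalysisParams): Promise<RunAnalysisResult>;
}

// Stand-in runner, only here to show that callers depend on the interface alone.
class EchoRunner implements AnalysisRunner {
  readonly name = "echo";

  async runAnalysis(params: RunAnalysisParams): Promise<RunAnalysisResult> {
    const start = Date.now();
    const rawJson = JSON.stringify({ summary: params.conversation });
    return { rawJson, inputTokens: 0, outputTokens: 0, durationMs: Date.now() - start };
  }
}

// Calling code never checks which backend it was handed.
async function analyze(runner: AnalysisRunner, params: RunAnalysisParams): Promise<unknown> {
  const result = await runner.runAnalysis(params);
  return JSON.parse(result.rawJson);
}
```

Under these assumptions, a future runner such as CursorNativeRunner would be another class implementing AnalysisRunner, and analyze() would stay unchanged.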

Schema Impact

  • SQLite schema changed: no
  • Types changed: no (new files only)
  • Server API changed: no
  • Backward compatible: yes

Testing

pnpm build  # PASS — zero errors
pnpm test   # PASS — 966 tests, 46 test files

New tests added:

  • native-runner.test.ts — 10 tests (validate, execFileSync args, cleanup in finally, schema flag, result shape)
  • provider-runner.test.ts — 9 tests (fromConfig validation, OpenAI/Anthropic dispatch, error handling, jsonSchema ignored)
  • schemas/__tests__/schema-sync.test.ts — 8 tests (required field coverage for both schemas)

Closes #239

Note on ProviderRunner Design

The plan noted "if circular dep is an issue, document it in the PR." The circular dep was confirmed: server depends on cli, so cli cannot import @code-insights/server. Resolution: ProviderRunner inlines the provider dispatch (mirrors server/src/llm/client.ts). All providers use only fetch with no SDK dependencies, so the inline is 4 small functions totaling ~100 lines. Server LLM client (server/src/llm/client.ts) continues to be the source used by the dashboard API. Issue #240 can evaluate consolidating if this becomes a maintenance concern.
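
To make the inline dispatch concrete, here is a minimal sketch of how the request for each provider could be built before being handed to the built-in fetch. It is not the code in provider-runner.ts: buildRequest, ProviderRequest, and readOpenAIUsage are hypothetical names, the max_tokens value is an arbitrary assumption, and only OpenAI and Anthropic are shown (Gemini and Ollama would be two more cases in the same switch).

```typescript
interface LLMMessage {
  role: "system" | "user" | "assistant";
  content: string; // string only, matching the narrower CLI type noted in this PR
}

interface ProviderRequest {
  url: string;
  headers: Record<string, string>;
  body: Record<string, unknown>;
}

// Hypothetical helper: maps a provider name to the HTTP request it needs.
function buildRequest(
  provider: string,
  model: string,
  apiKey: string,
  messages: LLMMessage[],
): ProviderRequest {
  const jsonHeader = { "Content-Type": "application/json" };
  switch (provider) {
    case "openai":
      return {
        url: "https://api.openai.com/v1/chat/completions",
        headers: { ...jsonHeader, Authorization: `Bearer ${apiKey}` },
        body: { model, messages },
      };
    case "anthropic": {
      // Anthropic takes the system prompt as a top-level field, not as a message.
      const system = messages
        .filter((m) => m.role === "system")
        .map((m) => m.content)
        .join("\n\n");
      return {
        url: "https://api.anthropic.com/v1/messages",
        headers: { ...jsonHeader, "x-api-key": apiKey, "anthropic-version": "2023-06-01" },
        body: {
          model,
          max_tokens: 4096, // assumed value
          system,
          messages: messages.filter((m) => m.role !== "system"),
        },
      };
    }
    default:
      throw new Error(`Unknown LLM provider: ${provider}`);
  }
}

// Token usage may be absent from a response; default both counts to 0.
function readOpenAIUsage(response: {
  usage?: { prompt_tokens?: number; completion_tokens?: number };
}): { inputTokens: number; outputTokens: number } {
  return {
    inputTokens: response.usage?.prompt_tokens ?? 0,
    outputTokens: response.usage?.completion_tokens ?? 0,
  };
}
```

The returned object would then be sent with something like fetch(req.url, { method: "POST", headers: req.headers, body: JSON.stringify(req.body) }), which is why no SDK dependency is needed.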

melagiri and others added 5 commits March 29, 2026 10:04
Defines the abstraction layer between the insights command and LLM backends.
Adding a future runner (e.g. CursorNativeRunner) requires only implementing
this interface — no changes to the calling code.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Executes analysis via `claude -p` non-interactive mode. Uses execFileSync
(not exec) to prevent shell injection. Temp files cleaned up in finally block.
Tokens are 0 — counted as part of the overall Claude Code session.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Delegates analysis to the configured provider (OpenAI, Anthropic, Gemini, Ollama).
Inlines provider dispatch in CLI to avoid a circular dependency with the server
package (server -> cli). All providers use only Node.js built-in fetch.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…prompt-quality

Flat JSON schemas for claude -p --json-schema structured output.
Derived from AnalysisResponse and PromptQualityResponse in prompt-types.ts.
Schema sync test validates required properties match TypeScript types.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds ./analysis/runner-types, ./analysis/native-runner, ./analysis/provider-runner,
and schema exports. Updates build script to copy JSON schema files to dist/.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@melagiri
Owner Author

TA Synthesis (Phase 2): Runner Interface — Round 1

Review of Domain Specialist Comments

Specialist FIX NOW: claude -p --output-format json returns event stream envelope

Verdict: AGREE — CONFIRMED VIA RUNTIME VERIFICATION.

I ran claude -p --output-format json --bare against the actual Claude CLI and confirmed the output is a JSON array of event objects:

[
  {"type": "system", "subtype": "init", "session_id": "...", "tools": [...], ...},
  {"type": "assistant", "message": {"content": [{"type": "text", "text": "...actual LLM output..."}]}, ...},
  {"type": "result", "subtype": "success", "result": "...actual LLM output...", "is_error": false, ...}
]

The native runner currently returns this entire array as rawJson. Downstream consumers expect the raw analysis JSON (e.g., {"summary": {...}, "character": {...}, ...}). This is a breaking bug — the runner will produce unparseable results.

Recommended fix: Parse the JSON envelope and extract the result text. Two options:

| Approach | Pros | Cons |
| --- | --- | --- |
| A: Switch to --output-format text | Simplest change (1 line) | Loses structured error detection (is_error, billing_error, stop_reason) |
| B: Keep json, parse envelope | Enables proper error handling (detect billing errors, timeouts, stop reasons); future-proof | ~15 lines of parsing logic |

I recommend Option B. The envelope contains is_error, stop_reason, and error details that would let the runner throw specific errors (e.g., "Claude billing error" vs "analysis timeout" vs "unexpected stop"). This is especially valuable since execFileSync won't throw on these — Claude exits 0 even on billing errors.

Concrete extraction path: events.find(e => e.type === 'result')?.result for the text, with is_error check first.
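
A rough sketch of Option B, assuming the envelope shape shown in the sample above. The ResultEvent fields are limited to those visible in that sample, and the error wording is illustrative rather than taken from the runner.

```typescript
// Only the fields visible in the sample envelope are modelled here.
interface ResultEvent {
  type: "result";
  subtype?: string;
  result?: string;
  is_error?: boolean;
}

// Type guard so Array.prototype.find returns a typed result event.
function isResultEvent(e: unknown): e is ResultEvent {
  return typeof e === "object" && e !== null && (e as { type?: unknown }).type === "result";
}

function extractResultFromEnvelope(stdout: string): string {
  let events: unknown;
  try {
    events = JSON.parse(stdout);
  } catch {
    throw new Error("claude -p output is not valid JSON");
  }
  if (!Array.isArray(events)) {
    throw new Error("claude -p output is not an array of events");
  }

  const result = events.find(isResultEvent);
  if (!result) {
    throw new Error("claude -p output has no result event");
  }

  // Check is_error before trusting the text: the process can exit 0 on failures.
  if (result.is_error) {
    const detail = result.result ?? result.subtype ?? "unknown error";
    throw new Error(`claude -p reported an error: ${detail}`);
  }
  if (typeof result.result !== "string") {
    throw new Error("claude -p result event has no text");
  }
  return result.result;
}
```

The returned string is what the runner would hand back as rawJson, so downstream parsers see the analysis JSON rather than the event array.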


Consolidated Review (For Dev Agent)

FIX NOW:

  1. [CRITICAL] Native runner must parse --output-format json event envelope. The claude -p --output-format json output is a JSON array of event objects, not raw LLM text. The runner must:
    • Parse the JSON array
    • Find the result event (type === 'result')
    • Check is_error — if true, throw with the error message from result.result
    • If success, return result.result as rawJson
    • Update tests to mock the envelope format (current tests mock raw JSON strings — they should mock the full envelope array)

NOT APPLICABLE:

  1. TA Phase 1: ts/start variables redundant. ts is used for temp file naming, start for duration calculation. They serve different semantic purposes even though they hold the same value. Style preference, not a fix.

  2. TA Phase 1: outcome_satisfaction lacks enum constraint in JSON schema — Constraint is enforced at parse time in response-parsers.ts, which is the correct layer. Enum in schema would cause validation failures for edge-case model responses.

SUGGESTIONS (non-blocking):

  1. Temp file name collision — Add random suffix: ci-prompt-${Date.now()}-${randomSuffix}.txt
  2. Build script require() fragility — Consider cp command instead of node -e with require()
  3. Missing Gemini/Ollama dispatch tests — Different response shapes warrant coverage
  4. Comment on narrower LLMMessage.content type — Explain string-only is intentional (no cache support in CLI)
  5. Schema sync test hardcodes field lists — Pragmatic, documented. No change needed.
  6. Gemini API key in URL — Matches Google's design. Informational only.

Final Verdict

CHANGES REQUIRED — Round 2 needed.

The FIX NOW item is a confirmed runtime bug (verified against actual Claude CLI output). The native runner will silently produce the entire event envelope instead of the analysis JSON. Must be fixed before merge.

melagiri and others added 2 commits March 29, 2026 10:15
claude -p --output-format json returns a JSON array of typed events.
The actual LLM text lives in the result event's result field, not in the
raw output. Extract it via extractResultFromEnvelope(), check is_error,
and throw with a clear message on claude-level failures.

Also adds random suffix to temp file names to prevent concurrent collisions.
Tests updated to mock the envelope format throughout.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds coverage for all four configured providers. Also adds a comment
explaining why LLMMessage.content is intentionally narrower (string only)
than the server type that allows ContentBlock[].

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@melagiri
Owner Author

Review Addressal

FIX NOW items addressed:

  1. Native runner event envelope parsing → Fixed: extractResultFromEnvelope() parses the claude -p --output-format json array, finds the result event, checks is_error, and returns result.result as rawJson. Throws with clear messages for non-JSON output, missing result event, and is_error: true. All native-runner tests updated to mock the envelope format.

Non-blocking suggestions addressed:

  1. Temp file collision → Fixed: file ID is now ${Date.now()}-${Math.random().toString(36).slice(2, 8)}. Test asserts two concurrent calls get different paths.
  2. Gemini + Ollama tests → Added: ProviderRunner.runAnalysis() — Gemini (2 tests) and ProviderRunner.runAnalysis() — Ollama (2 tests). All four providers now have dispatch coverage.
  3. Content type comment → Added: explains content: string is intentionally narrower than server/src/llm/types.ts LLMMessage (which allows ContentBlock[]).
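
The temp file handling can be sketched as follows. The file ID expression and the ci-prompt- prefix come from this thread; everything else is an assumption: tempPromptPath and runWithTempPrompt are hypothetical names, the exec parameter exists only so the sketch can be exercised without the Claude CLI installed, and --system-prompt-file is an assumed flag name, since the PR does not say which flag receives the prompt file.

```typescript
import { execFileSync } from "child_process";
import { existsSync, unlinkSync, writeFileSync } from "fs";
import { tmpdir } from "os";
import { join } from "path";

// Injectable executor so the sketch can run without the real CLI.
type Exec = (file: string, args: string[], input: string) => string;

// execFileSync passes args as an array, so nothing is interpreted by a shell.
const defaultExec: Exec = (file, args, input) =>
  execFileSync(file, args, { input, encoding: "utf8" });

// Timestamp plus random suffix, so two calls in the same millisecond differ.
function tempPromptPath(): string {
  const id = `${Date.now()}-${Math.random().toString(36).slice(2, 8)}`;
  return join(tmpdir(), `ci-prompt-${id}.txt`);
}

function runWithTempPrompt(
  systemPrompt: string,
  conversation: string,
  exec: Exec = defaultExec,
): string {
  const promptPath = tempPromptPath();
  writeFileSync(promptPath, systemPrompt, "utf8");
  try {
    // "--system-prompt-file" is an assumed flag name, used for illustration only.
    const args = ["-p", "--output-format", "json", "--system-prompt-file", promptPath];
    return exec("claude", args, conversation); // conversation is piped via stdin
  } finally {
    // Runs on success and on failure, so the temp file never outlives the call.
    if (existsSync(promptPath)) unlinkSync(promptPath);
  }
}
```

Putting the cleanup in finally means a failing or timed-out claude -p call still removes the prompt file.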

Pre-PR gate: pnpm build passing, pnpm test passing — 975 tests (up from 966), 46 files, zero failures.

All review items addressed. Ready for re-review or merge.

@melagiri
Owner Author

PM Review Summary — PR #246 Ready for Merge

What Shipped

AnalysisRunner abstraction with two implementations, JSON schemas, and full test coverage:

| File | Purpose |
| --- | --- |
| cli/src/analysis/runner-types.ts | AnalysisRunner interface + RunAnalysisParams / RunAnalysisResult |
| cli/src/analysis/native-runner.ts | ClaudeNativeRunner: execFileSync, envelope parsing, temp file cleanup |
| cli/src/analysis/provider-runner.ts | ProviderRunner: all 4 providers (OpenAI, Anthropic, Gemini, Ollama) |
| cli/src/analysis/schemas/session-analysis.json | Flat JSON Schema from AnalysisResponse |
| cli/src/analysis/schemas/prompt-quality.json | Flat JSON Schema from PromptQualityResponse |
| cli/src/analysis/schemas/__tests__/schema-sync.test.ts | Schema ↔ TypeScript property sync (8 tests) |
| cli/src/analysis/native-runner.test.ts | Unit tests for ClaudeNativeRunner |
| cli/src/analysis/provider-runner.test.ts | Unit tests for ProviderRunner (all 4 providers) |

Review Findings Addressed

Critical fix caught by review: ClaudeNativeRunner was returning the raw claude -p --output-format json event envelope instead of the LLM text. Confirmed against actual Claude CLI output — the response is a JSON array of event objects. Fixed via extractResultFromEnvelope() which finds the result event, checks is_error, and returns result.result.

Non-blocking suggestions addressed: random temp file suffix, Gemini+Ollama dispatch tests (4 new tests), content type comment on ProviderRunner.

Design Note on Record

ProviderRunner inlines provider dispatch in CLI (does not import from @code-insights/server) due to confirmed circular dependency (server → cli). Issue #240 tracks potential future consolidation. This was reviewed and accepted.

Gate Status

  • Build: PASS (zero errors)
  • Tests: PASS (975 tests, 46 files, zero failures — up from 966 before this PR)
  • Triple-layer review: complete, all FIX NOW items resolved

This PR is ready for founder merge.

@melagiri
Owner Author

Triple-Layer Code Review — Round 2 of 2

Reviewers

| Role | Domain | Round 1 | Round 2 |
| --- | --- | --- | --- |
| TA (Insider) | Architecture, types, schema | Active: APPROVED | Skipped (no blocking items) |
| Node/CLI Specialist | CLI, providers, exec safety | Active: REQUEST CHANGES | Active: PASS |
| LLM Expert | Prompt quality, token efficiency | Skipped (no LLM changes) | Skipped |

Pre-Review Gates

  • New dependency audit: N/A (no new deps — uses built-in fetch)
  • Functional verification evidence: Build PASS, 966 tests PASS
  • Visual output attached: N/A

Round 1 Issues & Resolution

🔴 FIX NOW (Round 1)

  1. Native runner must parse claude -p --output-format json event envelope: claude -p returns a JSON array of events, not raw LLM text. Runner was returning the entire envelope as rawJson.

    Resolution (Round 2): ✅ FIXED. extractResultFromEnvelope() function added. Parses JSON array, finds result event via type guard, checks is_error, returns result.result. Full test coverage for all error paths (malformed JSON, missing result event, is_error:true).

🟡 SUGGESTIONS (Round 1)

  1. Temp file collision risk → ✅ ADDRESSED (random suffix added)
  2. Missing Gemini/Ollama tests → ✅ ADDRESSED (4 new tests)
  3. Comment on narrower LLMMessage.content → ✅ ADDRESSED

🔵 NOTES

Round 2 Verdict

PASS — Ready for merge. All Round 1 blocking items resolved. All suggestions addressed. No new issues found in Round 2. Envelope parsing is correct with comprehensive error handling.

🤖 Generated with Claude Code

… schema-sync

native-runner: JSON-object-not-array, empty array, error_max_turns subtype
provider-runner: Anthropic system message extraction, missing usage defaults to 0,
  unknown provider throws
schema-sync: friction_points item required fields, effective_patterns item required
  fields, findings item required fields

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@melagiri
Owner Author

Coverage Gap Addressal

9 new tests added across 3 files (975 → 984 passing):

native-runner.test.ts (+3):

  • JSON object (not array) from claude -p → throws /not an array/
  • Empty event array [] → throws /no result event/
  • error_max_turns subtype with is_error: true → throws /claude -p reported an error.*Max turns/

provider-runner.test.ts (+3):

  • Anthropic system message extracted to body.system, not in messages array
  • Missing usage field in OpenAI response → inputTokens: 0, outputTokens: 0
  • Unknown provider string → throws /Unknown LLM provider/

schema-sync.test.ts (+3):

  • friction_points items: required contains category, description, severity, resolution
  • effective_patterns items: required contains category, description, confidence
  • findings items: required contains category, type, description, message_ref, impact, confidence
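
The item-level checks listed above amount to comparing a schema node's required array against a hand-maintained field list. A minimal sketch follows, assuming a simplified schema node type. The schema fragment is illustrative and not copied from session-analysis.json, and missingRequired is a hypothetical helper; only the four friction_points field names come from this comment.

```typescript
// Simplified view of a JSON Schema node, enough for the required-field check.
interface JsonSchemaNode {
  type?: string;
  required?: string[];
  properties?: Record<string, JsonSchemaNode>;
  items?: JsonSchemaNode;
}

// Returns the expected fields that the schema node does not mark as required.
function missingRequired(node: JsonSchemaNode | undefined, expected: string[]): string[] {
  const required = new Set(node?.required ?? []);
  return expected.filter((field) => !required.has(field));
}

// Illustrative fragment only; the real schema file has more properties.
const schemaFragment: JsonSchemaNode = {
  type: "object",
  required: ["friction_points"],
  properties: {
    friction_points: {
      type: "array",
      items: {
        type: "object",
        required: ["category", "description", "severity", "resolution"],
      },
    },
  },
};

// Field list kept by hand next to the TypeScript interface it mirrors.
const FRICTION_POINT_FIELDS = ["category", "description", "severity", "resolution"];

const missingFrictionFields = missingRequired(
  schemaFragment.properties?.friction_points?.items,
  FRICTION_POINT_FIELDS,
);
```

A test would assert that missingFrictionFields is empty, so adding a field to the TypeScript list without updating the schema makes the check fail.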

Pre-PR gate: pnpm build PASS, pnpm test PASS — 984 tests, 46 files, zero failures.

@melagiri melagiri merged commit c6d59e0 into master Mar 29, 2026
2 checks passed
@melagiri melagiri deleted the feature/runner-interface branch March 29, 2026 05:26