feat(graphadapter): Graph Runtime Governance — Framework-Agnostic Control Plane (ADR-003)#76
sergeyenin wants to merge 7 commits into main from
Conversation
…ull test pyramid

Introduce a framework-agnostic /v1/graph/events HTTP endpoint that lets external agent runtimes (LangGraph, LangChain, OpenAI SDK, etc.) emit governance events and receive synchronous control decisions from Talon's policy engine.

Core additions:
- graphadapter package: event contract, decision schema, adapter, HTTP handler
- graph_governance.rego: OPA policy for max_iterations, max_cost_per_run, max_retries_per_node enforcement on graph events
- EvaluateGraphGovernance method on policy.Engine
- Evidence lineage: plan_id, graph_run_id on Evidence/StepEvidence, GraphSummary model with dedicated table
- Server wiring via WithGraphEventsHandler option

Testing (full pyramid):
- Unit: 11 policy-wired tests + 9 original adapter/handler tests (88.4% coverage)
- Integration: 7 Go tests covering the full HTTP lifecycle, tool deny, iteration/cost/retry limits, HTTP validation, and tenant isolation
- Smoke: section 30 (14 sub-tests) exercising the live endpoint with google_search as the only tool, including policy deny paths and evidence verification

Documentation:
- ADR-003: graph runtime governance decision record
- GRAPH_GOVERNANCE_ROADMAP.md: 3-week phased implementation plan
- docs/integration/langchain-langgraph.md: integration guide
- Python SDK + examples (LangGraph, LangChain, notebook)
…s, denied run-end, deep tests

Implements the full ADR-003 delta backlog:

PR1 Immediate Patch:
- D1: Populate GraphRunID/PlanID on evidence and step_evidence SQL INSERTs
- D5: Populate Decision.EvidenceID on all event handler responses
- D6: Fix smoke test section 30 to use Authorization: Bearer auth
- D7: Add graph_governance_test.rego with 14 OPA test rules
- D9: Document /v1/graph/events auth requirement in integration docs
- D11: Mark reserved actions in decision.go and integration docs

PR2 Next Milestone:
- D2: Add graph-specific explanation fact codes (GRAPH_RUN_ALLOWED, etc.) and build deterministic ExplanationFacts on run_end evidence
- D4: Add in-memory run state accumulator; run_end evidence now reflects mid-run denials with PolicyDecision.Allowed=false and FailureReason
- D10: Deep field-level evidence assertions in integration tests, including the new TestGraphAdapter_DeniedRun_EvidenceReflectsDenial
@cursor review
Cursor Bugbot has reviewed your changes and found 3 potential issues.
Bugbot Autofix prepared fixes for all 3 issues found in the latest run.
- ✅ Fixed: Race condition on shared `runState` fields. Added a sync.Mutex to runState and wrapped trackDenial mutations and consumeRunState reads in lock/unlock to prevent concurrent goroutine races on the `denied` and `reasons` fields.
- ✅ Fixed: Missing `MaxRetriesPerNode` field makes config ineffective. Added a `MaxRetriesPerNode int` field with a `json:"max_retries_per_node"` tag to ResourceLimitsConfig so the value is serialized into OPA data and the Rego policy can read it instead of always falling back to the default of 3.
- ✅ Fixed: Adapter `evidenceStore` field is set but never read. Removed the unused evidenceStore field from the Adapter struct and changed the NewAdapter parameter to `_` to maintain API compatibility while eliminating dead code.
Preview (e956e72d13)
diff --git a/docs/contributor/GRAPH_GOVERNANCE_ROADMAP.md b/docs/contributor/GRAPH_GOVERNANCE_ROADMAP.md
new file mode 100644
--- /dev/null
+++ b/docs/contributor/GRAPH_GOVERNANCE_ROADMAP.md
@@ -1,0 +1,123 @@
+# Graph Runtime Governance — Implementation Roadmap
+
+**Status:** Active
+**ADR:** [ADR-003](adr/ADR-003-graph-runtime-governance.md)
+**Target:** Level 4 (Plan-aware) with partial Level 5 (Control plane) capabilities
+
+---
+
+## Week 1 — Runtime Contract + Evidence Lineage
+
+### Deliverables
+
+1. **Graph adapter package** (`internal/agent/graphadapter/`)
+ - Event types: `run_start`, `step_start`, `step_end`, `tool_call`, `retry`, `run_end`
+ - Decision types: `allow`, `deny`, `abort`, `override_model`, `mutate_args`, `require_review`
+ - Adapter bridges events to policy engine and evidence store
+ - HTTP handler for `/v1/graph/events` endpoint
+
+2. **Evidence lineage fields**
+ - `plan_id` and `graph_run_id` on `Evidence`, `StepEvidence`, `GenerateParams`, `StepParams`
+ - `graph_summaries` table with HMAC-signed summary records
+ - Indexes for graph_run_id lookups
+
+3. **Transport contract**
+ - Same JSON payload schema for notebook and standalone usage
+ - Tenant authentication via existing `TenantKeyMiddleware`
+ - Timeout/fail-closed behavior inherited from server config
+
+### Test Strategy
+
+- Unit tests for adapter (all event types, nil policy engine, nil evidence)
+- Unit tests for HTTP handler (validation, error cases)
+- Evidence package build verification (schema migration, new columns)
+
+### Checkpoint
+
+- `go build ./...` succeeds
+- `go test ./internal/agent/graphadapter/... ./internal/policy/... ./internal/evidence/...` passes
+- `/v1/graph/events` endpoint accepts POST, returns decisions
+
+---
+
+## Week 2 — Policy Control Surface + Reference Wrappers
+
+### Deliverables
+
+1. **Graph governance Rego policy** (`rego/graph_governance.rego`)
+ - Step count limits (reuses `max_iterations`)
+ - Cost accumulation limits (reuses `max_cost_per_run`)
+ - Retry governance (`max_retries_per_node`, default 3)
+
+2. **`EvaluateGraphGovernance`** method on policy engine
+ - Input: event_type, step_index, retry_count, cost_so_far, node_id
+ - Output: allow/deny with reasons
+
+3. **Python reference wrappers** (`examples/`)
+ - LangGraph callback adapter
+ - LangChain stateless base-URL example
+ - Notebook-ready snippets
+
+4. **Draft integration docs**
+
+### Test Strategy
+
+- Policy engine tests with graph governance scenarios
+- Integration test: full event sequence through HTTP handler
+- Python examples verified manually (documented expected output)
+
+### Checkpoint
+
+- Graph governance policy fires correctly for over-limit scenarios
+- Reference wrappers demonstrate both LangGraph and LangChain paths
+- Docs draft reviewed
+
+---
+
+## Week 3 — Plan-Aware Governance + Hardening
+
+### Deliverables
+
+1. **Plan review integration with graph runs**
+ - `ProposedSteps` populated from `run_start` event's `planned_steps`
+ - Plan gate can hold graph execution pending approval
+ - Evidence links plan approval -> graph execution -> steps
+
+2. **Step/node-level approval triggers**
+ - High-risk nodes (by policy config) trigger `require_review` decision
+ - Tool approval store integration for graph tool calls
+
+3. **End-to-end tests**
+ - Retry runaway scenario (node retries past limit -> abort)
+ - Budget runaway scenario (cost accumulates past limit -> abort)
+ - Multi-step graph with plan review gate
+ - Notebook session with restart/reconnect
+ - Standalone worker with process restart
+
+4. **Final documentation**
+ - Tested, copy-paste-ready code snippets
+ - "Other supported patterns" section (OpenAI SDK, MCP)
+ - When-to-use guidance
+
+### Test Strategy
+
+- E2E tests in `tests/integration/`
+- Smoke test sections for graph governance
+- Manual verification of Python examples against running `talon serve`
+
+### Checkpoint
+
+- Full event lifecycle: plan -> approve -> run_start -> steps -> run_end with linked evidence
+- Budget/retry abort decisions enforced
+- Docs finalized with no pseudocode
+
+---
+
+## Success Criteria
+
+- [ ] External runtimes emit events to `/v1/graph/events` and receive control decisions
+- [ ] Evidence audit trail links plan_id -> graph_run_id -> correlation_id -> steps
+- [ ] Policy can deny based on step count, cost, and retry limits
+- [ ] Tool access control applies to graph tool_call events
+- [ ] Documentation covers LangGraph, LangChain stateless, OpenAI SDK, and MCP patterns
+- [ ] Both notebook and standalone app modes demonstrated with working code
diff --git a/docs/contributor/adr/ADR-003-graph-runtime-governance.md b/docs/contributor/adr/ADR-003-graph-runtime-governance.md
new file mode 100644
--- /dev/null
+++ b/docs/contributor/adr/ADR-003-graph-runtime-governance.md
@@ -1,0 +1,81 @@
+# ADR-003: Graph Runtime Governance — Framework-Agnostic Control Plane
+
+**Status:** Accepted
+**Date:** 2026-04
+**Context:** Enable Talon to govern external agent runtimes (LangGraph, LangChain, OpenAI SDK, MCP clients) as a true runtime control plane, not just a gateway proxy.
+
+---
+
+## Context
+
+Talon today provides deep governance for its **native Go runner** (policy evaluation, step-level evidence, tool access control, loop containment, plan review gate). However, external agent frameworks like LangGraph, LangChain, CrewAI, or custom OpenAI SDK scripts interact with Talon only through the **LLM gateway proxy** (`/v1/proxy/*`) or **MCP `tools/call`** endpoint.
+
+This means:
+
+- **LangGraph stateful graphs**: Talon sees individual LLM calls but not graph structure, node transitions, retry decisions, or branch paths.
+- **LangChain stateless calls**: Talon can enforce per-request policy but has no session/run-level correlation or lineage from the client side.
+- **Plan review**: Gates individual requests, not multi-step workflows.
+- **Evidence**: Two separate correlation IDs for plan-review vs dispatch execution; no graph-level summary.
+
+### Current integration boundaries
+
+| Surface | What Talon sees | What Talon controls |
+|---------|----------------|-------------------|
+| Native runner (`Runner.Run`) | Full pipeline: policy, PII, tools, steps, evidence | Abort, budget, tool deny, plan gate, hooks |
+| Gateway (`/v1/proxy/*`) | Single LLM request | Policy deny, PII redact, rate limit, model forward |
+| MCP (`/mcp` tools/call) | Single tool invocation | Tool access policy, evidence |
+| External LangGraph | Nothing beyond gateway traffic | Nothing beyond gateway traffic |
+
+### Code touchpoints (as-is)
+
+- Runner pipeline: `internal/agent/runner.go` — `Run()`, `executeLLMPipeline()`, agentic loop
+- Policy engine: `internal/policy/engine.go` — `Evaluate`, `EvaluateToolAccess`, `EvaluateLoopContainment`
+- Evidence: `internal/evidence/generator.go` — `Generate`, `GenerateStep`
+- Plan gate: `internal/agent/plan.go`, `internal/agent/plan_review.go`
+- Hooks: `internal/agent/hooks.go` — `HookPreTool`, `HookPostTool`, etc.
+- Gateway: `internal/gateway/gateway.go` — 10-step proxy pipeline
+- MCP server: `internal/mcp/server.go` — JSON-RPC tools/list + tools/call
+- LangChain pack: `internal/pack/wizard.go` (init template only, no runtime code)
+
+---
+
+## Decisions
+
+### 1. Framework-Agnostic Event Contract
+
+**Decision:** Define a canonical set of governance events (`run_start`, `step_start`, `step_end`, `tool_call`, `retry`, `run_end`) that any external runtime can emit to Talon via HTTP. LangGraph is a flagship use case but the contract is not LangGraph-specific.
+
+**Rationale:** Talon's target market uses diverse frameworks. Coupling to one creates adoption friction and maintenance burden.
+
+### 2. HTTP Control Plane Endpoints
+
+**Decision:** Add `/v1/graph/events` endpoint that accepts governance events and returns control decisions (allow, deny, override_model, mutate_args, require_review, abort). Events carry `graph_run_id`, `session_id`, `node_id`, `step_index`, and state metadata.
+
+**Rationale:** HTTP is the universal transport for notebooks, standalone apps, and microservices. Keeps Talon as a Go binary; client-side integration is a thin HTTP wrapper.
+
+### 3. Evidence Lineage Enhancement
+
+**Decision:** Add `PlanID` and `GraphRunID` fields to `GenerateParams` and `StepParams`. Add a `GraphSummary` evidence record type for run-level graph metadata. Link plan review, execution, and steps through these fields.
+
+**Rationale:** Auditors need one lineage from plan approval through graph execution to individual steps. Current split correlation IDs break this chain.
+
+### 4. Graph-Aware Policy Evaluation
+
+**Decision:** Add `EvaluateGraphGovernance` to the policy engine with Rego policy `graph_governance.rego`. Input includes node metadata, step counts, cost accumulation, retry state, and tool history. Output includes allow/deny plus control actions (model override, retry limit, budget abort).
+
+**Rationale:** Existing `EvaluateLoopContainment` only checks iteration/cost/tool counts. Graph governance needs node-level, retry-aware, and branch-aware decisions.
+
+### 5. Python-First Client SDK
+
+**Decision:** Ship a minimal Python package (`talon-sdk`) that wraps HTTP calls to the graph events endpoint. Provide LangGraph callback adapter and LangChain base-URL configuration as first-class examples.
+
+**Rationale:** LangGraph/LangChain users write Python. A 200-line SDK removes friction vs raw HTTP.
+
+---
+
+## Consequences
+
+- External runtimes gain full governance parity with native runner for policy, evidence, and control.
+- Evidence store grows by one table (`graph_summaries`) and two columns on existing tables.
+- New Rego policy file adds graph-specific deny rules without changing existing policy behavior.
+- Python SDK is out-of-tree but documented alongside Go binary releases.
diff --git a/docs/integration/langchain-langgraph.md b/docs/integration/langchain-langgraph.md
new file mode 100644
--- /dev/null
+++ b/docs/integration/langchain-langgraph.md
@@ -1,0 +1,403 @@
+# Integrating Talon with LangChain and LangGraph
+
+Talon governs AI agent execution with policy enforcement, PII detection, cost
+control, and signed audit trails. This guide covers two integration tracks and
+shows how to use each from notebooks and standalone applications.
+
+---
+
+## Architecture Overview
+
+```
+┌──────────────────┐ ┌───────────────────────────────────────┐
+│ Your Agent Code │ │ Talon Server │
+│ │ │ │
+│ LangChain ──────┼──→──┤ /v1/proxy/openai (Gateway Proxy) │
+│ (stateless) │ │ ↓ PII scan → Policy → Route → │
+│ │ │ Evidence → Forward to LLM │
+│ LangGraph ──────┼──→──┤ │
+│ (stateful) │ │ /v1/graph/events (Graph Events) │
+│ │ │ ↓ Policy → Evidence → Decision │
+│ OpenAI SDK ─────┼──→──┤ │
+│ MCP clients ────┼──→──┤ /mcp (MCP tools/call) │
+└──────────────────┘ └───────────────────────────────────────┘
+```
+
+---
+
+## Track 1: LangChain Stateless — Gateway Proxy
+
+The simplest integration. Point LangChain's `base_url` at Talon's
+OpenAI-compatible proxy. No SDK, no code changes beyond the URL.
+
+### What Talon handles automatically
+
+- PII detection and optional redaction on input and output
+- Policy evaluation (cost limits, rate limits, time restrictions)
+- Model routing with EU sovereignty enforcement
+- Cost tracking per tenant/agent
+- HMAC-signed evidence record per request
+
+### Notebook usage (Jupyter / Colab)
+
+```python
+# Cell 1: Install
+# !pip install langchain-openai
+
+# Cell 2: Configure and call
+import os
+from langchain_openai import ChatOpenAI
+
+llm = ChatOpenAI(
+ model="gpt-4o-mini",
+ temperature=0,
+ base_url="http://localhost:8080/v1/proxy/openai",
+ api_key=os.environ.get("TALON_CALLER_KEY", "your-caller-key"),
+ default_headers={
+ "X-Talon-Session-ID": "notebook-session-1",
+ },
+)
+
+response = llm.invoke("Summarize EU AI Act requirements for SMBs.")
+print(response.content)
+```
+
+### Standalone application usage
+
+```python
+import os
+from langchain_openai import ChatOpenAI
+
+llm = ChatOpenAI(
+ model="gpt-4o-mini",
+ temperature=0,
+ base_url=os.environ["TALON_URL"] + "/v1/proxy/openai",
+ api_key=os.environ["TALON_CALLER_KEY"],
+ default_headers={
+ "X-Talon-Session-ID": "worker-session-1",
+ "X-Talon-Reasoning": "batch-summarization",
+ },
+)
+
+response = llm.invoke("What are the key DORA requirements?")
+print(response.content)
+```
+
+### Expected evidence output
+
+Each request creates one evidence record with:
+
+```json
+{
+ "id": "req_abc123",
+ "correlation_id": "gw_xyz789",
+ "session_id": "notebook-session-1",
+ "tenant_id": "default",
+ "invocation_type": "gateway",
+ "policy_decision": {"allowed": true, "action": "allow"},
+ "execution": {
+ "model_used": "gpt-4o-mini",
+ "cost": 0.0003,
+ "duration_ms": 1200
+ },
+ "classification": {"input_tier": 0, "pii_detected": []}
+}
+```
+
+### Failure behavior
+
+- **Policy deny**: HTTP 403 with `{"error": "policy denied: daily limit exceeded"}`
+- **PII blocked**: HTTP 403 with `{"error": "PII detected in input"}`
+- **Rate limited**: HTTP 429 with retry-after header
+
+---
+
+## Track 2: LangGraph Stateful — Graph Events API
+
+For multi-step agents that need per-step governance, retry control,
+and graph-level evidence lineage.
+
+### Authentication
+
+The `/v1/graph/events` endpoint is protected by tenant key authentication.
+When `tenant_keys` are configured in `talon.config.yaml`, requests must
+include `Authorization: Bearer <tenant_key>`. In dev mode (no tenant keys
+configured), the endpoint is open.
+
+The Python SDK handles this automatically when you pass `tenant_key`:
+
+```python
+talon = TalonClient("http://localhost:8080", tenant_key="your-tenant-key")
+# All requests include: Authorization: Bearer your-tenant-key
+```
+
+For raw HTTP calls (curl, requests):
+
+```bash
+curl -X POST http://localhost:8080/v1/graph/events \
+ -H "Authorization: Bearer your-tenant-key" \
+ -H "Content-Type: application/json" \
+ -d '{"type": "run_start", "graph_run_id": "gr_001", ...}'
+```
+
+### Setup
+
+```python
+# !pip install langgraph langchain-openai requests
+
+# Copy talon_sdk.py from examples/langchain-integration/
+from talon_sdk import TalonClient
+
+talon = TalonClient(
+ base_url="http://localhost:8080",
+ tenant_key="your-tenant-key",
+)
+```
+
+### Event lifecycle
+
+```
+run_start ──→ step_start ──→ [tool_call] ──→ step_end ──→ ... ──→ run_end
+ │
+ └──→ [retry] (on failure)
+```
+
+Each event returns a Decision:
+
+```json
+{
+ "action": "allow",
+ "allowed": true,
+ "reasons": [],
+ "evidence_id": "ev_abc123"
+}
+```
+
+Currently emitted actions: `allow`, `deny`. The following actions are
+reserved for Phase 2 and not yet emitted by the adapter: `abort`,
+`override_model`, `mutate_args`, `require_review`, `retry`.
+
+The `evidence_id` field is populated when the evidence store is configured,
+linking the decision to its audit record.
+
+### Notebook usage (Jupyter / Colab)
+
+```python
+import time
+from talon_sdk import TalonClient
+from langchain_openai import ChatOpenAI
+from langgraph.graph import StateGraph, END
+from typing import TypedDict
+
+talon = TalonClient("http://localhost:8080", tenant_key="your-key")
+
+class State(TypedDict):
+    query: str
+    result: str
+    _run_id: str  # threaded through state so nodes can report governance events
+
+def search(state: State) -> State:
+ dec = talon.tool_call(state["_run_id"], "agent", 0, "web_search",
+ {"query": state["query"]})
+ if not dec["allowed"]:
+ raise RuntimeError(f"Denied: {dec['reasons']}")
+ return {**state, "result": f"Found: {state['query']}"}
+
+def answer(state: State) -> State:
+ llm = ChatOpenAI(model="gpt-4o-mini")
+ resp = llm.invoke(f"Answer from: {state['result']}")
+ return {**state, "result": resp.content}
+
+graph = StateGraph(State)
+graph.add_node("search", search)
+graph.add_node("answer", answer)
+graph.set_entry_point("search")
+graph.add_edge("search", "answer")
+graph.add_edge("answer", END)
+app = graph.compile()
+
+# Governed execution
+run_id = talon.new_run_id()
+talon.run_start(run_id, "agent", framework="langgraph", node_count=2)
+
+talon.step_start(run_id, "agent", 0, "search", node_type="tool")
+result = app.invoke({"query": "EU compliance 2026", "_run_id": run_id})
+talon.step_end(run_id, "agent", 0)
+
+talon.step_start(run_id, "agent", 1, "answer", node_type="llm")
+talon.step_end(run_id, "agent", 1, cost=0.001)
+
+talon.run_end(run_id, "agent", total_cost=0.001)
+print(result["result"])
+```
+
+### Standalone application usage
+
+```python
+import os
+import time
+from talon_sdk import TalonClient
+
+talon = TalonClient(
+ base_url=os.environ["TALON_URL"],
+ tenant_key=os.environ["TALON_TENANT_KEY"],
+)
+
+def governed_pipeline(query: str):
+ run_id = talon.new_run_id()
+
+ dec = talon.run_start(run_id, "pipeline-agent", framework="langgraph",
+ node_count=3, planned_steps=["fetch", "process", "store"])
+ if not dec["allowed"]:
+ return {"error": dec["reasons"]}
+
+ total_cost = 0.0
+ start = time.time()
+
+ for i, step_name in enumerate(["fetch", "process", "store"]):
+ dec = talon.step_start(run_id, "pipeline-agent", i, step_name)
+ if not dec["allowed"]:
+ talon.run_end(run_id, "pipeline-agent", status="aborted")
+ return {"error": f"Step {step_name} denied"}
+
+ # ... execute step logic ...
+ step_cost = 0.001
+ total_cost += step_cost
+ talon.step_end(run_id, "pipeline-agent", i, cost=step_cost)
+
+ duration_ms = int((time.time() - start) * 1000)
+ talon.run_end(run_id, "pipeline-agent", total_cost=total_cost, duration_ms=duration_ms)
+ return {"status": "completed", "run_id": run_id}
+
+if __name__ == "__main__":
+ result = governed_pipeline("Process Q1 compliance data")
+ print(result)
+```
+
+### Expected evidence output
+
+Graph events produce both step-level and run-level evidence:
+
+```json
+{
+ "id": "req_run123",
+ "correlation_id": "gr_abc12345678",
+ "graph_run_id": "gr_abc12345678",
+ "invocation_type": "graph_run",
+ "execution": {
+ "cost": 0.003,
+ "duration_ms": 4500,
+ "tools_called": ["web_search"]
+ }
+}
+```
+
+Step evidence is linked by `correlation_id` = `graph_run_id`:
+
+```json
+{
+ "id": "step_xyz456",
+ "correlation_id": "gr_abc12345678",
+ "step_index": 0,
+ "type": "tool_call",
+ "tool_name": "web_search",
+ "status": "completed"
+}
+```
+
+### Failure and deny behavior
+
+- **Step denied**: Decision has `{"allowed": false, "action": "deny", "reasons": [...]}`
+- **Retry limit exceeded**: Decision has `{"allowed": false, "reasons": ["retry_count 4 exceeds max_retries_per_node 3"]}`
+- **Budget exceeded mid-run**: Decision has `{"allowed": false, "reasons": ["cost_so_far 5.0001 exceeds max_cost_per_run 5.0000"]}`
+- **Tool blocked**: Tool-specific deny from OPA tool_access policy
+
+The external runtime **must** respect deny/abort decisions and stop execution.
+
+---
+
+## Other Supported Patterns
+
+### OpenAI SDK (Python)
+
+```python
+from openai import OpenAI
+
+client = OpenAI(
+ base_url="http://localhost:8080/v1/proxy/openai",
+ api_key="your-caller-key",
+)
+resp = client.chat.completions.create(
+ model="gpt-4o-mini",
+ messages=[{"role": "user", "content": "Hello"}],
+)
+```
+
+### MCP Tool Invocation
+
+```bash
+curl -X POST http://localhost:8080/mcp \
+ -H "Authorization: Bearer your-tenant-key" \
+ -H "Content-Type: application/json" \
+ -d '{
+ "jsonrpc": "2.0",
+ "method": "tools/call",
+ "params": {"name": "web_search", "arguments": {"query": "test"}},
+ "id": 1
+ }'
+```
+
+### When to use which pattern
+
+| Scenario | Pattern | Why |
+|----------|---------|-----|
+| Single LLM call | Gateway proxy | Zero friction, automatic governance |
+| LangChain chain | Gateway proxy | Each LLM call governed individually |
+| LangGraph graph | Graph events | Step-level control, lineage, retry governance |
+| Custom multi-step agent | Graph events | Full lifecycle control |
+| MCP tool execution | MCP endpoint | Native tool governance |
+| Quick PoC / demo | Gateway proxy | Fastest to set up |
+| EU AI Act compliance audit | Graph events | Full transparency and traceability |
+
+---
+
+## Configuration
+
+### `.talon.yaml` policy for graph-governed agents
+
+```yaml
+agent:
+ name: my-graph-agent
+ model_tier: 1
+
+policies:
+ cost_limits:
+ per_request: 2.0
+ daily: 50.0
+ monthly: 500.0
+ resource_limits:
+ max_iterations: 20 # max graph steps
+ max_cost_per_run: 5.0 # abort if cost exceeds
+ max_retries_per_node: 3 # retry governance
+ rate_limits:
+ requests_per_minute: 60
+
+capabilities:
+ allowed_tools:
+ - web_search
+ - calculator
+ - sql_database_query
+
+compliance:
+ frameworks: [gdpr, eu-ai-act]
+ human_oversight: on-demand
+```
+
+### `talon.config.yaml` server settings
+
+```yaml
+server:
+ port: 8080
+ admin_key: "your-admin-key"
+ tenant_keys:
+ default: "your-tenant-key"
+```
diff --git a/examples/langchain-integration/README.md b/examples/langchain-integration/README.md
new file mode 100644
--- /dev/null
+++ b/examples/langchain-integration/README.md
@@ -1,0 +1,74 @@
+# LangChain / LangGraph Integration Examples
+
+Talon can govern LangChain and LangGraph agents through two mechanisms:
+
+## 1. Gateway Proxy (simplest, no SDK)
+
+Point your LangChain `base_url` at Talon's OpenAI-compatible proxy. Talon
+automatically applies PII detection, policy evaluation, cost tracking, model
+routing, and evidence generation.
+
+```python
+from langchain_openai import ChatOpenAI
+
+llm = ChatOpenAI(
+ model="gpt-4o-mini",
+ base_url="http://localhost:8080/v1/proxy/openai",
+ api_key="your-caller-key",
+)
+response = llm.invoke("What is GDPR Article 30?")
+```
+
+Best for: **single LLM calls, stateless usage, quick integration**.
+
+## 2. Graph Events API (full governance)
+
+Send lifecycle events to `/v1/graph/events` for step-level control,
+retry governance, and evidence lineage across multi-step workflows.
+
+**Authentication:** When `tenant_keys` are configured in
+`talon.config.yaml`, requests require `Authorization: Bearer <tenant_key>`.
+The Python SDK sets this automatically when you pass `tenant_key`.
+
+```python
+from talon_sdk import TalonClient
+
+talon = TalonClient("http://localhost:8080", tenant_key="your-key")
+run_id = talon.new_run_id()
+talon.run_start(run_id, "my-agent", framework="langgraph")
+talon.step_start(run_id, "my-agent", 0, "search_node")
+# ... execute node ...
+talon.step_end(run_id, "my-agent", 0, cost=0.001)
+talon.run_end(run_id, "my-agent", total_cost=0.001)
+```
+
+Best for: **LangGraph stateful graphs, multi-step agents, compliance-heavy**.
+
+## Files
+
+| File | Description |
+|------|-------------|
+| `talon_sdk.py` | Lightweight Python client for graph governance events |
+| `langchain_stateless.py` | Single LLM call via gateway proxy + optional events |
+| `langgraph_stateful.py` | Multi-step LangGraph agent with per-step governance |
+| `notebook_example.py` | Colab/Jupyter-ready cells for both patterns |
+
+## Other Supported Patterns
+
+Talon is framework-agnostic. The same governance applies to:
+
+- **OpenAI SDK**: Point `base_url` at `http://localhost:8080/v1/proxy/openai`
+- **Anthropic SDK**: Use `http://localhost:8080/v1/proxy/anthropic`
+- **MCP clients**: Send tool calls to `POST /mcp` (JSON-RPC 2.0)
+- **Custom agents**: Use the graph events API with any HTTP client
+
+## When to Use Which
+
+| Scenario | Recommended Pattern |
+|----------|-------------------|
+| Single LLM call from notebook | Gateway proxy |
+| LangChain chain/pipeline | Gateway proxy |
+| LangGraph multi-step graph | Graph events API |
+| Custom agent with tool calls | Graph events API |
+| Quick proof-of-concept | Gateway proxy |
+| EU AI Act compliance audit | Graph events API |
diff --git a/examples/langchain-integration/langchain_stateless.py b/examples/langchain-integration/langchain_stateless.py
new file mode 100644
--- /dev/null
+++ b/examples/langchain-integration/langchain_stateless.py
@@ -1,0 +1,104 @@
+"""
+LangChain Stateless + Talon Governance — Single LLM Call
+
+Demonstrates the simplest integration: a single LangChain LLM call
+governed by Talon. This uses Talon as an OpenAI-compatible gateway
+so LangChain's base_url points to Talon's proxy endpoint.
+
+No graph events needed — Talon's gateway pipeline handles policy,
+PII detection, cost tracking, and evidence generation automatically.
+
+Works in notebooks and standalone scripts.
+
+Prerequisites:
+ pip install langchain-openai requests
+ export TALON_URL=http://localhost:8080
+ export TALON_CALLER_KEY=your-caller-api-key
+ export OPENAI_API_KEY=your-openai-key # stored in Talon vault
+
+ # Start Talon with gateway:
+ talon serve --gateway --port 8080
+"""
+
+import os
+
+from langchain_openai import ChatOpenAI
+
+
+def run_stateless_call():
+ """Single governed LLM call through Talon gateway."""
+
+ talon_url = os.environ.get("TALON_URL", "http://localhost:8080")
+ caller_key = os.environ.get("TALON_CALLER_KEY", "")
+
+ # Point LangChain at Talon's OpenAI-compatible proxy.
+ # Talon handles: PII detection, policy evaluation, cost tracking,
+ # model routing, evidence generation — all transparently.
+ llm = ChatOpenAI(
+ model="gpt-4o-mini",
+ temperature=0,
+ base_url=f"{talon_url}/v1/proxy/openai",
+ api_key=caller_key,
+ default_headers={
+ "X-Talon-Session-ID": "notebook-session-1",
+ "X-Talon-Reasoning": "stateless-langchain-example",
+ },
+ )
+
+ response = llm.invoke("Summarize the key requirements of the EU AI Act for SMBs.")
+
+ print(f"Response: {response.content}")
+ print(f"Model: {response.response_metadata.get('model_name', 'unknown')}")
+
+ return response
+
+
+def run_with_governance_events():
+ """Single LLM call with explicit Talon governance events.
+
+ Use this pattern when you want step-level evidence and control
+ beyond what the gateway proxy provides automatically.
+ """
+ from talon_sdk import TalonClient
+
+ talon = TalonClient(
+ base_url=os.environ.get("TALON_URL", "http://localhost:8080"),
+ tenant_key=os.environ.get("TALON_TENANT_KEY", ""),
+ )
+
+ run_id = talon.new_run_id()
+
+ # Notify Talon (even for a single-step "run")
+ dec = talon.run_start(
+ graph_run_id=run_id,
+ agent_id="summarizer",
+ framework="langchain",
+ model="gpt-4o-mini",
+ node_count=1,
+ planned_steps=["llm_call"],
+ )
+ if not dec["allowed"]:
+ print(f"Denied: {dec.get('reasons', [])}")
+ return None
+
+ talon.step_start(run_id, "summarizer", 0, "llm_call", node_type="llm", model="gpt-4o-mini")
+
+ llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
+ response = llm.invoke("What are DORA requirements for ICT risk management?")
+
+ talon.step_end(run_id, "summarizer", 0, status="completed", cost=0.0005, duration_ms=800)
+ talon.run_end(run_id, "summarizer", status="completed", total_cost=0.0005, duration_ms=800)
... diff truncated: showing 800 of 4278 lines
```go
	rs := val.(*runState)
	rs.denied = true
	rs.reasons = append(rs.reasons, reasons...)
}
```
Race condition on shared runState fields
High Severity
trackDenial retrieves a *runState from sync.Map via LoadOrStore, then mutates rs.denied and rs.reasons without any lock. When concurrent HTTP requests arrive for the same graphRunID (realistic for parallel graph nodes), multiple goroutines race on the same *runState pointer. The append on rs.reasons is particularly dangerous — concurrent appends to the same slice can corrupt data or panic. A mutex is needed on runState to protect its fields.
```rego
		input.retry_count,
		max_retries,
	])
}
```
Missing MaxRetriesPerNode field makes config ineffective
Medium Severity
ResourceLimitsConfig has no MaxRetriesPerNode field, so any max_retries_per_node value set in .talon.yaml is silently dropped during YAML/JSON parsing. The Rego rule uses object.get(rl, "max_retries_per_node", 3) which always falls back to the default of 3, making the retry limit unconfigurable despite being documented as a policy knob in the ADR, integration docs, and smoke tests.
```go
type Adapter struct {
	policyEngine  *policy.Engine
	evidenceGen   *evidence.Generator
	evidenceStore *evidence.Store
```
Adapter evidenceStore field is set but never read
Low Severity
The evidenceStore *evidence.Store field on Adapter is assigned in NewAdapter but never referenced in any method. All evidence operations go through a.evidenceGen instead. This is unused dead code that adds a misleading dependency and clutters the struct.
…ode, unused evidenceStore

- Add sync.Mutex to runState and lock around trackDenial mutations to prevent data races when concurrent graph nodes report denials for the same run. consumeRunState now takes a snapshot under lock.
- Add MaxRetriesPerNode field to ResourceLimitsConfig so the value from .talon.yaml is serialized into OPA data, making the Rego max_retries_per_node policy knob actually configurable.
- Remove unused evidenceStore field from Adapter struct (all evidence ops go through evidenceGen). Keep the parameter in NewAdapter for API compatibility but ignore it.
…, tighten test budgets

- Wire WithGraphEventsHandler in serve.go so /v1/graph/events returns 200 instead of 404 (fixes all section 30 smoke failures).
- Add session_id and correlation_id to evidence Index struct and toIndex() so the plan_dispatch smoke test can match dispatched evidence by session (fixes section 24 failure).
- Add wait_port_free before talon serve in section 12 to prevent stale-process 500 errors when port 8080 is occupied.
- Enable WAL mode + busy_timeout on cache SQLite to reduce flaky cache-hit failures across sequential CLI invocations.
- Add smoke_tighten_limits helper (per_request 0.50, daily 5, monthly 50, agent_total 5m) and call it from all 27 init-based smoke sections to cap runaway costs during testing.
Align section 24 evidence matching with session continuity, make section 30 load the active policy file, and propagate session IDs across graph lifecycle events so that governance assertions and exported evidence stay consistent.
…h events

Document shared session_id propagation for graph lifecycle events and update integration examples plus the Python SDK so documented calls match current governance and evidence continuity behavior.


Summary
- HTTP endpoint (`/v1/graph/events`) that lets external agent runtimes (LangGraph, LangChain, custom) send governance events to Talon and receive control decisions (allow/deny) with evidence lineage
- OPA policy (`graph_governance.rego`) enforcing `max_iterations`, `max_cost_per_run`, `max_retries_per_node`, and tool allowlists at every lifecycle point
- Deterministic explanation fact codes (`GRAPH_RUN_ALLOWED`, `GRAPH_ITERATION_LIMIT_DENY`, etc.), an in-memory run state accumulator that reflects mid-run denials on `run_end` evidence, and full lineage fields (`GraphRunID`, `PlanID`) populated on both evidence and step_evidence SQL records
- Python SDK (`talon_sdk.py`), examples for stateful LangGraph and stateless LangChain, a Jupyter/Colab-ready notebook, and comprehensive integration docs with auth requirements

Changes
Core implementation (commit 1)
- `internal/agent/graphadapter/` — Adapter, Handler, Event/Decision contracts, OPA + tool access policy integration
- `internal/policy/engine.go` — `EvaluateGraphGovernance` method
- `internal/policy/rego/graph_governance.rego` — Graph-specific Rego deny rules
- `internal/evidence/` — `GraphRunID`/`PlanID` fields on structs, `graph_summaries` table, SQL schema migration
- `internal/server/server.go` — `/v1/graph/events` endpoint registration under tenant auth

Delta improvements (commit 2)
- `GraphRunID`/`PlanID` populated on SQL INSERT for evidence + step_evidence
- `run_end` reflects mid-run denials
- `Decision.EvidenceID` populated on all responses
- Endpoint auth fixed and documented (`Authorization: Bearer`)
- OPA tests for `graph_governance.rego`

Tests
- Unit: `graphadapter` (adapter, handler, policy, evidence ID, lineage)
- Integration: `/v1/graph/events` over HTTP with Bearer auth

Test plan
- `make test-all` — all unit + integration + e2e tests pass
- `make lint` — 0 issues
- `make check` — all green
- `opa test` — 14/14 Rego tests pass
- `bash tests/smoke_test.sh`
- `curl` lifecycle against a running `talon serve` instance

Note
Medium Risk
Adds a new authenticated runtime-control endpoint (`/v1/graph/events`) plus new OPA policy evaluation and evidence schema changes; failures could impact external orchestrator integrations and evidence storage/migrations.

Overview
Introduces a framework-agnostic graph runtime governance control plane: external runtimes can POST lifecycle events to the new `POST /v1/graph/events` endpoint and receive synchronous allow/deny decisions, with adapter logic recording step/run evidence and carrying forward mid-run denials to the final `run_end` evidence.

Extends policy and evidence layers to support graph runs: adds `graph_governance.rego` + `EvaluateGraphGovernance` enforcing iteration/cost/retry limits (and tool allowlist via existing tool access checks), adds `plan_id`/`graph_run_id` lineage fields to evidence records with new indexes and a new signed `graph_summaries` table, and expands deterministic explanation codes for graph outcomes.

Adds comprehensive coverage and integration material: unit/integration tests for event handling and policy limits, a new smoke test section for graph events, and Python reference client/examples plus integration docs and ADR/roadmap.
Written by Cursor Bugbot for commit 4c37a6b.