dgenio · Mar 4, 2026 · Mar 2, 2026
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
@@ -0,0 +1,46 @@
+name: CI
+
+on:
+  push:
+    branches: ["main", "copilot/**"]
+  pull_request:
+    branches: ["main"]
+
+jobs:
+  test:
+    name: "Python ${{ matrix.python-version }}"
+    runs-on: ubuntu-latest
+    permissions:
+      contents: read
+    strategy:
+      matrix:
+        python-version: ["3.10", "3.11", "3.12"]
+
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Set up Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v5
+        with:
+          python-version: ${{ matrix.python-version }}
+
+      - name: Install dependencies
+        run: pip install -e ".[dev]"
+
+      - name: Lint (ruff check)
+        run: ruff check src/ tests/ examples/
+
+      - name: Format check (ruff format)
+        run: ruff format --check src/ tests/ examples/
+
+      - name: Type check (mypy)
+        run: mypy src/
+
+      - name: Test (pytest)
+        run: python -m pytest -q --cov=agent_kernel --cov-report=term-missing
+
+      - name: Examples
+        run: |
+          python examples/basic_cli.py
+          python examples/billing_demo.py
+          python examples/http_driver_demo.py
diff --git a/AGENTS.md b/AGENTS.md
@@ -0,0 +1,38 @@
+# AGENTS.md — AI Agent Instructions
+
+This file provides instructions for AI coding agents (Copilot, Cursor, etc.) working in this repository.
+
+## Repo layout
+
+```
+src/agent_kernel/   — library source (one module per concern, ≤300 lines each)
+tests/              — pytest test suite
+examples/           — runnable demos (no internet required)
+docs/               — architecture and security documentation
+```
+
+## Quality bar
+
+- `make ci` must pass before every commit.
+- All public interfaces need type hints and docstrings.
+- Use custom exceptions from `errors.py` — never bare `ValueError` or `KeyError`.
+- Keep modules ≤ 300 lines. Split if needed.
+- No randomness in matching, routing, or summarization (deterministic outputs).
+
+## Security rules
+
+- Never log or print secret key material.
+- HMAC secrets come from `AGENT_KERNEL_SECRET` env var; fall back to a random dev secret with a logged warning.
+- Tokens are tamper-evident (HMAC-SHA256) but not encrypted — document this.
+- Confused-deputy prevention: tokens bind to `principal_id + capability_id + constraints`.
+
+## Adding a new capability driver
+
+1. Implement the `Driver` protocol in `src/agent_kernel/drivers/`.
+2. Register it with `StaticRouter` or implement a custom `Router`.
+3. Add integration tests in `tests/test_drivers.py`.
+
+## Adding a new policy rule
+
+1. Add the rule to `DefaultPolicyEngine.evaluate()` in `policy.py`.
+2. Cover it with a test in `tests/test_policy.py`.
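
Note on the security rules in AGENTS.md: the binding of a token to `principal_id + capability_id + constraints` can be illustrated with the standard library alone. This is a minimal sketch, not the `HMACTokenProvider` from this PR. The helper names (`load_secret`, `sign`, `verify`) and the canonical-JSON payload are assumptions made for illustration, and expiry handling is left out.

```python
import hashlib
import hmac
import json
import logging
import os
import secrets

logger = logging.getLogger("token_sketch")


def load_secret() -> bytes:
    """Read AGENT_KERNEL_SECRET, else fall back to a random per-process dev secret."""
    value = os.environ.get("AGENT_KERNEL_SECRET")
    if value:
        return value.encode("utf-8")
    # Warn about the fallback, but never log the secret itself.
    logger.warning("AGENT_KERNEL_SECRET is unset; using a random dev secret")
    return secrets.token_bytes(32)


def _payload(principal_id: str, capability_id: str, constraints: dict) -> bytes:
    # Hypothetical encoding: canonical JSON (sorted keys, fixed separators)
    # so identical claims always produce identical bytes to sign.
    claims = {
        "principal_id": principal_id,
        "capability_id": capability_id,
        "constraints": constraints,
    }
    return json.dumps(claims, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(secret: bytes, principal_id: str, capability_id: str, constraints: dict) -> str:
    """Return a hex HMAC-SHA256 signature over the bound claims."""
    payload = _payload(principal_id, capability_id, constraints)
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def verify(
    secret: bytes,
    signature: str,
    principal_id: str,
    capability_id: str,
    constraints: dict,
) -> bool:
    """Recompute the signature and compare in constant time."""
    expected = sign(secret, principal_id, capability_id, constraints)
    return hmac.compare_digest(expected, signature)
```

Because the signature covers all three claims, a token issued to one principal for one capability fails verification when replayed by another principal or against another capability, which is the confused-deputy property the rules describe. The payload stays readable to anyone holding the token, so it is tamper-evident but not confidential.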
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -0,0 +1,20 @@
+# Changelog
+
+All notable changes to this project will be documented in this file.
+
+The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
+and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+
+## [Unreleased]
+
+## [0.1.0] - 2024-01-01
+
+### Added
+- Initial scaffold: `CapabilityRegistry`, `PolicyEngine`, `HMACTokenProvider`, `Kernel`.
+- `InMemoryDriver` and `HTTPDriver` (httpx-based).
+- Context `Firewall` with `Budgets`, redaction, and summarization.
+- `HandleStore` with TTL, pagination, field selection, and basic filtering.
+- `TraceStore` and `explain()` for full audit trail.
+- Examples: `basic_cli.py`, `billing_demo.py`, `http_driver_demo.py`.
+- Documentation: architecture, security model, integrations, capabilities, context firewall.
+- CI pipeline for Python 3.10, 3.11, 3.12 with ruff + mypy + pytest.
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -0,0 +1,36 @@
+# Contributing to agent-kernel
+
+Thank you for your interest in contributing!
+
+## Development setup
+
+```bash
+git clone https://github.com/dgenio/agent-kernel.git
+cd agent-kernel
+python -m venv .venv
+source .venv/bin/activate
+pip install -e ".[dev]"
+```
+
+## Running checks
+
+```bash
+make fmt    # auto-format with ruff
+make lint   # lint with ruff
+make type   # type-check with mypy
+make test   # run pytest with coverage
+make ci     # all of the above + examples
+```
+
+## Pull request guidelines
+
+1. Keep PRs focused — one logical change per PR.
+2. Add or update tests for every behaviour change.
+3. All checks in `make ci` must pass.
+4. Follow the existing code style (ruff-enforced).
+5. Write docstrings on all public interfaces.
+
+## Security
+
+Please report security vulnerabilities privately via GitHub Security Advisories.
+Do **not** open a public issue for a security bug.
diff --git a/Makefile b/Makefile
@@ -0,0 +1,20 @@
+.PHONY: fmt lint type test example ci
+
+fmt:
+	ruff format src/ tests/ examples/
+
+lint:
+	ruff check src/ tests/ examples/
+
+type:
+	mypy src/
+
+test:
+	python -m pytest -q --cov=agent_kernel
+
+example:
+	python examples/basic_cli.py
+	python examples/billing_demo.py
+	python examples/http_driver_demo.py
+
+ci: fmt lint type test example
diff --git a/README.md b/README.md
@@ -1,2 +1,140 @@
 # agent-kernel
-Python library implementing a capability-based security kernel for AI agents operating in large tool ecosystems (MCP, A2A). Provides capability tokens, HMAC-signed authorization, policy engine, context firewall with budget enforcement, and pluggable drivers — so agents can safely use 1000+ tools without context blowup.
+
+[![CI](https://github.com/dgenio/agent-kernel/actions/workflows/ci.yml/badge.svg)](https://github.com/dgenio/agent-kernel/actions/workflows/ci.yml)
+[![Python 3.10+](https://img.shields.io/badge/python-3.10%2B-blue.svg)](https://www.python.org/)
+[![License: Apache 2.0](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](LICENSE)
+
+A capability-based security kernel for AI agents operating in large tool ecosystems (MCP, A2A, 1000+ tools).
+
+## 30-second pitch
+
+Modern AI agents face three hard problems when given access to hundreds or thousands of tools:
+
+1. **Context blowup** — raw tool output floods the LLM context window.
+2. **Tool-space interference** — agents accidentally invoke the wrong tool or escalate privileges.
+3. **No audit trail** — there's no record of what ran, when, and why.
+
+`agent-kernel` solves all three with a thin, composable layer that sits above your tool execution layer:
+
+- **Capability Tokens** — HMAC-signed, time-bounded, principal-scoped. No token → no execution.
+- **Policy Engine** — READ/WRITE/DESTRUCTIVE safety classes + PII/PCI sensitivity handling.
+- **Context Firewall** — raw driver output is *never* returned to the LLM; always a bounded `Frame`.
+- **Audit Trail** — every invocation creates an `ActionTrace` retrievable via `kernel.explain()`.
+
+## Architecture
+
+```mermaid
+graph LR
+    LLM["LLM / Agent"] -->|goal| K["Kernel"]
+    K -->|search| REG["Registry"]
+    K -->|evaluate| POL["Policy Engine"]
+    K -->|sign| TOK["HMAC Token"]
+    K -->|route| DRV["Driver (MCP/HTTP/Memory)"]
+    DRV -->|RawResult| FW["Context Firewall"]
+    FW -->|Frame| LLM
+    K -->|record| AUD["Audit Trace"]
+```
+
+## Quickstart
+
+```bash
+pip install agent-kernel
+```
+
+```python
+import asyncio, os
+os.environ["AGENT_KERNEL_SECRET"] = "my-secret"
+
+from agent_kernel import (
+    Capability, CapabilityRegistry, HMACTokenProvider,
+    InMemoryDriver, Kernel, Principal, SafetyClass, StaticRouter,
+)
+from agent_kernel.drivers.base import ExecutionContext
+from agent_kernel.models import CapabilityRequest
+
+# 1. Register a capability
+registry = CapabilityRegistry()
+registry.register(Capability(
+    capability_id="tasks.list",
+    name="List Tasks",
+    description="List all tasks",
+    safety_class=SafetyClass.READ,
+    tags=["tasks", "list"],
+))
+
+# 2. Wire up a driver
+driver = InMemoryDriver()
+driver.register_handler("tasks.list", lambda ctx: [{"id": 1, "title": "Buy milk"}])
+
+# 3. Build the kernel
+kernel = Kernel(registry=registry, router=StaticRouter(routes={"tasks.list": ["memory"]}))
+kernel.register_driver(driver)
+
+async def main():
+    principal = Principal(principal_id="alice", roles=["reader"])
+
+    # 4. Grant → invoke → expand → explain
+    token = kernel.get_token(
+        CapabilityRequest(capability_id="tasks.list", goal="list tasks"),
+        principal, justification="",
+    )
+    frame = await kernel.invoke(token, principal=principal, args={})
+    print(frame.facts)           # ['Total rows: 1', 'Top keys: id, title', ...]
+    print(frame.handle)          # Handle(handle_id='...', ...)
+
+    expanded = kernel.expand(frame.handle, query={"limit": 1, "fields": ["title"]})
+    print(expanded.table_preview)  # [{'title': 'Buy milk'}]
+
+    trace = kernel.explain(frame.action_id)
+    print(trace.driver_id)       # 'memory'
+
+asyncio.run(main())
+```
+
+## Where it fits
+
+```
+┌─────────────────────────────────────────────┐
+│             LLM / Agent loop                │
+├─────────────────────────────────────────────┤
+│  agent-kernel  ← you are here               │
+│  (registry · policy · tokens · firewall)    │
+├────────────────┬────────────────────────────┤
+│  contextweaver │  tool execution layer       │
+│  (context      │  (MCP · HTTP · A2A ·        │
+│   compilation) │   internal APIs)            │
+└────────────────┴────────────────────────────┘
+```
+
+`agent-kernel` sits **above** `contextweaver` (context compilation) and **above** raw tool execution. It provides the authorization, execution, and audit layer.
+
+## Security disclaimers
+
+> **v0.1 is not production-hardened for real authentication.**
+
+- HMAC tokens are tamper-evident (SHA-256) but **not encrypted**. Do not put sensitive data in token fields.
+- Set `AGENT_KERNEL_SECRET` to a strong random value in production. If unset, a random dev secret is generated per-process with a warning.
+- PII redaction is heuristic (regex). It is not a substitute for proper data governance.
+- See [docs/security.md](docs/security.md) for the full threat model.
+
+## Documentation
+
+- [Architecture](docs/architecture.md)
+- [Security model](docs/security.md)
+- [Integrations (MCP, HTTPDriver)](docs/integrations.md)
+- [Designing capabilities](docs/capabilities.md)
+- [Context Firewall](docs/context_firewall.md)
+
+## Development
+
+```bash
+git clone https://github.com/dgenio/agent-kernel
+cd agent-kernel
+pip install -e ".[dev]"
+make ci      # fmt + lint + type + test + examples
+```
+
+## License
+
+Apache-2.0 — see [LICENSE](LICENSE).
+
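
Note on the README quickstart: the `kernel.expand(frame.handle, query={"limit": 1, "fields": ["title"]})` call relies on pagination and field selection over stored rows. The sketch below shows one plausible way such a query could be applied to a list of dicts. It is not the `HandleStore` code from this PR: `expand_rows` is a hypothetical helper, and the `offset` and `filter` keys are assumptions (only `limit` and `fields` appear in the quickstart).

```python
from typing import Any


def expand_rows(rows: list[dict[str, Any]], query: dict[str, Any]) -> list[dict[str, Any]]:
    """Apply an equality filter, then pagination, then field selection."""
    # "filter" is a hypothetical key: every listed field must equal the given value.
    wanted = query.get("filter", {})
    matched = [r for r in rows if all(r.get(k) == v for k, v in wanted.items())]

    # "offset" is a hypothetical key; "limit" is taken from the quickstart.
    offset = query.get("offset", 0)
    limit = query.get("limit")
    page = matched[offset:] if limit is None else matched[offset:offset + limit]

    # "fields" is taken from the quickstart; keys missing from a row are skipped.
    fields = query.get("fields")
    if fields is None:
        return [dict(r) for r in page]
    return [{k: r[k] for k in fields if k in r} for r in page]
```

Filtering before slicing keeps page boundaries stable for a given filter, and returning copies means a caller cannot mutate the stored result through the preview.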
diff --git a/docs/architecture.md b/docs/architecture.md
@@ -0,0 +1,70 @@
+# Architecture
+
+## Overview
+
+`agent-kernel` is a capability-based security kernel that sits **above** raw tool execution (MCP, HTTP APIs, internal services) and **below** the LLM context window.
+
+```mermaid
+graph TD
+    LLM["LLM / Agent"] -->|goal text| K["Kernel"]
+    K -->|search| REG["CapabilityRegistry"]
+    REG -->|CapabilityRequest| K
+    K -->|evaluate| POL["PolicyEngine"]
+    POL -->|PolicyDecision| K
+    K -->|issue| TOK["TokenProvider (HMAC)"]
+    TOK -->|CapabilityToken| K
+    K -->|route| ROU["Router"]
+    ROU -->|RoutePlan| K
+    K -->|execute| DRV["Driver (Memory / HTTP / MCP)"]
+    DRV -->|RawResult| K
+    K -->|transform| FW["Firewall"]
+    FW -->|Frame| K
+    K -->|store| HS["HandleStore"]
+    K -->|record| TS["TraceStore"]
+    K -->|Frame| LLM
+```
+
+## Components
+
+### Kernel
+The central orchestrator. Wires all components together and exposes five methods:
+- `request_capabilities(goal)` — discover relevant capabilities
+- `grant_capability(request, principal, justification)` — policy check + token issuance
+- `invoke(token, principal, args, response_mode)` — execute + firewall + trace
+- `expand(handle, query)` — paginate/filter stored results
+- `explain(action_id)` — retrieve audit trace
+
+### CapabilityRegistry
+A flat dict of `Capability` objects indexed by `capability_id`. Provides keyword-based search (no LLM, no vector DB — purely token overlap scoring).
+
+### PolicyEngine
+The `DefaultPolicyEngine` implements role-based rules:
+1. **READ** — always allowed
+2. **WRITE** — requires a justification of at least 15 characters and role `writer` or `admin`
+3. **DESTRUCTIVE** — requires role `admin`
+4. **PII/PCI** — requires `tenant` attribute; enforces `allowed_fields` unless `pii_reader`
+5. **max_rows** — 50 (user), 500 (service)
+
+### TokenProvider (HMAC)
+Issues HMAC-SHA256 signed tokens. Each token is bound to `principal_id + capability_id + constraints`. Verification checks: expiry → signature → principal → capability.
+
+### Router
+`StaticRouter` maps `capability_id → [driver_id, ...]`. Drivers are tried in order: the first one that succeeds wins, and the remaining ones act as fallbacks.
+
+### Drivers
+- **InMemoryDriver** — Python callables, used for tests and demos
+- **HTTPDriver** — `httpx`-based async HTTP client
+- (Future) **MCPDriver** — adapter for Model Context Protocol tool servers
+
+### Firewall
+Transforms `RawResult → Frame`. Never exposes raw output to the LLM.
+- Four response modes: `summary`, `table`, `handle_only`, `raw`
+- Enforces `Budgets` (max_rows, max_fields, max_chars, max_depth)
+- Redacts sensitive fields and inline PII patterns
+- Deterministic summarization (no LLM)
+
+### HandleStore
+Stores full results by opaque handle ID with TTL. `expand()` supports pagination, field selection, and basic equality filtering.
+
+### TraceStore
+Records every `ActionTrace`. `explain(action_id)` returns the full audit record.
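
Note on the PolicyEngine rules in docs/architecture.md: the READ, WRITE, DESTRUCTIVE, and `max_rows` rules can be read as a small decision function. The sketch below is an illustration under stated assumptions, not `DefaultPolicyEngine.evaluate()` itself. The `Decision` dataclass, the `is_service` flag, and the local `SafetyClass` enum are hypothetical stand-ins, and the PII/PCI rule is omitted because its inputs are not fully specified in the doc.

```python
from dataclasses import dataclass, field
from enum import Enum


class SafetyClass(Enum):
    # Local stand-in for the library's SafetyClass; values are assumptions.
    READ = "read"
    WRITE = "write"
    DESTRUCTIVE = "destructive"


@dataclass
class Decision:
    # Hypothetical result type: allow/deny, a reason, and constraints to bind into a token.
    allowed: bool
    reason: str
    constraints: dict = field(default_factory=dict)


def evaluate(
    safety_class: SafetyClass,
    roles: list[str],
    justification: str = "",
    is_service: bool = False,
) -> Decision:
    """Apply the documented role rules; READ needs no extra checks."""
    role_set = set(roles)
    if safety_class is SafetyClass.WRITE:
        if not role_set & {"writer", "admin"}:
            return Decision(False, "WRITE requires role writer or admin")
        if len(justification) < 15:
            return Decision(False, "WRITE requires a justification of at least 15 characters")
    elif safety_class is SafetyClass.DESTRUCTIVE:
        if "admin" not in role_set:
            return Decision(False, "DESTRUCTIVE requires role admin")
    # Row cap from the doc: 50 for users, 500 for services.
    max_rows = 500 if is_service else 50
    return Decision(True, "allowed", {"max_rows": max_rows})
```

Returning the row cap as a constraint rather than enforcing it inline matches the doc's description of tokens binding to `constraints`, so the limit travels with the grant.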