feat: backend API test harness — 36 tests across 6 trust boundaries by MichaelCordner · Pull Request #141 · jentic/jentic-mini

MichaelCordner · 2026-04-01T13:55:28Z

Summary

Adds the first backend test suite for Jentic Mini — a pytest-based test harness that verifies the API's trust boundaries at the HTTP level.

Testing strategy: harness engineering

This is not a traditional unit test suite. Jentic Mini is a harness — it constrains AI agents by injecting credentials they never see, enforcing policies they can't bypass, and tracing every API call they make. The tests verify that the harness itself is trustworthy.

The approach follows harness engineering principles:

Test at the HTTP boundary, not internal functions. Every test sends a real HTTP request and asserts on the response. This makes the tests portable — they verify the API contract regardless of the underlying implementation.
Real database, no mocks. Each test run creates a temp SQLite DB, runs real Alembic migrations, and exercises real encryption. The only thing skipped is network-dependent startup (catalog refresh, BM25 index, self-registration).
Organized by trust boundary, not by source file. Each test file covers one security perimeter: auth, policy, vault, broker, etc.
Invariant tests over coverage metrics. We don't aim for line coverage — we aim for "every trust boundary has a test that would catch a violation."
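
To make the "test at the HTTP boundary" principle concrete, here is a minimal, stdlib-only sketch. It is not the project's code: `StubAPI`, the `X-Api-Key` header, the `/search` path, and `VALID_KEY` are all hypothetical stand-ins for the real application and its auth scheme. The point is only the shape of the test: start a server, send real requests, and assert on status codes.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

VALID_KEY = "test-key"  # assumption: illustrative value, not a real key format


class StubAPI(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the API: /health is public, the rest needs a key."""

    def do_GET(self):
        if self.path == "/health":
            self._send(200, {"status": "ok"})
        elif self.headers.get("X-Api-Key") == VALID_KEY:
            self._send(200, {"results": []})
        else:
            self._send(401, {"detail": "unauthorized"})

    def _send(self, status, payload):
        body = json.dumps(payload).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        # Silence per-request logging so test output stays readable.
        pass


def get_status(port, path, key=None):
    """Send a real HTTP GET to the local server and return the status code."""
    headers = {"X-Api-Key": key} if key is not None else {}
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", path, headers=headers)
        response = conn.getresponse()
        response.read()
        return response.status
    finally:
        conn.close()


def run_boundary_checks():
    """Start the stub on an ephemeral port and probe the auth perimeter."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), StubAPI)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    try:
        return {
            "public": get_status(port, "/health"),
            "no_key": get_status(port, "/search"),
            "bogus_key": get_status(port, "/search", key="bogus"),
            "valid_key": get_status(port, "/search", key=VALID_KEY),
        }
    finally:
        server.shutdown()
        server.server_close()
```

Because the assertions only touch status codes and response bodies, the same checks would keep working if the code behind the endpoint were rewritten.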

What's covered (36 tests, 6 files)

File	Trust boundary	Key assertions
test_health_and_meta.py	API contract	/health and /version return correct shapes, version string present, telemetry opt-out works
test_auth_boundary.py	Auth perimeter	401 without key, 401 with bogus key, agent key works for search, agents blocked from credential writes, human session accesses protected endpoints, public paths need no auth
test_policy_engine.py	Policy enforcement	System safety rules deny all writes by default, deny sensitive paths, agent allow rules override system deny (first-match-wins), method and path matching semantics
test_credential_vault.py	Credential isolation	Write-only invariant: credential values are never returned on GET (single, list, or after create). Values encrypted and stored correctly, api_id binding works
test_toolkit_lifecycle.py	Toolkit management	Default toolkit exists, list returns counts (regression #60), key creation and listing, credential count matches total for default toolkit
test_broker_contracts.py	Broker routing	Dot-in-host requirement, error response on unknown host, unauthenticated passthrough behavior
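
The first-match-wins semantics in the policy row can be illustrated with a small pure function. This is a sketch under stated assumptions, not the project's `_check_policy`: the `Rule` shape, the glob-style path matching, the example system rules, and the choice to evaluate agent rules before system rules are all hypothetical, chosen so that an agent allow rule can override a system deny.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Rule:
    effect: str        # "allow" or "deny"
    methods: tuple     # HTTP methods; ("*",) matches any method
    path_pattern: str  # glob pattern matched against the request path


# Hypothetical system safety rules: deny a sensitive path, then deny all writes.
SYSTEM_RULES = [
    Rule("deny", ("*",), "/admin/*"),
    Rule("deny", ("POST", "PUT", "PATCH", "DELETE"), "*"),
]


def check_policy_sketch(agent_rules, method, path, default="allow"):
    """Return the effect of the first matching rule.

    Assumption: agent rules are checked before system rules, so the first
    match among them wins over any later system deny.
    """
    for rule in list(agent_rules) + SYSTEM_RULES:
        method_matches = "*" in rule.methods or method.upper() in rule.methods
        if method_matches and fnmatchcase(path, rule.path_pattern):
            return rule.effect
    return default
```

With no agent rules, a `POST` to any path is denied by the write rule, while a `GET` to a non-sensitive path falls through to the default. Adding `Rule("allow", ("POST",), "/v1/charges")` as an agent rule allows that one write and leaves every other write denied.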

What's NOT yet covered

This is a starting point. The following trust boundaries need tests in follow-up PRs:

Broker credential injection — verifying that the right credential gets injected for the right host/toolkit combination (requires upstream mock or simulate mode)
Broker fail-closed behavior — verifying that exceptions during credential resolution or policy check result in denial, not passthrough (relates to phase 1 of #95, "Security: Broker catch-all proxies unregistered operations with credentials, bypassing RBAC policies")
Policy enforcement at the broker level — API-level tests where a credentialled broker call is denied by policy rules (vs the pure-function tests we have now)
IP allowlisting — per-key CIDR restriction enforcement
Key revocation — revoked keys immediately rejected
Credential update/delete — PATCH and DELETE contract tests
Toolkit CRUD — create custom toolkit, bind credentials, delete
Access request flow — agent requests permission, human approves, policy applied
Workflow execution — Arazzo workflow dispatch through the broker
Search ranking — BM25 results are relevant and ordered correctly
Rate limiting — when implemented
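
As an illustration of what the fail-closed item above would need to verify, the following sketch shows the property in isolation. `decide_fail_closed` is a hypothetical helper, not part of the broker; a real follow-up test would exercise this through the HTTP API rather than by calling a function.

```python
from typing import Callable


def decide_fail_closed(policy_check: Callable[[], bool]) -> str:
    """Run a policy or credential check; any exception becomes a denial.

    Only an explicit True result allows the request. Errors, False, and
    unexpected return values all deny.
    """
    try:
        allowed = policy_check()
    except Exception:
        return "deny"
    return "allow" if allowed is True else "deny"


def failing_check() -> bool:
    # Simulates credential resolution blowing up mid-request.
    raise RuntimeError("vault unavailable")
```

A test for this boundary would inject a failure such as `failing_check` and assert that the outcome is a denial, never a passthrough.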

Infrastructure

tests/conftest.py — temp SQLite DB, minimal test lifespan (migrations only), admin session + agent key fixtures
requirements.txt — adds pytest>=8.0,<9 and pytest-asyncio>=0.24,<1
.github/workflows/ci-backend.yml — new CI job, path-filtered to src/, tests/, alembic/, requirements.txt
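
The temp-database setup in tests/conftest.py could look roughly like the helper below. This is a sketch, not the actual fixture: `temp_sqlite_db` and the `credentials` schema are hypothetical, and a plain `CREATE TABLE` stands in for the real Alembic migrations so the example stays self-contained.

```python
import contextlib
import os
import sqlite3
import tempfile

# Assumption: a stand-in schema. The real harness runs Alembic migrations instead.
SCHEMA = (
    "CREATE TABLE credentials ("
    "id INTEGER PRIMARY KEY, "
    "api_id TEXT NOT NULL, "
    "encrypted_value BLOB NOT NULL)"
)


@contextlib.contextmanager
def temp_sqlite_db():
    """Yield the path to a fresh SQLite database, removed when the block exits."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        db_path = os.path.join(tmp_dir, "test.db")
        conn = sqlite3.connect(db_path)
        try:
            conn.execute(SCHEMA)
            conn.commit()
        finally:
            conn.close()
        yield db_path
```

In a conftest.py, the same body could be wrapped in a `@pytest.fixture` that yields `db_path`, so each test run starts from an empty, fully migrated database.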

How to run

pip install -r requirements.txt
python -m pytest tests/ -v

36 tests, 0.63 seconds, zero network calls.

Test plan

All 36 tests pass locally
CI backend-tests job passes on this PR
Docker build still works (test deps don't affect production image)

Adds a pytest-based test harness that verifies Jentic Mini's API contracts at the HTTP boundary. Uses a real temp SQLite DB with real Alembic migrations, with no mocking of the database or vault.

Test files organized by trust boundary:
- test_health_and_meta: /health and /version contract shapes
- test_auth_boundary: 401/403 perimeter, agent vs human session
- test_policy_engine: _check_policy pure function + system safety rules
- test_credential_vault: write-only invariant (values never returned)
- test_toolkit_lifecycle: CRUD, key management, credential counts (#60)
- test_broker_contracts: dot-in-host routing, error responses

Infrastructure:
- tests/conftest.py: temp DB, minimal test lifespan, auth fixtures
- pytest + pytest-asyncio added to requirements.txt
- ci-backend.yml: runs on src/tests/alembic changes, path-filtered

36 tests, 0.63 seconds, zero network calls.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

MichaelCordner wants to merge 1 commit into main from feat/backend-test-harness
