Your AI doesn't know the rules. Writ fixes that.
Writ is a knowledge retrieval engine that gives AI coding agents real-time access to your organization's coding standards, architectural decisions, and enforcement rules -- without stuffing them all into the context window.
At 50 rules, you can paste your coding bible into the prompt. At 500, you can't. At 5,000, you need a system that knows which rules matter for the task at hand and delivers them in milliseconds.
Most teams solve this by hoping the AI remembers what it was told. Writ solves it with a queryable knowledge graph that understands rule relationships, contradictions, dependencies, and relevance -- and returns the right guidance before the AI writes a single line of code.
Writ plugs directly into Claude Code. No manual rule loading. No copy-pasting. No "remember to check the style guide."
Every time you send a message, Writ queries the knowledge graph with your prompt, ranks the most relevant rules, and injects them into Claude's context automatically.
You: "fix the async bug in the session tracker"
```
--- WRIT RULES (2 rules, standard mode) ---

[PY-ASYNC-001] (high, human, python) score=0.892
WHEN: Writing async Python code
RULE: All I/O operations must be async. Never use sync I/O in hot paths.

[ARCH-FUNC-001] (medium, human, architecture) score=0.756
WHEN: Function exceeds 50 lines
RULE: Functions must not exceed 50 lines. Extract helpers.

--- END WRIT RULES ---
```
Claude uses these rules. Not because it was asked to -- because the rules are already in context when it starts working.
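The injection step above can be sketched as a Claude Code user-prompt hook: a script that reads the hook event from stdin, queries a local Writ-style service, and prints the rule block to stdout so it lands in context. Everything service-specific here is an assumption, not Writ's actual API -- the port, the `/query` path, the request and response fields, and the settings path are all illustrative.

```python
#!/usr/bin/env python3
"""Hypothetical UserPromptSubmit hook for a local Writ-style rule service.
The endpoint, payload, and response shape are illustrative assumptions."""
import json
import sys
import urllib.request


def fetch_rules(prompt: str, endpoint: str = "http://localhost:8937/query") -> list:
    """POST the user's prompt to the local service; assumed response: {"rules": [...]}."""
    req = urllib.request.Request(
        endpoint,
        data=json.dumps({"prompt": prompt, "budget_tokens": 1600}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=0.5) as resp:
        return json.load(resp)["rules"]


def format_block(rules: list) -> str:
    """Render rules in the injected-context format shown above."""
    lines = [f"--- WRIT RULES ({len(rules)} rules, standard mode) ---"]
    for r in rules:
        lines.append(f"[{r['id']}] ({r['severity']}, {r['authority']}, {r['domain']}) score={r['score']:.3f}")
        lines.append(f"WHEN: {r['when']}")
        lines.append(f"RULE: {r['text']}")
    lines.append("--- END WRIT RULES ---")
    return "\n".join(lines)


def main() -> None:
    event = json.load(sys.stdin)              # Claude Code passes hook input as JSON
    print(format_block(fetch_rules(event.get("prompt", ""))))  # stdout joins the context

# main() would be wired up via the hooks section of Claude Code's settings (path assumed).
```

With a hook like this registered on the user-prompt event, the agent never has to be asked to load rules -- they are already there when it starts.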
This is where it gets interesting. Writ doesn't just serve rules -- it learns which ones work.
- Rules are injected before Claude writes code
- Static analysis runs on what Claude produces
- Outcomes are correlated with which rules were in context
- Feedback flows back to Writ automatically
- Rules that consistently produce good outcomes earn higher confidence
- Rules that don't are flagged for human review
When Claude encounters a situation where no rules exist, Writ detects the gap and prompts Claude to propose a new rule. The proposal passes through a structural quality gate, enters the graph as provisional, and graduates through observed outcomes. Your knowledge base grows from experience, not just from documentation.
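The structural quality gate can be pictured as a handful of cheap automated checks on a proposed rule. The document says five checks run but does not enumerate them, so the five below are invented examples -- they show the shape of a structural gate (well-formedness, not correctness), nothing more:

```python
import re

# Hypothetical structural checks -- stand-ins for Writ's five, which are
# not enumerated here. The gate catches malformed proposals; whether a
# rule is *correct* remains a human judgment.
def passes_quality_gate(rule: dict) -> bool:
    checks = [
        bool(re.fullmatch(r"[A-Z]+(-[A-Z0-9]+)+", rule.get("id", ""))),  # well-formed ID
        bool(rule.get("when", "").strip()),                              # trigger present
        bool(rule.get("text", "").strip()),                              # body present
        len(rule.get("text", "")) <= 500,                                # concise enough to inject
        rule.get("severity") in {"low", "medium", "high"},               # known severity
    ]
    return all(checks)
```

A proposal that clears a gate like this would enter the graph as provisional and then graduate (or not) on observed outcomes.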
- Claude follows your team's coding standards on every task, every project
- New engineers get the same institutional knowledge as senior staff
- When someone discovers a gotcha, it becomes a rule that protects everyone
- Rules that don't hold up in practice lose influence automatically
- The system gets smarter the more you use it
- Sub-millisecond retrieval -- p95 latency under 0.2ms at 80 rules, under 0.6ms at 10,000 rules. The AI gets relevant rules before you notice a delay
- Hybrid search -- combines keyword matching, semantic understanding, and graph traversal to find rules that keyword search alone would miss
- Relationship-aware -- knows that Rule A depends on Rule B, conflicts with Rule C, and supplements Rule D. Returns the full context, not isolated fragments
- AI-assisted rule evolution -- the system learns from your team's coding patterns. New rules are proposed, structurally validated, and queued for human review. Rules earn trust over time based on observed outcomes
- Authority-aware ranking -- human-authored rules outrank AI-proposed rules at equal relevance. The ranking guarantee is mechanical, not weight-based -- it cannot be circumvented by score manipulation
- Frequency-driven confidence -- rules that consistently produce good outcomes graduate to higher confidence. Rules that don't are flagged for review. No rule is promoted or demoted without statistical evidence
- Structural quality gate -- candidate rules pass five automated checks before entering the graph. The gate catches structural failures; human judgment remains the gate for correctness
- Rule compression -- clusters rules into abstraction nodes. When context budget is tight, the pipeline returns compressed summaries instead of individual rules. At 10K rules this yields a 727x context reduction
- Session-aware retrieval -- multi-query sessions track loaded rules client-side, preventing duplicates and decrementing the token budget across sequential queries. The server stays stateless
- Full development framework -- gate enforcement, static analysis, handoff validation, and session metrics hooks. Works with any language or framework
- Integrity checks -- detects contradictions, orphaned rules, stale guidance, near-duplicate rules, unreviewed proposals, and frequency-based staleness before they cause confusion
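The "mechanical, not weight-based" authority guarantee above is naturally read as a lexicographic sort: authority tier dominates, relevance score only orders rules within a tier. Under that reading, no score inflation can lift an AI-proposed rule over a human-authored one. A sketch of that interpretation (the tier names and ordering are assumptions; Writ's exact mechanism is not spelled out here):

```python
from typing import Dict, List

AUTHORITY_TIER: Dict[str, int] = {"human": 0, "ai": 1}  # lower tier sorts first


def rank(candidates: List[dict]) -> List[dict]:
    """Lexicographic ranking: authority tier first, then descending score.
    Because the tier is the leading sort key, score manipulation can only
    reorder rules *within* a tier -- it can never cross the human/AI boundary."""
    return sorted(candidates,
                  key=lambda r: (AUTHORITY_TIER[r["authority"]], -r["score"]))
```

The point of encoding the guarantee as a sort key rather than a score weight is exactly the claim in the bullet: there is no coefficient to tune or game.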
A local service sits between your AI agent and your knowledge base. The agent asks a question. Writ searches a graph database, scores candidates across five retrieval stages, and returns ranked results within a context budget. No cloud calls. No API keys. Fully offline.
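The five-stage flow can be sketched end to end. Stage names 2-5 (keyword, semantic, graph traversal, ranking) come from the benchmark table below; stage 1 and all scoring logic here are guesses, with the semantic stage stubbed out entirely:

```python
# Hypothetical five-stage pipeline skeleton. Only the stage *names* come
# from the benchmarks; every scoring detail below is an illustrative guess.
def retrieve(prompt: str, corpus: list, graph: dict, budget: int = 3) -> list:
    terms = set(prompt.lower().split())                       # 1: parse query
    kw = {r["id"]: len(terms & set(r["text"].lower().split()))
          for r in corpus}                                    # 2: keyword search
    sem = {r["id"]: 0.0 for r in corpus}                      # 3: semantic search (stubbed)
    hits = {rid for rid, s in kw.items() if s > 0}
    neigh = {n for rid in hits for n in graph.get(rid, ())}   # 4: graph traversal
    scored = []                                               # 5: ranking
    for r in corpus:
        score = kw[r["id"]] + sem[r["id"]] + (0.5 if r["id"] in neigh else 0.0)
        scored.append((score, r["id"]))
    scored.sort(reverse=True)
    return [rid for score, rid in scored[:budget] if score > 0]
```

Note how stage 4 changes the answer: a rule with zero keyword overlap can still surface because it is a graph neighbor of a rule that matched -- the "full context, not isolated fragments" behavior described above.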
Context-stuffing: 15,812 tokens (46 domain rules)
Writ retrieval: 1,397 tokens (5 rules, 0.2ms)
Ratio: 11x reduction
At 1,000 rules the reduction is 76x. At 10,000 rules context-stuffing would require 1.17 million tokens -- Writ returns ~1,600 tokens for a 727x reduction.
Install once. Run once per session. Every project, every agent, every tool gets the same live knowledge graph.
Measured on an 80-rule corpus with 83 ground-truth queries. All targets met.
Latency by pipeline stage:

| Stage | Component | p95 | Budget | Headroom |
|---|---|---|---|---|
| 2 | Keyword search | 0.175ms | 2.0ms | 11x |
| 3 | Semantic search | 0.047ms | 3.0ms | 64x |
| 4 | Graph traversal | 0.001ms | 3.0ms | 3000x |
| 5 | Ranking | 0.089ms | 1.0ms | 11x |
| -- | End-to-end | 0.19ms | 10.0ms | 53x |
Retrieval quality:

| Metric | Value | Threshold |
|---|---|---|
| MRR@5 (19 ambiguous queries) | 0.7842 | > 0.78 |
| Hit rate (83 total queries) | 97.59% | > 90% |
| Context reduction | 4.4x (80 rules) | -- |
Scaling:

| Metric | 80 rules | 500 rules | 1K rules | 10K rules |
|---|---|---|---|---|
| E2E p95 | 0.19ms | 0.48ms | 0.57ms | 0.55ms |
| Context reduction | 4.4x | 39.3x | 75.8x | 727x |
| Clusters | 13 | 70 | 419 | 519 |
| Compression ratio | 5.6x | 7.6x | 2.8x | 25.2x |
| Session duplicates | 0 | 0 | 0 | 0 |
Resource footprint:

| Metric | Value | Budget |
|---|---|---|
| Cold start (80 rules) | 0.58s | < 3s |
| Memory (clean serve) | ~700 MB | < 2 GB |
| Integrity check | 23ms median, 50ms p95 | < 500ms |
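An integrity pass over the graph can be sketched as a report keyed by the problem categories named earlier (contradictions, orphaned rules, near-duplicates). The detection logic below is deliberately naive and illustrative -- token-set Jaccard overlap stands in for whatever similarity measure the real check uses:

```python
from itertools import combinations


def integrity_report(rules: list, edges: list) -> dict:
    """Naive integrity sketch. `edges` are (src, dst, kind) triples;
    the 0.8 Jaccard threshold for near-duplicates is an arbitrary choice."""
    ids = {r["id"] for r in rules}
    linked = {a for a, _, _ in edges} | {b for _, b, _ in edges}
    report = {
        "contradictions": [(a, b) for a, b, kind in edges if kind == "conflicts"],
        "orphans": sorted(ids - linked),          # rules with no relationships
        "near_duplicates": [],
    }
    for a, b in combinations(rules, 2):
        wa = set(a["text"].lower().split())
        wb = set(b["text"].lower().split())
        if len(wa & wb) / max(len(wa | wb), 1) > 0.8:   # token-set Jaccard
            report["near_duplicates"].append((a["id"], b["id"]))
    return report
```

Because every check is a pure function of the graph, a pass like this can run on a schedule and stay well under a latency budget at this corpus size.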
282 tests and 12 performance benchmarks, all passing. Claude Code integration complete with full feedback loop.
Core retrieval (complete):
- Five-stage hybrid retrieval pipeline with optimized inference backend
- Two-pass ranking with graph-neighbor proximity scoring
- Rule compression via clustering into abstraction nodes
- Session-aware retrieval with client-side context tracking
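Client-side session tracking, as described above, comes down to a set of already-loaded rule IDs plus a shrinking token budget on the client, so the server never has to remember anything. A minimal sketch (the budget figure and field names are illustrative):

```python
class Session:
    """Client-side session tracker sketch: remembers which rule IDs are
    already in context and decrements a shared token budget across queries.
    All state lives here -- the server stays stateless."""

    def __init__(self, token_budget: int = 1600):
        self.loaded: set = set()
        self.budget = token_budget

    def admit(self, rules: list) -> list:
        """Filter a ranked batch: drop duplicates, stop when budget runs out."""
        fresh = []
        for r in rules:
            if r["id"] in self.loaded:
                continue                   # duplicate: already in context
            if r["tokens"] > self.budget:
                break                      # budget exhausted for this session
            self.loaded.add(r["id"])
            self.budget -= r["tokens"]
            fresh.append(r)
        return fresh
```

Across sequential queries this yields exactly the behavior the scaling table reports: zero duplicate injections, with later queries spending whatever budget earlier ones left behind.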
Claude Code integration (complete):
- Automatic rule injection on every turn via hooks
- Automated feedback loop -- outcomes flow back to the knowledge graph
- Rule proposal trigger when knowledge gaps are detected
- Full development framework: gate enforcement, static analysis, handoff validation
- Works with any language or framework
Experiential memory (complete):
- Authority model with mechanical ranking guarantee (human over AI at equal relevance)
- Structural quality gate with five automated checks
- Human review queue for AI-proposed rules
- Frequency tracking with empirical confidence graduation
Future work:
- Domain generalization beyond coding rules
- Semantic gap detection for multi-query sessions
- Graph-level versioning for long-running sessions
This is the public face of Writ. The source code, rule corpus, retrieval pipeline, and implementation details live in a private repository.
Lucio Saldivar -- Infinri