Wrench Board

Agent-native diagnostic workbench for board-level electronics repair, powered by Claude Opus 4.7. Right-to-repair, built in the open, by the people who actually do the repairs.

📺 Demo video (3 min): https://youtu.be/OZ2D_p82z6w

What it is

Tens of millions of tonnes of electronics end up as e-waste every year. A large share of that is recoverable at the board level — a dead capacitor, a blown diode, a bad PMIC — but only a microsoldering technician can find and fix it. We are the last mile of repair before the landfill, and there are not many of us.

Wrench Board is a senior microsoldering teammate built for that last mile. For the seasoned tech, it's a second pair of eyes that never gets tired. For the apprentice, it's a senior teammate who explains the boot sequence for the tenth time, in their language, with their tools, without judgment. It ingests a schematic PDF and a boardview, builds a per-device knowledge pack in about two minutes, and runs an Opus 4.7 diagnostic agent that pilots the board visually (highlighting pins, tracing nets, simulating failures) while the technician keeps the iron in their hand.

The bet is precision over magic. The agent is not allowed to invent a reference designator. Every refdes it utters originates from a tool lookup, and a server-side sanitizer wraps any token it cannot verify before the text reaches the screen. The deterministic engines underneath produce verifiable causal chains, not vibes.

Why it exists

I've been a microsoldering technician for three years. For most of that time, I sent screenshots to Claude one at a time, manually, and pasted the answer into a paper notebook. I built the workbench I needed.

How it works

Four orthogonal workflows feed a single on-disk corpus per device under memory/{slug}/:

Knowledge Factory: four Claude personas (Scout, Registry, Writers, Auditor) build a verified repair pack from a device label in about two minutes. The three Writers (Cartographe, Clinicien, Lexicographe) run in parallel and share a cache-warmed prefix, which amortizes the long common input across them.
Schematic Ingestion: Opus 4.7 vision compiles a PDF schematic, page by page, into a queryable ElectricalGraph, with nets classified, the boot sequence inferred, and a quality report attached.
Diagnostic Agent: one Anthropic Managed Agent per device, with a four-store layered memory (global-patterns, global-playbooks, device-{slug}, repair-{repair_id}). It pilots the boardview through 12 bv_* tools and queries the pack, schematic graph, measurements, validations, and technician profile through ~24 more, for 36 custom tools declared in api/agent/manifest.py. The agent never fabricates a refdes: tool discipline plus a post-hoc sanitizer enforce this.
microsolder-evolve: four overnight search loops, one per surface: the deterministic simulator + hypothesize engines (sim), the schematic compiler (pipeline), the schematic vision pass (pipeline-vision), and the diagnostic agent itself (agent). Each loop proposes patches against an oracle benchmark and either keeps them (as an evolve:-prefixed commit) or reverts them. The loops have been running and shipping improvements while I work on other things.
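The keep-or-revert rule behind each evolve loop can be sketched as a greedy search. This is a minimal illustration under stated assumptions, not the project's code: `evolve`, `nudge`, `oracle_score`, and the dict-shaped candidate are hypothetical stand-ins for a patch generator and an oracle benchmark.

```python
import random
from typing import Callable

Params = dict[str, float]


def evolve(
    baseline: Params,
    propose: Callable[[Params, random.Random], Params],
    score: Callable[[Params], float],
    rounds: int,
    seed: int = 0,
) -> tuple[Params, float, int]:
    """Greedy keep-or-revert: a proposal survives only if it beats the best score."""
    rng = random.Random(seed)
    best, best_score, kept = dict(baseline), score(baseline), 0
    for _ in range(rounds):
        candidate = propose(best, rng)
        candidate_score = score(candidate)
        if candidate_score > best_score:
            # Keep: in the loops described above this would become a commit.
            best, best_score = candidate, candidate_score
            kept += 1
        # Otherwise revert: the candidate is dropped and `best` is unchanged.
    return best, best_score, kept


def nudge(params: Params, rng: random.Random) -> Params:
    """Hypothetical patch generator: perturb one tunable value."""
    out = dict(params)
    out["threshold"] += rng.uniform(-0.2, 0.2)
    return out


def oracle_score(params: Params) -> float:
    """Hypothetical benchmark: higher is better, with the optimum at 1.0."""
    return -abs(params["threshold"] - 1.0)


best, best_score, kept = evolve({"threshold": 0.0}, nudge, oracle_score, rounds=200)
print(f"kept {kept} proposals, final score {best_score:.4f}")
```

Because a candidate is kept only on strict improvement, the benchmark score can never regress from one round to the next, which is what makes an unattended overnight run safe to leave alone.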

Files + Vision — the agent can ask to see

A microsoldering diagnosis lives or dies on what the probe is touching right now, and a chat box can't carry that. The technician plugs a USB microscope or webcam into the workbench and the agent requests a frame on demand through the cam_capture tool, reads the image, and feeds it back into its reasoning. The technician can also drop a macro shot or a close-up of a suspect chip into the chat at any time. Captures and uploads are persisted under the repair so a session can be replayed end-to-end — words, decisions, and the actual photographs the agent looked at.

This closes the loop the screenshot-pasting workflow never could: the agent stops guessing what the board looks like and starts seeing it, on the technician's cue, on the technician's optics.

Under the hood

Backend: Python 3.11+ / FastAPI / native WebSocket / Pydantic v2 / pdfplumber. No build step, no bundler.
Frontend: vanilla HTML + CSS + JS, OKLCH design tokens, D3 v7 for the boardview and knowledge graph. Inline SVG icons. No framework.
Models: Claude Opus 4.7 (heavy pipeline writers, schematic vision, deep diagnostic tier), Claude Sonnet 4.6 (Scout, Registry, Mapper, Lexicographe, normal tier), Claude Haiku 4.5 (intent classifier, phase narrator, coverage gate, fast tier).
Memory: per-device Anthropic Managed Agents memory stores. The agent self-orients across sessions by reading its own scribe notebook (state.md, decisions/, measurements/, open_questions.md) instead of relying on an LLM-generated summary.
Boardview: 12 clean-room parsers in api/board/parser/, dispatched by extension: KiCad .kicad_pcb, OpenBoardView Test_Link .brd, KiCad-boardview BRD2, plus .asc, .bdv, .bv, .cad, .cst, .f2b, .fz, .gr, and .tvw. Adding a format means adding one new file.
Tests: ~1100 functions across 133 files, including 10 deterministic invariants on the simulator + hypothesize engines and frozen-oracle accuracy gates marked @slow.
Anti-hallucination: defense in depth, two layers. (1) Tools return {found: false, closest_matches: [...]} for an unknown refdes; the system prompt instructs the agent to pick from the suggestions or ask the user. (2) api/agent/sanitize.py scans every outbound text for refdes-shaped tokens (\b[A-Z]{1,3}\d{1,4}\b) and wraps any unverified match as ⟨?U999⟩ before it reaches the technician.
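The two anti-hallucination layers can be sketched in a few lines. This is an illustration only: the regex and the response shape come from the description above, while `lookup_refdes`, `sanitize`, the use of `difflib` for suggestions, and the sample board are assumptions, not the contents of api/agent/sanitize.py.

```python
import difflib
import re

# Refdes-shaped token pattern quoted in the text above.
REFDES_RE = re.compile(r"\b[A-Z]{1,3}\d{1,4}\b")


def lookup_refdes(refdes: str, known: set[str]) -> dict:
    """Layer 1: a tool-style lookup that reports a miss instead of inventing a part."""
    if refdes in known:
        return {"found": True, "refdes": refdes}
    # Assumption: suggestions are ranked by string similarity.
    closest = difflib.get_close_matches(refdes, sorted(known), n=3, cutoff=0.5)
    return {"found": False, "closest_matches": closest}


def sanitize(text: str, verified: set[str]) -> str:
    """Layer 2: wrap every refdes-shaped token that no lookup has confirmed."""

    def wrap(match: re.Match) -> str:
        token = match.group(0)
        return token if token in verified else f"⟨?{token}⟩"

    return REFDES_RE.sub(wrap, text)


known = {"U7", "U12", "C29", "D3"}  # hypothetical board
hit = lookup_refdes("U12", known)
miss = lookup_refdes("U1", known)
clean = sanitize("Check U7, then reflow U999.", verified={"U7"})
print(hit)
print(miss)
print(clean)  # Check U7, then reflow ⟨?U999⟩.
```

Note that the pattern is purely shape-based: a token such as USB3 is refdes-shaped too, and in this sketch it would be wrapped unless it appears in the verified set. How the project treats such tokens is not stated here.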

Two pure-sync deterministic engines (simulator.py, hypothesize.py) sit at the core of the diagnostic stack. The simulator advances phase-by-phase over a boot sequence and emits a timeline of dead rails, dead components, and the cause of blocking per phase. The hypothesizer takes a partial observation and enumerates 1- and 2-fault refdes-kill candidates that explain it, ranked by F1 against the observation. Neither calls an LLM at runtime.
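To make the F1 ranking concrete, here is a minimal sketch of 1- and 2-fault enumeration. It assumes a flat, hypothetical map from each refdes to the rails that die with it, and it ignores boot phases entirely; the `KILLS` table, `f1`, and this `hypothesize` signature are illustrative, not the API of hypothesize.py.

```python
from itertools import combinations


def f1(predicted: set[str], observed: set[str]) -> float:
    """Harmonic mean of precision and recall between two sets of dead rails."""
    true_positives = len(predicted & observed)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(observed)
    return 2 * precision * recall / (precision + recall)


def hypothesize(
    kills: dict[str, set[str]], observed_dead: set[str], top: int = 3
) -> list[tuple[float, tuple[str, ...]]]:
    """Enumerate single and double refdes faults, ranked by F1 against the observation."""
    candidates = []
    for size in (1, 2):
        for combo in combinations(sorted(kills), size):
            predicted = set().union(*(kills[refdes] for refdes in combo))
            candidates.append((f1(predicted, observed_dead), combo))
    # Best score first; on ties prefer fewer faults, then alphabetical order.
    candidates.sort(key=lambda c: (-c[0], len(c[1]), c[1]))
    return candidates[:top]


# Hypothetical board: which rails go dead when each component is killed.
KILLS = {
    "U7": {"+3V3", "+1V8"},
    "U12": {"+5V"},
    "D3": {"+5V", "+3V3", "+1V8"},
}

ranked = hypothesize(KILLS, observed_dead={"+3V3", "+1V8"})
for score, combo in ranked:
    print(f"{score:.2f}  {' + '.join(combo)}")
```

With +3V3 and +1V8 observed dead, U7 alone explains the observation exactly and ranks first. D3 ranks lower because it would also take down +5V, which was not observed dead, so its precision drops.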

The diagnostic agent has two interchangeable runtimes: managed, which runs on Anthropic Managed Agents, and direct, which runs on the Messages API. Managed is the default and the production path; direct serves as a fallback when the MA beta is unavailable and as an on-disk inspection harness during development. The WebSocket protocol is identical in both modes, so the frontend doesn't know which one is running. See docs/ARCHITECTURE.md § Workflow D — Diagnostic Runtime for the full trade-off (pack mounting, cost profile, history persistence, cross-repair memory).
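One way to keep two runtimes interchangeable is to hide them behind a single interface and select one from the environment. The sketch below assumes this shape; only the DIAGNOSTIC_MODE variable and its managed default come from this README, while `DiagnosticRuntime`, `run_turn`, `relay`, and the event dictionaries are hypothetical, and both classes are stubs that make no API calls.

```python
import os
from typing import Iterator, Mapping, Optional, Protocol


class DiagnosticRuntime(Protocol):
    """Hypothetical interface: one technician message in, a stream of events out."""

    def run_turn(self, message: str) -> Iterator[dict]: ...


class ManagedRuntime:
    """Stand-in for the Managed Agents path (stub, no network access)."""

    def run_turn(self, message: str) -> Iterator[dict]:
        yield {"type": "text", "text": f"[stub reply to] {message}"}
        yield {"type": "done"}


class DirectRuntime:
    """Stand-in for the Messages API tool loop (stub, no network access)."""

    def run_turn(self, message: str) -> Iterator[dict]:
        yield {"type": "text", "text": f"[stub reply to] {message}"}
        yield {"type": "done"}


def pick_runtime(env: Optional[Mapping[str, str]] = None) -> DiagnosticRuntime:
    """Select the runtime from DIAGNOSTIC_MODE, defaulting to managed."""
    env = os.environ if env is None else env
    mode = env.get("DIAGNOSTIC_MODE", "managed")
    if mode == "managed":
        return ManagedRuntime()
    if mode == "direct":
        return DirectRuntime()
    raise ValueError(f"unknown DIAGNOSTIC_MODE: {mode!r}")


def relay(runtime: DiagnosticRuntime, message: str) -> list[dict]:
    """What a WebSocket handler would forward; it never inspects the runtime type."""
    return list(runtime.run_turn(message))


managed_events = relay(pick_runtime({}), "no 3V3 on boot")
direct_events = relay(pick_runtime({"DIAGNOSTIC_MODE": "direct"}), "no 3V3 on boot")
print(managed_events == direct_events)  # True
```

Since `relay` depends only on the event stream, swapping runtimes requires no change on the consuming side, which is the property the paragraph above describes for the frontend.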

Roadmap — Community Evolution Loop

Wrench Board runs locally. Each technician's instance can improve its deterministic simulator against their own field cases. When the evolve loop discovers a rule that holds up, it surfaces a candidate pull request to the upstream repo. Right-to-repair, built in the open, by the people who actually do the repairs.

Quickstart

git clone https://github.com/Junkz3/wrench-board
cd wrench-board
make install          # create .venv and install deps (incl. [dev])
cp .env.example .env  # then fill in ANTHROPIC_API_KEY
make run              # uvicorn --reload on http://localhost:8000

On the first make run in Managed Agents mode (default), the start script prints a one-screen warning describing what it is about to create on your Anthropic account (1 environment + 4 tier-scoped agents — idle, no cost until used) and waits 5 seconds for Ctrl+C before bootstrapping. The IDs land in managed_ids.json (gitignored) and subsequent runs go straight to uvicorn.
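The first-run behavior amounts to an idempotent bootstrap guarded by a grace period. A rough sketch of that pattern follows; `ensure_bootstrapped`, `fake_create`, and the JSON layout of the IDs are assumptions for illustration, not the actual start script or the real format of managed_ids.json.

```python
import json
import tempfile
import time
from pathlib import Path
from typing import Callable


def ensure_bootstrapped(
    ids_path: Path, create: Callable[[], dict], grace_seconds: float = 5.0
) -> dict:
    """Create remote resources once; later calls reuse the IDs saved on disk."""
    if ids_path.exists():
        return json.loads(ids_path.read_text())
    print("About to create 1 environment + 4 agents on your Anthropic account.")
    print(f"Press Ctrl+C within {grace_seconds:g} s to abort.")
    # A KeyboardInterrupt raised during this sleep aborts before anything is created.
    time.sleep(grace_seconds)
    ids = create()
    ids_path.write_text(json.dumps(ids, indent=2))
    return ids


calls = []


def fake_create() -> dict:
    """Hypothetical stand-in for the real account bootstrap."""
    calls.append(1)
    return {"environment_id": "env_example", "agent_ids": ["a1", "a2", "a3", "a4"]}


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "managed_ids.json"
    first = ensure_bootstrapped(path, fake_create, grace_seconds=0)
    second = ensure_bootstrapped(path, fake_create, grace_seconds=0)

print(len(calls))  # 1
```

The second call returns the saved IDs without invoking `fake_create` again, mirroring how subsequent runs skip the bootstrap.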

If the Managed Agents beta is unavailable on your account, fall back to direct mode (no bootstrap, a plain messages.create tool loop):

make demo-fallback
# or: DIAGNOSTIC_MODE=direct make run

See docs/ARCHITECTURE.md for the full architecture reference and CLAUDE.md for development conventions.

License & credits

Apache 2.0 — see LICENSE. All code in this repository was written from scratch; dependencies are MIT / Apache 2.0 / BSD only. The MNT Reform motherboard used as the canonical test target is CERN-OHL-S-2.0. Built solo at Repair Valley, an independent electronics repair workshop.

Contributing

Wrench Board is open to contributors who care about right-to-repair. Field reports, new boardview parsers, simulator rules — open an issue or a PR.
