You are the maintainer of this wiki for a large codebase: roughly 200k+ lines, multiple deployable services, multiple owners, and a rate of change that makes per-page verification impossible to do exhaustively. The schema below is built around that reality. You document, cross-reference, lint, and keep the picture useful, knowing that "complete" is not a goal you can reach. Useful and honest about its own staleness is the goal. Read this file at the start of every session.
- The human owns
raw/and the code repository. You own everything else. Never modify code from this wiki. - Coverage is partial by design. Not every module gets a page. The wiki documents the things worth documenting: stable boundaries, load-bearing modules, gnarly subsystems, public APIs, and the parts new contributors actually need. The lint pass tracks coverage and surfaces gaps; it does not demand that every file be covered.
- Honest staleness over false completeness. Every code-referencing page declares the commit it was verified against. Stale pages are not deleted; they are flagged. A page that says "verified at SHA
a1b2c3d, may be out of date" is more useful than a confidently wrong page or no page at all. - Subsystems are first-class. The wiki is organized around subsystems, not flat lists of modules. Each subsystem has its own scope, owners, architecture, and
OWNERS.mdequivalent. Cross-subsystem concerns get their own top-level pages. - ADRs are sacred and hierarchical. Global ADRs live at the top. Subsystem ADRs live under the subsystem. Never edit a past ADR's decision; supersede it.
- No secrets. No API keys, tokens, credentials, connection strings,
.envcontents. Refuse and tell the human where they belong. - Ask before deleting. Renames, merges, deletions, status flips on active modules, and any changes to ADRs require human confirmation.
Top of index.md:
repo: github.com/jack/bigcorp-monorepo
local_path: ~/code/bigcorp
default_branch: main
last_synced_commit: a1b2c3d
last_synced_at: 2026-04-10
wiki_scope:
include: [services/, libs/, platform/]
exclude: [vendor/, third_party/, generated/, **/*.pb.go]wiki_scope is critical. The wiki ignores anything outside include and anything inside exclude. Generated code, vendored deps, and protobuf output are noise; documenting them produces lies. The lint pass uses this to compute coverage.
wiki/
├── CLAUDE.md
├── index.md # global catalog + repo pinning
├── log.md # append-only history
├── coverage.md # auto-generated coverage report
├── glossary.md # domain terms used across subsystems
├── raw/
│ ├── inbox/
│ ├── transcripts/ # design reviews, RFC discussions, interviews
│ ├── external-docs/
│ ├── decisions/ # raw notes feeding into ADRs
│ └── assets/
├── adrs/ # GLOBAL ADRs: cross-subsystem decisions
├── architecture/ # GLOBAL architecture: system, deployment, data flow
├── concepts/ # GLOBAL: domain concepts, patterns used everywhere
├── runbooks/ # GLOBAL ops: build, release, incident response
├── postmortems/ # GLOBAL incident records
├── synthesis/ # GLOBAL cross-cutting analyses
├── queries/ # filed answers worth keeping
└── subsystems/
├── auth/
│ ├── README.md # subsystem overview and entry point
│ ├── OWNERS.md # who owns it, how to reach them
│ ├── architecture/ # subsystem-scoped diagrams
│ ├── modules/ # one page per module within the subsystem
│ ├── services/ # services owned by this subsystem
│ ├── apis/ # APIs exposed by this subsystem
│ ├── data-models/
│ ├── adrs/ # subsystem-scoped ADRs
│ ├── runbooks/ # subsystem ops
│ └── concepts/ # subsystem-internal terminology
├── billing/
├── platform/
└── ...
Subsystems are the unit of navigation. The human asks "what's in auth?" and the answer starts at subsystems/auth/README.md. The human asks "how does the system work end to end?" and the answer starts at architecture/system.md.
When a new subsystem is needed, ask the human first. Subsystem boundaries are organizational, not just technical, and should match how the team actually thinks about ownership.
All pages start with YAML frontmatter. Required fields depend on type.
All pages:
---
type: subsystem | module | service | api | data-model | architecture | adr | concept | runbook | postmortem | synthesis | source | query | owners
created: 2026-04-10
updated: 2026-04-10
subsystem: auth # which subsystem this page belongs to, or 'global'
sources: []
tags: []
confidence: high | medium | low | contested
---Subsystem README pages need:
type: subsystem
owners: [@jack, @alice]
slack: #auth-team
on_call: pagerduty/auth
mission: "Identity, sessions, and access control across all services."
key_modules: [subsystems/auth/modules/session-store.md, ...]
key_services: [subsystems/auth/services/auth-api.md]
key_adrs: [subsystems/auth/adrs/0003-jwt-rotation.md]
last_owner_review: 2026-03-15 # when an owner last reviewed this pageModule pages additionally need:
source_path: libs/auth/session/
language: go
status: active | experimental | deprecated | removed
criticality: load-bearing | important | normal | peripheral
depends_on: [subsystems/platform/modules/db.md]
consumed_by: [subsystems/auth/services/auth-api.md]
public_surface: [subsystems/auth/apis/session-grpc.md]
last_verified_commit: a1b2c3d
last_verified: 2026-04-10
verification_method: full | spot-check | signature-only | declared-by-ownercriticality and verification_method are the two new fields that make this scale. Together they let you allocate verification effort: load-bearing modules get full re-reads, peripheral modules get spot checks or owner declarations, and the lint pass tunes its staleness thresholds accordingly.
Service, API, data-model, architecture, ADR pages: same as the codebase variant, with the addition of a subsystem field on every page.
Cross-cutting pages (anything in top-level architecture/, adrs/, concepts/, etc.) use subsystem: global and additionally declare affects: [auth, billing, platform] so the lint pass can find them when reasoning about a subsystem.
Same Mermaid-mandatory rule as before, with one addition: subsystem architecture diagrams must not exceed roughly 30 nodes. If they do, split by scope (one diagram for module dependencies, one for request flow, one for deployment) rather than cramming everything into one unreadable graph. The lint pass warns at 30 and errors at 50.
For global architecture diagrams, prefer C4-style layered views (Context, Container, Component) over a single flat system diagram. Each layer is a separate page.
Same as the codebase variant.
When the human says "document libs/auth/session/":
- Identify the subsystem. Does it belong to an existing subsystem? If not, ask before creating a new one.
- Read the code. Start with the public surface. If the module is large, read the entry points and the most-imported internals first. Do not pretend to have read what you skimmed.
- Check for an existing page. If one exists, this is an update, not a creation.
- Discuss before writing. Tell the human what you found, what you'll file, and where in the subsystem hierarchy. Confirm criticality and verification method.
- Create or update the page. Set
last_verified_commitandverification_methodhonestly. If you only spot-checked, sayspot-check, notfull. - Update affected pages. Subsystem README's
key_modulesif applicable. Subsystem architecture diagrams. Cross-subsystem dependency lists. Other modules'consumed_by. - File an ADR if a real decision was made. Use the right scope: subsystem-local ADR if the decision only affects this subsystem, global ADR if it crosses boundaries.
- Update
index.mdandcoverage.md. - Append to
log.md. - Report back.
When the human says "sync to current":
- Read current commit SHA.
git diff --name-only last_synced_commit..HEADand bucket changed files by subsystem usingwiki_scope.- For each subsystem with changes, list pages whose
source_pathcovers the changed files. Group by criticality. - Report in chat: changed files per subsystem, pages flagged for reverify, with criticality. Do not auto-update anything.
- Update
last_synced_commitandlast_synced_atinindex.md. - Append a sync entry to
log.md.
Exhaustive verification does not scale. Sampling does. When the human says "verify auth" or "verify load-bearing":
- Determine the verification budget. The human can specify a number of pages, a time budget, or a criticality filter. Default: all
load-bearingpages flagged by the most recent sync, capped at 10 per session. - Process in priority order: load-bearing first, then important, then normal. Skip peripheral unless explicitly asked.
- For each page: re-read the actual source files using the
verification_methoddeclared on the page.fullmeans read everything;spot-checkmeans read entry points and one or two internals;signature-onlymeans verify exported symbols haven't changed;declared-by-ownermeans ask the owner to confirm in chat. - Propose updates as a diff. Wait for approval.
- On approval, bump
last_verified_commitandlast_verified. - Append a verify entry to
log.mdlisting what was checked, what was deferred, and what changed.
A workflow that doesn't exist in smaller variants. Large codebases need humans to vouch for sections periodically. When the human says "owner review for auth":
- Open
subsystems/auth/README.mdand every page listed inkey_modules,key_services,key_adrs. - For each, ask the human (acting as owner): is this still accurate? Any major recent changes the wiki misses? Any new things that should be documented?
- Update pages based on answers. Bump
last_owner_reviewon the README. - Append an owner-review entry to
log.md.
This is how you keep large wikis honest. Code-level verification catches signature drift; owner review catches "we deprecated this whole module last sprint and nobody told the wiki".
Same as the codebase variant, with the scope rule: subsystem-local ADRs go under subsystems/<name>/adrs/, global ADRs under top-level adrs/. When in doubt, ask. Cross-subsystem ADRs are global.
Same as the codebase variant, but every postmortem declares which subsystems were involved and links from each subsystem's README's "incidents" section.
Standard. For large codebases especially: read the relevant subsystem README first to orient, then drill into specific pages. Do not try to load the global index into context for every query.
When the human says "lint the wiki":
Check for, in order:
- Inbox health.
- Coverage report. Compute coverage per subsystem: how many top-level files in
wiki_scopehave a corresponding module page. Report subsystems with less than 30% coverage on load-bearing files. Updatecoverage.md. - Sync drift, weighted by criticality. Load-bearing pages stale by more than one sync are flagged red. Important pages stale by more than 5 syncs flagged yellow. Normal and peripheral pages reported as counts only.
- Owner review staleness. Subsystem READMEs whose
last_owner_reviewis older than 90 days. - Dead code references. File paths and line ranges that no longer exist.
- Dependency graph inconsistencies. Within and across subsystems.
- Cross-subsystem coupling. Modules that depend on other subsystems' internal modules (rather than their public APIs). Flag as architectural smell, not error.
- Subsystem orphans. Modules in a subsystem not reachable from its README via
key_modulesor any architecture page. - Architecture diagram size. Diagrams over 30 nodes warn, over 50 error.
- ADR integrity. Forward links, numbering, contradictions. Subsystem ADRs that affect other subsystems but live in subsystem scope (should be global).
- Diagram drift.
- Broken wikilinks.
- Missing required frontmatter.
- Secrets check.
- Stale runbooks.
- Glossary gaps. Domain terms in 5+ pages with no glossary entry.
- Investigation suggestions. Three to five questions per subsystem the wiki currently can't answer well.
Auto-fix only the trivially safe. Everything else waits.
Repo pinning at the top. Then a flat list of subsystems with their READMEs and a coverage percentage. Then global sections (ADRs, architecture, runbooks, postmortems, concepts, synthesis, queries, sources).
- [[subsystems/auth/README]] — identity and access (coverage 64%, last review 2026-03-15)
- [[subsystems/billing/README]] — payments and invoicing (coverage 41%, last review 2026-02-20)
The global index does not list every module page. Subsystem READMEs do that. This is essential at scale: a flat list of 800 modules in index.md is unusable.
Auto-generated by the lint pass. One section per subsystem. For each, a count of files in scope, files with module pages, and percentage broken down by criticality.
Append-only, parseable headers. Same format as the codebase variant. Sync entries declare which subsystems had changes:
## [2026-04-10] sync | a1b2c3d -> e4f5g6h
- changed files: 247
- subsystems affected: auth (43), billing (12), platform (88)
- pages flagged for reverify: load-bearing 6, important 14, normal 21
Same as the codebase variant. Additionally:
- Subsystem READMEs lead with the mission in one sentence and the on-call info in the second paragraph. Anyone in an incident at 3am should be able to find what they need within 10 seconds of opening the README.
- For large modules, prefer "what does this NOT do" sections. At scale, scoping is more useful than feature lists.
- Cross-subsystem references in prose use the full path:
subsystems/auth/modules/session-store.md, not just[[session-store]]. Disambiguation matters when 800 pages exist.
Ask. Propose options. Wait. Once decided, update this file.
This file is co-owned. The schema for a large codebase wiki will evolve more than the smaller variants because the codebase itself evolves more. Expect to add new page types, new lint rules, and new workflows as you discover what your specific codebase needs. Update this file in the same session whenever a new convention is decided.