feat(health): knowledge health dashboard with auto-fix + review queue #182

Closed
bk-ty wants to merge 51 commits into kenforthewin:main from bk-ty:feat/health-dashboard

bk-ty commented May 2, 2026

Knowledge Health Dashboard

Introduces a comprehensive health system for the knowledge base: scheduled + on-demand checks, tiered auto-fix pipeline with undo, a Review Queue UI for manual decisions, and configurable per-check weighting.

Runs fully in atomic-core; both the Tauri app and atomic-server expose the same surface through their existing thin wrappers.

What this adds

Checks (atomic-core::health)

Twelve built-in checks in two categories, plus user-defined custom checks:

Integrity (score-contributing by default)

  • embedding_coverage — atoms missing embeddings
  • tagging_coverage — atoms missing tags
  • orphan_tags — tags with zero atoms
  • source_uniqueness — duplicate source URLs
  • semantic_graph_freshness — stale semantic_edges
  • tag_health — rootless tags, single-atom tags, duplicates
  • broken_internal_links — [[wikilink]] + [text](path) resolution
  • content_overlap — duplicate content via content_hash

Opinionated (informational by default, opt-in to score)

  • wiki_coverage, content_quality, contradiction_detection, boilerplate_pollution

Custom checks — user-defined rules via health/custom.rs (tag required/forbidden, tag cardinality, source domain patterns). UI lives in CustomChecksPanel.tsx.

Weighted mean score 0–100 → healthy / needs_attention / degraded / unhealthy. Per-DB HealthConfig lets users disable checks or override weights (including lifting informational checks into the score).

Auto-fix pipeline

Four safety tiers: Safe (retry pipelines), Low (delete orphan tags, generate missing wikis), Medium (modify content, opt-in via include_medium), High (requires explicit user action).

  • Every action is logged to health_fix_log and undoable via undo_health_fix.
  • 10-second in-UI undo toast after each batch run.
  • Dismissal system (health_dismissals) lets users suppress individual items with GC (dismissal_gc_task).
  • LLM-assisted fixes (llm_fixes.rs) for tag merge/rename/reparent/delete and boilerplate rewrites — all proposed, never executed without review.
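
The tier gating described above could look roughly like the sketch below. The `FixTier` variant names come from the description; `tier_allowed` and its exact policy (High never runs in a batch) are assumptions for illustration, not the code in `fixes.rs`.

```rust
/// Safety tiers for auto-fix actions, ordered from least to most risky.
/// Variant names follow the PR description; everything else is assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum FixTier {
    Safe,
    Low,
    Medium,
    High,
}

/// Hypothetical gate: may a batch auto-fix run execute an action of this tier?
/// Safe and Low always run, Medium only when the caller opts in,
/// High is left to explicit user action.
pub fn tier_allowed(tier: FixTier, include_medium: bool) -> bool {
    match tier {
        FixTier::Safe | FixTier::Low => true,
        FixTier::Medium => include_medium,
        FixTier::High => false,
    }
}
```

Deriving `Ord` keeps the tiers comparable, so a caller could also express the same policy as "run everything up to a maximum tier".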

Review Queue UI

  • /health page with two tabs: Overview & Review Queue and Tag Structure.
  • Per-check expand rows with examples, counts, and per-check run/exclude controls.
  • Severity / fixability / sort filters.
  • Tag proposals (LLM-generated tree reorganizations) with apply/dismiss.
  • Keyboard shortcuts (r refresh, f fix, e export, ? help, 1-9 expand rows).

Schema changes

Migrations V17-V21 (renumbered from upstream V16 autotag during rebase):

  • health_reports, health_fix_log (V17)
  • atom_chunks.content_hash (V18)
  • health_dismissals (V19)
  • tag_proposals (V20)
  • atoms.is_locked (V21)

All additive, idempotent (guarded by pragma_table_info / IF NOT EXISTS).

LATEST_VERSION bumped to 21.

API surface

New endpoints on atomic-server:

  • GET /api/health/knowledge — full report
  • POST /api/health/fix — batch auto-fix with mode + tier
  • POST /api/health/check/:name — run a single check
  • GET/POST /api/health/config — per-DB config
  • GET/POST /api/health/dismissals
  • GET /api/health/custom-checks / PUT /api/health/custom-checks
  • POST /api/health/undo-fix/:fix_id
  • GET /api/health/tag-proposals / POST /api/health/tag-proposals/:id/apply

Tauri exposes the same commands through the existing HttpTransport sidecar.

Architecture notes

  • All logic in atomic-core per AGENTS.md. Server routes are thin: parse request → core.health_*() → JSON.
  • Per-DB state uses core.storage() directly, not core.get_settings(), to bypass the registry fall-through (see AGENTS.md "Multi-DB Gotchas").
  • Background tasks (dismissal GC, health snapshot task) iterate manager.list_databases() and key locks on (task_id, db_id) to match the existing per-DB scheduler pattern.
  • Postgres parity: health dispatch methods in storage/mod.rs return explicit DatabaseOperation errors on the Postgres arm, so callers surface "not yet supported" rather than seeing silently empty reports. Three content-hash fallbacks return empty results with inline rationale (the column doesn't exist in the Postgres schema yet, so "no boilerplate detected" is a correct no-op).
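
As an illustration of keying locks on `(task_id, db_id)`, here is a minimal in-memory sketch. `TaskLocks` and its methods are hypothetical names; the existing scheduler may well use a different mechanism.

```rust
use std::collections::HashSet;
use std::sync::Mutex;

/// Hypothetical lock table: one entry per (task_id, db_id) pair that is
/// currently running, so the same task can run on two databases at once
/// but never twice on the same database.
pub struct TaskLocks {
    held: Mutex<HashSet<(String, String)>>,
}

impl TaskLocks {
    pub fn new() -> Self {
        Self { held: Mutex::new(HashSet::new()) }
    }

    /// Returns true if the lock was free and is now held by the caller.
    pub fn try_acquire(&self, task_id: &str, db_id: &str) -> bool {
        self.held
            .lock()
            .unwrap()
            .insert((task_id.to_string(), db_id.to_string()))
    }

    /// Releases the lock; releasing a lock that is not held is a no-op.
    pub fn release(&self, task_id: &str, db_id: &str) {
        self.held
            .lock()
            .unwrap()
            .remove(&(task_id.to_string(), db_id.to_string()));
    }
}
```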

Verification

cargo check  -p atomic-core -p atomic-server                       clean
cargo check  -p atomic-core -p atomic-server --features postgres   clean
cargo test   -p atomic-core -p atomic-server                       445 passed (337 core + 108 server)
cargo clippy -p atomic-core --lib --no-deps                        only pre-existing warnings
npx tsc --noEmit                                                   clean
npx vitest run                                                     115 / 115

Files

  • crates/atomic-core/src/health/ — 15 files (including mod, types, score, checks, fixes, llm_fixes, custom, audit, task, link_resolution, gc_task, tests)
  • crates/atomic-core/src/storage/sqlite/health.rs — SQLite queries
  • crates/atomic-core/src/storage/mod.rs — dispatch + Postgres error arms
  • crates/atomic-server/src/routes/health.rs — REST routes
  • src/components/health/ — 7 files (HealthPage, HealthPanel, HealthConfigTab, TagStructureTab, CustomChecksPanel, WikiExclusionPanel, tests)
  • src/components/dashboard/widgets/review/ — Review Queue components
  • Migrations: crates/atomic-core/src/db.rs
  • Docs: docs/plans/2026-05-01-health-review-queue-audit/audit.md, docs/plans/2026-05-01-frontend-health-audit/audit.md

Migration / upgrade notes

Existing databases auto-migrate on next open. No data changes beyond new tables/columns. Safe to roll back by reopening on a pre-v1.36 binary (migrations are additive).

bk-ty added 30 commits May 2, 2026 12:39
…hboard

- Add previous_score and previous_check_scores fields to HealthReport struct
  with serde skip_serializing_if so existing stored JSON deserializes cleanly
- Populate those fields in compute_health by fetching the latest stored report
  before writing the new one (enables per-check and overall score trending)
- Export getTrend helper from HealthCheckRow.tsx; add trend and severityBadge
  optional props to HealthCheckRowProps; render trend arrow and severity emoji
  in the check row header
- Add FilterState type with severity/fixable/sort dimensions and DEFAULT_FILTER
- Add getSeverityBadge and getVisibleChecks helpers to HealthWidget.tsx
- Wire filter state into HealthPanel: issueChecks now uses getVisibleChecks,
  filter bar (severity / fixable / sort + Clear button) shown above check list
- Overall score header shows trend arrow coloured green/red/grey based on delta
- HealthCheckRow render passes trend and severityBadge from report data

Verified: cargo check -p atomic-core, cargo check -p atomic-server, npx tsc --noEmit all pass
- Add AtomPreview, BoilerplateAtomEntry, ContradictionAtom,
  ContradictionPairEntry, RootlessTagEntry to health/mod.rs
- HealthRawData: no_source_atoms → Vec<AtomPreview>,
  boilerplate_affected_atoms → Vec<BoilerplateAtomEntry>,
  add contradiction_pairs: Vec<ContradictionPairEntry>,
  add rootless_tag_list: Vec<RootlessTagEntry>
- Storage queries enriched:
  - no_source: fetch content+created_at, build AtomPreview
  - rootless_tags: fetch id/name/atom_count list, count from list
  - boilerplate: JOIN atoms, return title+clone_count
  - contradiction: fetch actual atom pairs (0.80–0.92 similarity),
    build ContradictionPairEntry with titles and sources
- checks.rs: surface rich objects in JSON output for no_source,
  boilerplate_affected_atoms, contradiction pairs, rootless_tag_list
- Add health/tests.rs with 30 unit tests covering all check fns
  and aggregate_score
Strips shared boilerplate chunks from semantic search (vec_chunks) while
preserving them in atom_chunks for FTS and display.

- Add crates/atomic-core/src/boilerplate.rs with normalize_for_dedup,
  content_hash, and boilerplate_indices (with all-boilerplate fallback)
- V17 migration: adds content_hash TEXT column + index on atom_chunks
- save_chunks_for_atom: stores content_hash, skips vec_chunks for empty embeddings
- SqliteStorage: count_chunk_hash_occurrences_impl, delete_vec_chunks_by_ids_impl,
  backfill_content_hashes_impl
- StorageBackend: dispatch wrappers for all three new methods
- process_embedding_only_inner: partition chunks, embed only non-boilerplate,
  save all to atom_chunks (boilerplate with empty vec)
- process_existing_chunk_reembedding_batch_inner: detect and remove boilerplate
  vec_chunks entries in the re-embed path
- Tests: 8 unit tests in boilerplate.rs + 1 integration test in health/tests.rs
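
For readers unfamiliar with the approach, the three helpers named above might be sketched as follows. Only the function names come from this commit; the normalization rules, the use of `DefaultHasher`, the `min_occurrences` threshold, and the meaning of the all-boilerplate fallback (embed everything rather than nothing) are assumptions.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Lowercases and collapses whitespace so trivially different copies match.
pub fn normalize_for_dedup(text: &str) -> String {
    text.split_whitespace()
        .map(|w| w.to_lowercase())
        .collect::<Vec<_>>()
        .join(" ")
}

/// Hash of the normalized text. DefaultHasher is a stand-in here; a real
/// implementation would want a hash that is stable across Rust versions.
pub fn content_hash(text: &str) -> String {
    let mut hasher = DefaultHasher::new();
    normalize_for_dedup(text).hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}

/// Indices of chunks whose hash occurs at least `min_occurrences` times
/// across the database. If every chunk would be flagged, nothing is flagged,
/// so the atom still gets embeddings (the assumed fallback).
pub fn boilerplate_indices(
    chunks: &[&str],
    occurrences: &HashMap<String, usize>,
    min_occurrences: usize,
) -> Vec<usize> {
    let flagged: Vec<usize> = chunks
        .iter()
        .enumerate()
        .filter(|(_, c)| occurrences.get(&content_hash(c)).copied().unwrap_or(0) >= min_occurrences)
        .map(|(i, _)| i)
        .collect();
    if flagged.len() == chunks.len() { Vec::new() } else { flagged }
}
```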

cargo check -p atomic-core: clean (warnings only)
cargo test -p atomic-core -- boilerplate: 13 passed
cargo test -p atomic-core -- health: 32 passed
…erplate tabs

- Add shared types/applyFix helper in review/types.ts
- NoSourceRow: add_source URL input + mark_intentional dismissal
- TagRootlessRow: move_under parent select + dismiss
- BoilerplateAtomRow: reembed button with status feedback
- Refactor BoilerplateSection, ContentQualitySection, TagHealthSection to use new
  row components with per-section local removal state (items disappear on resolve)
- TagHealthSection uses useTagsStore to populate parent dropdown
- Add afterEach(cleanup) to NoSourceRow tests to prevent DOM bleed
- All 14 tests pass, tsc --noEmit clean
…ed overlap list

- Add local report state with re-sync effect so Re-scan can mutate per-tab
- Add rescanTab() calling health_check_single and splicing result into state
- Add TabHeader subcomponent: Re-scan button, last-scanned timestamp (relative),
  resolved-today counter with progress bar
- Replace single resolvedCount with per-tab resolvedByTab map persisted to
  localStorage keyed by active DB id with automatic daily reset
- Derive scalar resolvedCount from the map to keep header sub-text working
- Replace setResolvedCount calls with bumpResolved(checkName) in applyPairFix,
  BoilerplateSection, ContentQualitySection, TagHealthSection
- Add VirtualizedPairList using @tanstack/react-virtual when >50 overlap pairs
- Add RefreshCw to lucide-react imports; add useRef + useVirtualizer imports
- Add 2 new tests: Re-scan button render, boilerplate tab counter smoke test
…e refactor

- Refactor apply_manual_fix into shared apply_manual_fix_impl (returns Result)
  so both single and batch handlers share logic without duplication; all
  HttpResponse::BadRequest branches converted to AtomicCoreError::Validation
- Add POST /api/health/fix/batch: processes items sequentially, returns
  per-item {check, item_id, ok, error?} result map
- Add strip_boilerplate_atom to atomic-core health/llm_fixes: prompts LLM to
  remove template boilerplate, dry_run supported, logs audit fix on write
- Add POST /api/health/strip-boilerplate/{atom_id} route handler
- Register both new routes in configure_routes
- Add health_strip_boilerplate and health_fix_batch to command-map.ts
- Add health_batch_tests.rs integration test for multi-dismiss semantics
…late, markdown export, debounced sync

- Add multi-select checkboxes + sticky bulk-action footer to all 5 review sections
  (content_overlap, boilerplate_pollution, contradiction_detection, content_quality, tag_health)
- Wire bulk actions to new health_fix_batch endpoint for each section
- Add Strip… button to BoilerplateAtomRow: dry_run preview with lineDiff rendering,
  Apply strip calls health_strip_boilerplate, Cancel dismisses preview
- Add Clipboard icon button in modal header that renders active tab as markdown
  and writes to navigator.clipboard; swaps to Check on success (copiedFlash)
- Replace immediate fetchHealth callback on resolution with 2-second debounced
  scheduleRefetch; force-refetch on modal close cancels pending debounce
- ContradictionSection now accepts onResolved and propagates bumpResolved
- Tests: MarkdownExport.test.tsx, extended PairRow.test.tsx (checkbox batch),
  extended BoilerplateAtomRow.test.tsx (strip preview flow)
…rver chat override

- Add FixedChatResponder struct to test support: returns a fixed content
  string for any /v1/chat/completions request, with counter tracking.
- Add MockAiServer::mock_chat_completion(content) that mounts the
  FixedChatResponder with priority 1 (highest), overriding the default
  ChatResponder which returns {} for non-schema requests.
- Add strip_boilerplate_tests.rs with 3 tests:
  - dry_run_does_not_mutate: verifies proposed content returned, atom
    content unchanged in the database.
  - apply_mutates_and_logs_fix: verifies atom updated to LLM response
    and FixAction returned.
  - rejects_empty_response: verifies EMPTY response triggers an error
    containing 'boilerplate' or 'empty'.
…actions

- Storage: replace count_similar_name_pairs with collect_similar_name_pairs
  returning Vec<(id_a, name_a, id_b, name_b)>; derive count from list len
- HealthRawData: add similar_name_pairs_list field
- checks::tag_health: emit similar_name_pair_list JSON array with pair_id
- apply_dismissals: filter similar_name_pair_list by pair_id in tag_health branch
- atomic-server health routes: add merge_tags action for tag_health
  (item_id = 'a_id__b_id', winner via into_tag_id; auto-dismisses pair after merge)
- HealthReviewModal: render similar-name pair rows with Keep A / Keep B / Ignore
  buttons; pair checkboxes wired into bulk-dismiss footer; tab visible when
  similar pairs > 0 (not just rootless tags)
- Tests: Rust unit test for similar_name_pairs_list shape; 4 vitest cases covering
  merge Keep A, merge Keep B, ignore, and render
…with single-atom tags

- Add BrokenLinksSection component (review/BrokenLinksSection.tsx) with:
  - Per-atom grouping with header showing title + broken link count
  - Per-link rows: raw (monospace), target (dimmer), kind pill (wikilink=purple/markdown=gray)
  - Remove link button dispatches apply_health_item_fix(action=remove_link, content=link.raw)
  - Ignore button per link, Ignore atom button dismisses all atom's links
  - Batch selection with bulk Ignore N footer
- Add broken_internal_links tab to HealthReviewModal (label: 'Broken links')
- Extend TagHealthSection with single_atom_tag_list sub-section:
  - Autotag rows: Delete button (dispatches delete_tag) + Ignore
  - Manual tag rows: Merge into... select + Merge button (dispatches merge_into_parent) + Ignore
  - Shared bulk-dismiss selection works across all sub-sections
- Extend copyAsMarkdown to handle broken_internal_links tab
- Add BrokenLinksSection.test.tsx and TagHealthSection.test.tsx (all 10 test files pass)
- Replace heavy card layout (p-2.5 bg card) with flat border-b rows
- Remove opacity-0 group-hover wrapper — buttons always visible
- Add Link… button per row opening inline search picker
  - Seeds query with link.target on open
  - Debounced call to health_broken_link_suggest (200ms)
  - Shows suggestions; clicking dispatches apply_health_item_fix relink
  - Cancel button closes picker
- Button order: [Link…] [Remove link] [Ignore], px-1.5 py-0.5 text-[11px]
- Add Suggestion type
- Extend tests: button visibility, Link… picker open/prefill,
  suggestions render, relink dispatch, existing tests updated for
  always-visible buttons (no hover needed)
- GET /api/health/broken-link-suggest?q=<query>&limit=N
  Returns scored atom candidates for a broken link query.
  Search strategy: exact source_url suffix match (1.0), title
  prefix (0.8), title contains (0.6), content LIKE on first 80
  chars (0.4). Deduped by atom_id, sorted desc, capped at 20.

- POST /api/health/fix/{check}/{item_id} with action=relink
  Rewrites a broken link in-place to atom://<target_atom_id>.
  Accepts content=link_raw, into_tag_id=target_atom_id.
  Logs audit trail at tier medium.

Storage: suggest_atoms_by_query_impl / _sync in sqlite/health.rs.
Facade: AtomicCore::suggest_atoms_for_broken_link exposed publicly.
title_preview promoted to pub(crate) for use in fixes.rs.
Frontend: health_broken_link_suggest entry added to command-map.ts.
Tests: 2 new integration tests (suggest + relink) — all 49 health
tests pass.
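
The tiered scoring for broken-link suggestions can be illustrated in memory. This is a sketch under assumptions: the real implementation runs as SQLite queries, `Atom`, `Suggestion`, `score_atom`, and `suggest` are hypothetical names, and the content tier is read here as "query appears within the first 80 characters".

```rust
#[derive(Debug, Clone)]
pub struct Atom {
    pub id: String,
    pub title: String,
    pub source_url: String,
    pub content: String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Suggestion {
    pub atom_id: String,
    pub score: f64,
}

/// Best matching tier for one atom; `q` must already be lowercased.
fn score_atom(atom: &Atom, q: &str) -> Option<f64> {
    let title = atom.title.to_lowercase();
    let head = atom.content.chars().take(80).collect::<String>().to_lowercase();
    if atom.source_url.to_lowercase().ends_with(q) {
        Some(1.0) // source_url suffix match
    } else if title.starts_with(q) {
        Some(0.8) // title prefix
    } else if title.contains(q) {
        Some(0.6) // title contains
    } else if head.contains(q) {
        Some(0.4) // match in the first 80 chars of content
    } else {
        None
    }
}

/// Scores each atom once (so results are unique per atom id as long as the
/// input ids are unique), sorts by score descending, and caps at 20.
pub fn suggest(atoms: &[Atom], query: &str, limit: usize) -> Vec<Suggestion> {
    let q = query.trim().to_lowercase();
    if q.is_empty() {
        return Vec::new();
    }
    let mut out: Vec<Suggestion> = atoms
        .iter()
        .filter_map(|a| score_atom(a, &q).map(|score| Suggestion { atom_id: a.id.clone(), score }))
        .collect();
    out.sort_by(|a, b| b.score.partial_cmp(&a.score).unwrap_or(std::cmp::Ordering::Equal));
    out.truncate(limit.min(20));
    out
}
```
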
scan_back_for_display_text always returned String::new() for the
original field, causing every markdown broken link to have raw: ""
in the BrokenLinkDetail payload.

When the frontend called apply_health_item_fix with action relink,
it sent content: "" which failed the server-side validation guard
(Some(c) if !c.trim().is_empty()), surfacing as:
  Validation error: content (link_raw) is required for relink

Fix: pass end_pos (the closing ) index) to scan_back_for_display_text
so it can slice bytes[k..end_pos+1] for the full [display](href) span.
Add regression tests for both mid-content and position-0 edge cases.
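
A simplified illustration of the corrected slicing: given the byte index of the closing `)`, recover the whole `[display](href)` span. `markdown_link_span` is a hypothetical stand-in, not the real `scan_back_for_display_text`, and it ignores nested brackets.

```rust
/// Returns the full `[display](href)` span that ends at `end_pos`,
/// where `end_pos` is the byte index of the closing ')'.
/// Simplified: does not handle nested brackets or escaped characters.
pub fn markdown_link_span(content: &str, end_pos: usize) -> Option<&str> {
    if content.as_bytes().get(end_pos) != Some(&b')') {
        return None;
    }
    // Find the "](" separating display text from href, searching backwards.
    let sep = content[..end_pos].rfind("](")?;
    // Find the '[' that opens the display text.
    let start = content[..sep].rfind('[')?;
    // Inclusive of the closing ')', mirroring bytes[k..end_pos + 1].
    Some(&content[start..=end_pos])
}
```
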
Migrate remaining health-dashboard row handlers to the runReviewAction
helper so every failure surfaces a sonner toast with a Retry action
instead of an inline red <p> (which was already removed for the broken-
links row). applyFix() on types.ts now takes a human label and pipes
through runReviewAction.

Migrated sites:
- BoilerplateAtomRow: re-embed + apply-strip
- NoSourceRow: save-source + mark-intentional
- TagRootlessRow: move-under + dismiss
- HealthReviewModal PairRow: keep_a / keep_b / merge_with_edited_content
  (error useState removed; inline error display deleted)
- HealthReviewModal ContradictionRow: defer + summary
- HealthReviewModal TagHealthSection: mergeInto, ignorePair,
  delete/merge/dismiss single-atom tags
- HealthWidget.undoLastFix

Updated NoSourceRow test to assert on sonner.error mock instead of
in-DOM red text.

Verification: tsc clean, 91/91 vitest pass, cargo check clean.
…adiction pairs

Add three new LLM-powered health fix functions:

- verify_overlap_pair: ask LLM if two flagged atoms are truly duplicate or a
  false positive; auto-dismisses under content_overlap on false positive
- verify_contradiction_pair: ask LLM if two flagged atoms truly contradict;
  auto-dismisses under contradiction_detection on false positive
- merge_contradicting_pair: reconcile two contradicting atoms into one
  document using wiki_model; writes to newer atom, deletes older, dismisses pair

Route wiring (reuse existing /api/health/fix/{check}/{item_id}):
- (content_overlap|duplicate_detection, verify_with_llm)
- (contradiction_detection, verify_with_llm)
- (contradiction_detection, merge_with_llm)

New batch route POST /api/health/verify/{check} with handler verify_batch_handler.
Accepts {item_ids, max?}, returns {checked, kept, dismissed_ids}.

Command map: health_verify_batch

Tests: three wiremock-based unit tests in health::llm_tests covering
verify_overlap_pair false-positive, verify_contradiction_pair false-positive,
and merge_contradicting_pair dry-run path.
Add auto_resolve_broken_link / auto_resolve_all_broken_links to
crates/atomic-core/src/health/llm_fixes.rs:
- Fetch up to 8 candidates via suggest_link_targets (new pub(crate) helper
  that wraps suggest_atoms_by_query_sync)
- Prompt the wiki_model with a structured JSON request (512 tokens)
- Relink if confidence >= 0.6, else Skipped; no candidates -> Removed
- Both outcomes are audit-logged via audit::log_fix
- AutoResolveOutcome (Relinked/Removed/Skipped) + AutoResolveBatchResult
  types added and serde-derived
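
The decision rule above, reduced to a pure function. The `AutoResolveOutcome` variant names and the 0.6 threshold come from this commit; the variant fields, `LlmPick`, and `decide` are assumptions made for the sketch.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum AutoResolveOutcome {
    Relinked { target_atom_id: String },
    Removed,
    Skipped { reason: String },
}

/// Hypothetical shape of the parsed LLM answer.
pub struct LlmPick {
    pub target_atom_id: String,
    pub confidence: f64,
}

pub const MIN_CONFIDENCE: f64 = 0.6;

/// No candidates: the link is removed. Otherwise relink only when the
/// model is confident enough, and skip in every other case.
pub fn decide(candidate_count: usize, pick: Option<LlmPick>) -> AutoResolveOutcome {
    if candidate_count == 0 {
        return AutoResolveOutcome::Removed;
    }
    match pick {
        Some(p) if p.confidence >= MIN_CONFIDENCE => {
            AutoResolveOutcome::Relinked { target_atom_id: p.target_atom_id }
        }
        Some(_) => AutoResolveOutcome::Skipped { reason: "low confidence".to_string() },
        None => AutoResolveOutcome::Skipped { reason: "no pick returned".to_string() },
    }
}
```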

Server routes (crates/atomic-server/src/routes/):
- New action "auto_resolve" in apply_manual_fix_impl for
  broken_internal_links (link_raw from content field)
- POST /api/health/broken-links/auto-resolve-all -> broken_links_auto_resolve_all
  handler, body {max?: u32} (default 25)

Frontend command map:
- health_broken_links_auto_resolve_all entry added

Also fix pre-existing compile errors in the working tree:
- AtomicCoreError::Serialization -> ::Validation in sqlite/health.rs
- storage().get_latest_tag_proposal_sync() -> get_latest_tag_proposal()
  (method already existed at line 2171 of lib.rs)
- Missing .await on set_setting_sync call in tests.rs

Tests: 3 new integration tests in broken_link_auto_resolve_tests.rs
  - relinks when LLM confidence is 0.9
  - skips when LLM confidence is 0.3
  - removes (or skips) when no candidates found
All pass.
…orage wrappers

- health/mod.rs: TagProposal + TagProposalAction enum (merge/rename/reparent/delete)
- db.rs: V19 migration creates tag_proposals(id, summary, actions_json, created_at, applied_at)
- storage/sqlite/health.rs: save/get/get_latest/mark_applied impl methods
- storage/mod.rs: async StorageBackend wrappers for the four proposal ops
- lib.rs: pub get_latest_tag_proposal() surface method
- health/llm_fixes.rs: propose_tag_restructure() + apply_tag_proposal()
- routes/health.rs: create_tag_proposal / apply_tag_proposal / get_latest_tag_proposal handlers
- routes/mod.rs: three new routes (POST /health/tag-proposal, GET .../latest, POST .../apply)
- command-map.ts: health_tag_proposal_create / health_tag_proposal_latest / health_tag_proposal_apply
- tests.rs: test_propose_tag_restructure_parses_and_persists (wiremock, parse+persist round-trip)
## New /health page
- Nav button (HeartPulse) in main top bar between Dashboard and Atoms
- /health route wired through ViewMode, viewPath, parseLocation
- HealthPage with Overview & Review Queue + Tag Structure tabs
- Full HealthPanel content lifted into page (no modal needed)

## Dashboard card replaced
- Old full-width HealthPanel swapped for compact HealthSummaryCard
- Shows score, issues-to-review count, auto-fixable count
- 'Open Knowledge Health →' button switches to /health
- Live-updates via health-updated WS events

## LLM-powered actions wired end-to-end
- Content overlap: per-pair 'Verify with LLM' + tab-level 'Verify all pairs (LLM)'
  False positives dismissed automatically by backend (via Worker A)
- Contradiction detection: per-pair 'Verify (LLM)' + primary 'Resolve (LLM)'
  Resolve merges both atoms via LLM using reconciliation prompt
- Broken links: per-link 'Auto-fix (LLM)' + tab-level 'Auto-fix all broken links'
  LLM picks best suggest target or decides to remove; batch variant returns
  counts (relinked/removed/skipped) surfaced in toast
- Tag structure: full TagStructureTab with propose/apply workflow
  Checkbox per action, select-all, virtualized for >20 actions, sticky apply bar
  Uses health_tag_proposal_create/_latest/_apply from Worker B

## BrokenLinksSection rewrite (fixes cramped layout)
- Per-atom card with atom title header + per-link rows
- Generous padding; 4-button row: Auto-fix (LLM) | Link to… | Remove | Ignore
- Inline suggest picker with debounced search
- Toast summary on batch auto-fix

## Tests
- HealthPage: 4 tests (nav, tabs, rendering)
- TagStructureTab: 6 tests (empty state, generate, apply)
- BrokenLinksSection: 10 tests total (remove, ignore, relink, picker, auto-fix, auto-fix-all)
- PairRow: 10 tests (Keep A/B, Merge with LLM, Verify, Merge edited)

Verification: tsc clean, 106/106 vitest, 56/56 cargo health tests.
Models often wrap structured output in ```json fences (even when the
prompt says to return bare JSON). serde_json then fails with
'expected value at line 1 column 1' and we show a validation error
to the user.

Add strip_llm_json_fences() helper that:
- trims whitespace
- strips ```json / ```JSON / ``` fence prefixes + suffixes
- falls back to extracting the outermost {...} or [...] block
  if the model wraps the JSON in prose
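
A std-only sketch of such a helper, following the three steps listed above. It is an approximation written for this note, not the code in `llm_fixes.rs`, and the brace extraction is deliberately naive (first opener to last matching closer).

```rust
/// Strips Markdown code fences and surrounding prose from an LLM reply,
/// returning the slice that most likely holds the JSON payload.
pub fn strip_llm_json_fences(raw: &str) -> &str {
    let mut s = raw.trim();

    // Step 1: drop a leading ``` fence, an optional json/JSON tag,
    // and a trailing ``` fence.
    if let Some(rest) = s.strip_prefix("```") {
        let rest = rest.trim_start();
        let has_tag = rest.get(..4).map_or(false, |t| t.eq_ignore_ascii_case("json"));
        let rest = if has_tag { &rest[4..] } else { rest };
        let rest = rest.trim();
        s = rest.strip_suffix("```").unwrap_or(rest).trim();
    }

    // Step 2: if prose surrounds the payload, keep only the outermost
    // {...} or [...] block.
    if let Some(start) = s.find(|c: char| c == '{' || c == '[') {
        let close = if s.as_bytes()[start] == b'{' { '}' } else { ']' };
        if let Some(end) = s.rfind(close) {
            if end > start {
                s = &s[start..=end];
            }
        }
    }
    s
}
```

Returning a borrowed `&str` avoids an allocation; the caller can hand the result straight to its JSON parser.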

Apply it to all 4 JSON parse sites in health/llm_fixes.rs:
- verify_overlap_pair
- verify_contradiction_pair
- auto_resolve_broken_link
- propose_tag_restructure

Tests (8 new, 64 total health tests pass):
- 5 unit tests for strip_llm_json_fences (plain, fenced, bare,
  prose-wrapped, roundtrip-parse)
- 3 regression tests with fenced wiremock responses for
  verify_overlap, verify_contradiction, propose_tag_restructure
…m checklist

Addresses remaining items from the health-dashboard UI improvement plan.

Backend:
- boilerplate_pollution now scores 100 - min(3*count, 50) clamped to [50,100]
  so the row score reflects detected issues instead of always 100. Still
  excluded from overall KB weights; test updated accordingly.
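
The scoring rule written out as a function, to make the floor explicit. The function name and the `u32` count type are assumptions.

```rust
/// 3 points off per affected atom, capped at a 50-point penalty,
/// so the score always stays within [50, 100].
pub fn boilerplate_pollution_score(affected_count: u32) -> u32 {
    let penalty = affected_count.saturating_mul(3).min(50);
    // The clamp is redundant given the capped penalty, but it documents the range.
    (100 - penalty).clamp(50, 100)
}
```

With this rule the floor is reached at 17 affected atoms (3 × 17 = 51, capped to 50).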

UI:
- HealthCheckRow: trend/severity tooltips surface the full text
  ("Was 55, now 60 (+5 since last scan)", "Warning — score 41–70").
  Critical (<41) dots pulse. New props: previousScore, lastCheckedAt,
  disableRun (dims ▶ during global scan), justUpdated (row flash),
  examples (up to 2 shown in expanded body).
- HealthConfirmModal rewritten as per-category checkbox list with dynamic
  Apply button label, disabled manual-only subsection, Esc/Enter bindings.
- HealthWidget:
  - tracks sessionStartScore, lastCheckedAt, recentlyUpdated (flash set),
    globalScanInFlight (disables per-row ▶ during refresh)
  - runSingleCheck stamps timestamp and flashes row for 1.2s
  - shows score delta "+N today" next to overall score
  - filter bar shows "Showing N of M categories" counter (aria-live)
  - manualOnlyCategories() + extractExamples() helpers feed the new UI
  - applyFix(selectedChecks?) receives explicit subset from modal

Tests: 64/64 cargo health, 17/17 vitest files, tsc clean.
…sion

Responds to user feedback that the prior health heuristics were too opinionated:
'not all knowledge bases want wikis per tag; contradictions aren't inherently
bad; long atoms aren't lower quality; no source isn't an inherent problem;
automated fixes shouldn't mutate books or studies.'

Four shifts:

## 1. Sane defaults — opinionated checks are informational

New HealthCheckResult.informational field. When true:
  * the check still runs and is displayed, but contributes 0 weight to the
    overall score unless the user opts in.
  * marks wiki_coverage, content_quality, contradiction_detection,
    boilerplate_pollution.

Rebalanced CHECK_WEIGHTS to the universal-integrity checks only:
  embedding_coverage 0.20, tagging_coverage 0.20, orphan_tags 0.15,
  source_uniqueness 0.10, semantic_graph_freshness 0.10, tag_health 0.10,
  broken_internal_links 0.10, content_overlap 0.05.

## 2. Per-DB HealthConfig

New HealthConfig { overrides: Map<String, HealthCheckOverride { enabled,
weight }> } stored as JSON under the per-DB 'health_config' setting key
(NOT registry — AGENTS.md § Multi-DB Gotchas). aggregate_score takes
Option<&HealthConfig>; compute_health loads it and filters+weights
accordingly. Settings → new 'Configure' tab in /health with per-check
enable toggles + weight overrides + 'Reset to defaults'. Informational
checks are flagged in the UI so users know what they're opting in to.

Routes: GET/PUT /api/health/config.
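
How the default weights, the informational flag, and per-DB overrides might combine into one weighted mean is sketched below. The weights are the ones listed in section 1; the field types, the `Option<f64>` result for "nothing to score", and the override precedence are assumptions rather than the actual `aggregate_score`.

```rust
use std::collections::HashMap;

pub struct CheckResult {
    pub name: &'static str,
    pub score: f64,          // 0.0..=100.0
    pub informational: bool, // weight 0 unless the user opts in
}

/// Assumed shape: `weight: None` means "keep the default".
pub struct HealthCheckOverride {
    pub enabled: bool,
    pub weight: Option<f64>,
}

pub struct HealthConfig {
    pub overrides: HashMap<String, HealthCheckOverride>,
}

/// Default weights as listed in the commit message (they sum to 1.0).
pub fn default_weight(name: &str) -> f64 {
    match name {
        "embedding_coverage" | "tagging_coverage" => 0.20,
        "orphan_tags" => 0.15,
        "source_uniqueness" | "semantic_graph_freshness" | "tag_health"
        | "broken_internal_links" => 0.10,
        "content_overlap" => 0.05,
        _ => 0.0,
    }
}

/// Weighted mean over enabled checks; None when no check carries weight.
pub fn aggregate_score(results: &[CheckResult], config: Option<&HealthConfig>) -> Option<f64> {
    let (mut weighted_sum, mut weight_total) = (0.0_f64, 0.0_f64);
    for r in results {
        let ov = config.and_then(|c| c.overrides.get(r.name));
        if ov.map_or(false, |o| !o.enabled) {
            continue; // disabled checks are skipped entirely
        }
        let base = if r.informational { 0.0 } else { default_weight(r.name) };
        let weight = ov.and_then(|o| o.weight).unwrap_or(base).max(0.0);
        weighted_sum += r.score * weight;
        weight_total += weight;
    }
    if weight_total > 0.0 { Some(weighted_sum / weight_total) } else { None }
}
```

Dividing by the sum of the weights actually used means disabling a check rescales the rest instead of dragging the score down.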

## 3. Locked atoms — lockable source-of-truth material

V20 migration: atoms.is_locked BOOLEAN DEFAULT 0. Atom struct carries it
(#[serde(default)] preserves forward-compat). ATOM_COLUMNS/COLUMNS_A
include it; atom_from_row reads with i64→bool coerce.

core.set_atom_locked / is_atom_locked — direct SqliteStorage ops via
as_sqlite(); Postgres returns Configuration error (matches existing
pattern for obsidian-import).

Fix paths that mutate content now guard on is_locked:
  * strip_boilerplate_atom → returns Validation('locked')
  * merge_contradicting_pair → returns Validation('one or both atoms locked')
  * auto_resolve_broken_link → returns Skipped { reason: 'locked' } so
    batch flows continue past locked atoms instead of failing.
  * fix_source_uniqueness → skip-with-log, keepers preserved.
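
The difference between the single-item guards (hard validation error) and the batch behaviour (skip and continue) can be sketched like this. `FixError`, `BatchItem`, `ensure_unlocked`, and `run_batch` are hypothetical names; the real paths return `AtomicCoreError` and `AutoResolveOutcome`.

```rust
pub struct Atom {
    pub id: String,
    pub is_locked: bool,
}

#[derive(Debug, PartialEq)]
pub enum FixError {
    Validation(String),
}

/// Guard used by single-item fixes: a locked atom is a hard error.
pub fn ensure_unlocked(atom: &Atom) -> Result<(), FixError> {
    if atom.is_locked {
        Err(FixError::Validation("locked".to_string()))
    } else {
        Ok(())
    }
}

#[derive(Debug, PartialEq)]
pub enum BatchItem {
    Fixed(String),
    Skipped { atom_id: String, reason: String },
}

/// Batch flows turn the same guard into a per-item skip, so one locked
/// atom does not abort the rest of the run.
pub fn run_batch(atoms: &[Atom]) -> Vec<BatchItem> {
    atoms
        .iter()
        .map(|a| match ensure_unlocked(a) {
            Ok(()) => BatchItem::Fixed(a.id.clone()),
            Err(FixError::Validation(reason)) => {
                BatchItem::Skipped { atom_id: a.id.clone(), reason }
            }
        })
        .collect()
}
```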

UI: new lock toggle next to the delete button in AtomReader. Icon switches
Lock/LockOpen; color+tooltip signal state. aria-pressed for a11y.

Route: POST /api/atoms/{id}/lock { locked: bool }.

## 4. Wiki tag exclusion (deterministic retrieval-layer filter)

Per feedback: 'a more deterministic way than altering the wiki prompt.
Settings-level list where tags are excluded from the wiki model even
accessing the atoms.'

New per-DB setting 'wiki_excluded_tag_ids' (JSON array). core wrapper
get/set_wiki_excluded_tag_ids.

batch_fetch_chunk_details_filtered(conn, chunk_ids, excluded_tag_ids)
drops chunks whose atom is tagged with any excluded tag — done post-fetch
so filter survives pagination. Wired into
sqlite/wiki.rs::get_wiki_source_chunks_sync and get_wiki_update_chunks_sync;
both load excluded_tag_ids from settings and forward to
select_chunks_by_centroid (new trailing &[String] param). LLM never sees
excluded-tagged content, regardless of wiki prompt.
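
The retrieval-layer filter boils down to the following, shown here over in-memory data. `ChunkDetail`, the `atom_tags` map, and `filter_excluded` are assumptions for the sketch; the real `batch_fetch_chunk_details_filtered` works against a SQLite connection.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Debug, Clone, PartialEq)]
pub struct ChunkDetail {
    pub chunk_id: String,
    pub atom_id: String,
}

/// Drops every chunk whose atom carries at least one excluded tag.
/// Atoms with no tag entry are kept.
pub fn filter_excluded(
    chunks: Vec<ChunkDetail>,
    atom_tags: &HashMap<String, Vec<String>>,
    excluded_tag_ids: &[String],
) -> Vec<ChunkDetail> {
    if excluded_tag_ids.is_empty() {
        return chunks; // nothing configured: pass everything through
    }
    let excluded: HashSet<&str> = excluded_tag_ids.iter().map(String::as_str).collect();
    chunks
        .into_iter()
        .filter(|c| {
            atom_tags
                .get(&c.atom_id)
                .map_or(true, |tags| !tags.iter().any(|t| excluded.contains(t.as_str())))
        })
        .collect()
}
```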

UI: WikiExclusionPanel in /health → Configure tab. Tag picker via existing
TagSelector, round-trips ids through the API. Toast + save button.

Routes: GET/PUT /api/wiki/excluded-tags { tag_ids: string[] }.

## Tests

+13 cargo tests:
  - test_aggregate_score_excludes_informational_by_default
  - test_aggregate_score_config_lifts_informational_into_scoring
  - test_aggregate_score_config_disabled_check_is_skipped
  - test_set_atom_locked_persists_and_roundtrips
  - test_strip_boilerplate_atom_refuses_locked
  - test_merge_contradicting_pair_refuses_when_either_locked
  - test_auto_resolve_broken_link_skips_locked
  - test_wiki_excluded_tag_ids_roundtrip
  - test_batch_fetch_chunk_details_filtered_drops_excluded_atoms
  - boilerplate_pollution: score-reflects-issues assertions (prior commit)

100/100 cargo wiki+health · 106/106 vitest · tsc clean.
… checks

Continuing feedback loop on knowledge health epistemology:

  > 'a small box that just says what the summary of the health check is on
  >  the dashboard, and then if you click it you get sent to the view'
  > 'a tab in settings where you define the tests/configure testing behavior'
  > 'some tests that are enabled by default, and some others that are not,
  >  as well as the ability to add custom tests'
  > 'making these health metrics generalizable enough across the various
  >  ways people use atomic'

Three pieces:

Entire card is now a single button navigating to /health. Compact horizontal
layout: icon · label+status · score/100+summary · chevron. No redundant
'View details' CTA. Uses aria-label for the composite score+status so screen
readers get a coherent announcement.

New 'Health' tab in Settings alongside General/AI/Tags/Connection/
Integrations/Databases. Renders two sections:

  * Knowledge Health Checks — embeds HealthConfigTab (per-check enable +
    weight overrides). Previously only reachable via /health > Configure;
    now surfaced in the main settings flow where users expect preferences
    to live.
  * Custom Checks — embeds the new CustomChecksPanel (below).

SETTINGS_TABS registry gains id 'health'; SettingsModal handles that arm
with the embedded sub-components. Zero duplication — HealthConfigTab is
reused, not reimplemented.

Users can now declare health rules that reflect their own conventions
without shipping arbitrary SQL or JS from the UI.

Structural enum (NOT free-form query) — every rule variant has a
hard-coded Rust evaluator. Keeps SQL injection and resource-exhaustion
risk bounded: the UI controls parameters, never query shape.

  enum CustomRule {
    TagRequires   { any_of: Vec<String>, required: Vec<String> },
    RequireSource { tag_filter: Option<String> },
    ContentRegex  { pattern: String, invert: bool },
  }

  struct CustomCheck {
    id, label, description, enabled, weight, rule,
  }

Example uses:
  * 'papers must have a source' → TagRequires { any_of: [paper], required: [sourced] }
  * 'every atom tagged "paper" needs a source_url' → RequireSource { tag_filter: paper }
  * 'flag atoms with TODO markers' → ContentRegex { pattern: TODO, invert: false }
  * 'every atom must match a publication regex' → ContentRegex { invert: true }
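
To make the rule semantics concrete, here is a std-only sketch of evaluating two of the variants over in-memory data. `ContentRegex` is left out because it needs the `regex` crate; `Atom`, `evaluate`, and the reading of `TagRequires` as "any atom with one of `any_of` must carry all of `required`" are assumptions, not the evaluators in `health/custom.rs`.

```rust
use std::collections::{HashMap, HashSet};

pub enum CustomRule {
    TagRequires { any_of: Vec<String>, required: Vec<String> },
    RequireSource { tag_filter: Option<String> },
}

pub struct Atom {
    pub id: String,
    pub source_url: Option<String>,
}

/// Cap on flagged atoms per check, as described in the commit.
pub const MAX_FLAGGED: usize = 500;

/// Returns the ids of atoms that violate `rule`, bounded by MAX_FLAGGED.
/// `atom_tags` is a flat list of (atom_id, tag) pairs, bucketed once.
pub fn evaluate(rule: &CustomRule, atoms: &[Atom], atom_tags: &[(String, String)]) -> Vec<String> {
    let mut tags_by_atom: HashMap<&str, HashSet<&str>> = HashMap::new();
    for (atom_id, tag) in atom_tags {
        tags_by_atom.entry(atom_id.as_str()).or_default().insert(tag.as_str());
    }
    let empty: HashSet<&str> = HashSet::new();
    atoms
        .iter()
        .filter(|atom| {
            let tags = tags_by_atom.get(atom.id.as_str()).unwrap_or(&empty);
            match rule {
                CustomRule::TagRequires { any_of, required } => {
                    any_of.iter().any(|t| tags.contains(t.as_str()))
                        && !required.iter().all(|t| tags.contains(t.as_str()))
                }
                CustomRule::RequireSource { tag_filter } => {
                    let in_scope = tag_filter.as_ref().map_or(true, |t| tags.contains(t.as_str()));
                    let has_source =
                        atom.source_url.as_deref().map_or(false, |u| !u.trim().is_empty());
                    in_scope && !has_source
                }
            }
        })
        .map(|a| a.id.clone())
        .take(MAX_FLAGGED)
        .collect()
}
```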

Weight semantics mirror built-in checks: 0 = informational (displayed
but not scored); > 0 contributes at that weight alongside built-ins. The
compute_health pipeline feeds each custom check's weight back into the
effective HealthConfig before aggregate_score runs, so user-controlled
score contributions survive round-trip through the scoring function.

JSON array under per-DB setting key 'custom_health_checks' (NOT registry
— see AGENTS.md § Multi-DB Gotchas). Two databases can have entirely
different rule sets.

AtomicCore methods: get_custom_health_checks(), set_custom_health_checks().
Routes: GET/PUT /api/health/custom-checks.

  * Regex patterns capped at 512 chars; size_limit + dfa_size_limit bound
    the compiled program and DFA cache, so a hostile pattern cannot
    exhaust memory.
  * Flagged atom lists bounded at MAX_FLAGGED = 500 per check.
  * Tag-requires uses a single SELECT to load all (atom_id, tag_id) pairs
    for the candidate set; bucket in HashMap — O(N) in result rows, not
    per-atom queries.
  * require_source has split branches for tagged/untagged to sidestep
    rusqlite's param_from_iter typing complexity.
  * Compute pipeline catches custom-check failures and logs
    (tracing::warn) — a bad rule never takes down the built-in report.
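
The single-SELECT bucketing described for tag-requires might look roughly like the following. This is a sketch under assumptions: the rows are passed in as a slice instead of coming from rusqlite, atom ids are plain integers, and `tag_requires` is a hypothetical stand-in for the real evaluator. Only `MAX_FLAGGED = 500` is taken from the commit message.

```rust
use std::collections::{HashMap, HashSet};

const MAX_FLAGGED: usize = 500;

/// `rows` holds every (atom_id, tag) pair for the candidate set, as one
/// query would return them. Returns atoms that carry at least one `any_of`
/// tag but lack one or more of the `required` tags.
fn tag_requires(rows: &[(u32, &str)], any_of: &[&str], required: &[&str]) -> Vec<u32> {
    // One pass over the result rows: bucket tags per atom.
    let mut by_atom: HashMap<u32, HashSet<&str>> = HashMap::new();
    for (atom, tag) in rows {
        by_atom.entry(*atom).or_default().insert(*tag);
    }
    let mut flagged: Vec<u32> = by_atom
        .iter()
        .filter(|(_, tags)| any_of.iter().any(|t| tags.contains(*t)))
        .filter(|(_, tags)| !required.iter().all(|t| tags.contains(*t)))
        .map(|(id, _)| *id)
        .collect();
    // Sort for a stable order, then enforce the per-check cap.
    flagged.sort_unstable();
    flagged.truncate(MAX_FLAGGED);
    flagged
}
```

The work is proportional to the number of rows returned, with no follow-up query per atom.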

Full CRUD in a single embedded component:
  * Add, delete, label, description, enable toggle, weight slider.
  * Rule-kind selector with contextual param editors:
    - RequireSource: single tag dropdown (optional)
    - TagRequires: two TagMultiPicker fields (any_of + required)
    - ContentRegex: pattern + invert checkbox
  * Weight annotation: '0 = informational' vs 'contributes N% alongside
    built-ins'.
  * Dirty tracking — Save button enabled only when changes exist.
  * Round-trip clean: weight clamped to [0,1]; blank labels default to
    'Unnamed check'.

V19 → V20 (is_locked column) was a naked ALTER TABLE that exploded in
tests that reset user_version to 14 on an already-migrated DB ('duplicate
column: is_locked'). Switched to the pattern already used by V16 → V17
for content_hash: let _ = ALTER, then bump PRAGMA user_version. Restores
test_v15_* migration integrity tests.

+5 cargo tests in health/custom.rs:
  * require_source_flags_atoms_without_url
  * tag_requires_flags_atoms_missing_required_tag
  * content_regex_with_invert_flags_atoms_not_matching
  * disabled_checks_are_skipped
  * zero_weight_produces_informational_result

Verification: 325/325 cargo (atomic-core) · 106/106 vitest · tsc clean.
Expands CustomRule from 3 shapes to 11, covering the rule patterns users
most commonly want in a KB-health system. Every variant follows the same
safety model as the shipped three: structural params, hard-coded Rust
evaluator, bounded output, no free-form query.

## Tier 1 (generalizable, high user value)

  RequireTag        any_of, tag_filter?
  ContentLength     min_words, max_words, tag_filter?
  CitationCount     min_citations, tag_filter?
  SourceDomainMatches domains, mode=allow|block, tag_filter?
  StaleAtom         tag, max_age_days

## Tier 2 (nice to have)

  ForbiddenTagCombo all_of
  MissingHeading    min_length_chars, tag_filter?
  TagCardinality    min, max, tag_filter?

## Evaluator patterns

  * Shared helpers to prevent lifetime/borrow gymnastics:
    - load_candidates_id_content(conn, tag_filter): returns Vec<(id, content)>
    - for_each_atom(conn, tag_filter, cols, closure): streaming visitor
    - push_flag(flagged, id, content): MAX_FLAGGED cap enforcement
  * CitationCount uses a byte-level scan (no regex alloc per atom) to count
    both link forms.
  * SourceDomainMatches parses host with a cheap tokenizer (split on ://,
    /, ?, #), normalises case, and supports suffix match so a listed
    domain also catches its subdomains. Atoms without a source_url are
    skipped; use RequireSource to police presence separately.
  * StaleAtom uses RFC3339 lexicographic comparison on the cutoff string,
    no SQLite date-function dependency.
  * ForbiddenTagCombo uses one aggregate SELECT with HAVING COUNT =
    all_of.len() to atom-match the combo.
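
As an illustration of the SourceDomainMatches host handling described above, here is a minimal sketch. The helper names `host_of` and `host_matches` appear later in this PR, but their real signatures are not shown, so the ones below are assumptions. Ports and userinfo are deliberately not handled, and `example.com` in the usage is a placeholder.

```rust
/// Extract a lowercased host from a URL with a cheap split (no URL crate).
fn host_of(url: &str) -> Option<String> {
    // Drop the scheme if one is present.
    let rest = match url.split_once("://") {
        Some((_, rest)) => rest,
        None => url,
    };
    // The host ends at the first path, query, or fragment delimiter.
    let host = rest
        .split(|c: char| c == '/' || c == '?' || c == '#')
        .next()?;
    if host.is_empty() {
        None
    } else {
        Some(host.to_ascii_lowercase())
    }
}

/// True when `host` equals `domain` or is a subdomain of it.
/// The leading dot keeps "notexample.com" from matching "example.com".
fn host_matches(host: &str, domain: &str) -> bool {
    let domain = domain.to_ascii_lowercase();
    host == domain || host.ends_with(&format!(".{domain}"))
}
```

With these definitions, `host_of("https://Docs.Example.com/a?b#c")` yields `docs.example.com`, which `host_matches` accepts for the listed domain `example.com`.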

## UI (CustomChecksPanel)

  * Rule picker now grouped into Tags / Sources / Content / Workflow
    optgroups for readability.
  * New reusable sub-components: NumberField, TagFilterRow,
    DomainListInput.
  * Each new rule kind has a bespoke editor with inline validation hints.
  * Rule creation uses sane defaults (min_citations: 1, max_age_days: 14,
    min_length_chars: 120, allowlist mode) so new checks are usable on
    save.

## Tests

+9 cargo tests in health/custom.rs:
  * require_tag_flags_untagged_atoms
  * content_length_flags_too_short_and_too_long
  * citation_count_flags_atoms_with_too_few_links
  * source_domain_allowlist_flags_off_list_domains
  * source_domain_blocklist_flags_on_list_domains
  * stale_atom_flags_old_tagged_atoms
  * forbidden_combo_flags_atoms_carrying_all_forbidden_tags
  * missing_heading_flags_long_atoms_without_heading
  * tag_cardinality_flags_over_and_under_tagged

Verification: 334/334 cargo (atomic-core), 106/106 vitest, tsc clean.
Adds the missing coverage layers identified in turn-end review:

- Core integration (crates/atomic-core/tests/health_custom_integration.rs):
  8 tests verifying compute_health actually wires custom checks in.
  Covers round-trip persistence, prefixed key emission, zero-weight
  informational semantics, positive-weight scoring propagation, disabled
  skip path, key collision with built-ins, multi-rule independence,
  and malformed-regex fault isolation.

- Server routes (crates/atomic-server/tests/api_health_custom_checks.rs):
  4 tests exercising GET/PUT /api/health/custom-checks through the real
  actix test service: default empty, round-trip, overwrite semantics,
  auth enforcement.

- Frontend (src/components/health/__tests__/CustomChecksPanel.test.tsx):
  7 tests for the panel: loads on mount, empty state, add/delete,
  weight clamping + label sanitization on save, save-failure toast,
  rule-kind switch rewrites shape.

Net: +19 tests, all green. 475/475 cargo, 113/113 vitest, tsc clean.
Users can now dry-run a custom rule before saving it and see how many
atoms it would flag. Fixes the trust problem with parameter tuning:
"is min_citations=3 right, or should it be 5?" now has a one-click
answer.

## Backend

  POST /api/health/custom-checks/preview
  body:  { rule: CustomRule }
  200:   { total_considered, flagged_count, sample: Atom[] }

- atomic_core::health::custom::preview_rule: evaluates one rule without
  persisting it. Sample capped at PREVIEW_SAMPLE=10.
- AtomicCore::preview_custom_health_check: facade. Errors (e.g.
  invalid regex) surface to the caller instead of being swallowed like
  in compute_health (report must always succeed; preview can fail).

## UI

- PreviewRow component per check card: Run preview button + inline
  result (N flagged of M total + first 10 titles) or error.
- Switching rule kind invalidates the preview so stale counts cannot
  mislead.

## Tests

+3 cargo integration tests:
  - preview_reports_counts_and_sample_without_persisting
  - preview_sample_capped_at_ten
  - preview_surfaces_malformed_regex_as_error

+2 server route tests:
  - preview_returns_counts (verifies counts + non-persistence)
  - preview_returns_error_for_malformed_regex

+2 vitest:
  - Run preview invokes command and shows counts
  - Preview error surfaces inline, no toast (inline is less intrusive
    while tuning than a toast that disappears).

Verification: 480/480 cargo, 115/115 vitest, tsc clean.
Review caught that HealthPage rendered its own h1 + Refresh button
AND HealthPanel rendered an h3 "Knowledge Health" + Refresh button
40px below. Two identical "Refresh health checks" accessible names,
two titles stacked.

HealthPanel now accepts `hideTitle` (defaults false so the dashboard
card is unchanged). HealthPage passes it and drops its own refresh
button — the panel already owns refresh, export, and help controls,
and remounting on refreshKey was redundant with the panel's
fetchHealth callback.

Net UX: one heading, one Refresh button, same functionality.

Test: HealthPage.refresh-button-present rewritten to assert the
hideTitle prop is threaded. 115/115 vitest, tsc clean.
Upstream landed V16 (autotag_description column) while this branch
also claimed V16 (knowledge health tables) and subsequent versions
through V20. Shift all health migrations up by one:

  V16 health_reports/health_fix_log      -> V17
  V17 atom_chunks.content_hash           -> V18
  V18 health_dismissals                  -> V19
  V19 tag_proposals                      -> V20
  V20 atoms.is_locked                    -> V21

Bump LATEST_VERSION constant to 21. cargo check -p atomic-core
-p atomic-server passes clean. No production DBs touched; only
dev machines that ran the pre-rebase branch need `rm databases/*.db`.
Four changes bundled here, none behavioral:

1. Postgres storage arms → explicit errors (storage/mod.rs)
   AGENTS.md states the Postgres backend "implements the same storage
   traits". Previously, 20 health-dispatch sync helpers returned
   Ok(empty)/Ok(0)/Ok(None) on the Postgres arm, which silently
   produced empty reports, dropped fix writes, and discarded
   dismissals when running against Postgres: a contract violation.
   They now return a DatabaseOperation error identifying the
   unimplemented subsystem. Three helpers (boilerplate hash counts,
   vec_chunks delete, content_hash backfill) keep their empty-result
   fallback because "no boilerplate detected" is a correct no-op for
   a storage without the content_hash column; those now carry a
   comment explaining why they deviate.

2. HealthPanel → components/health/ (src/components/)
   HealthWidget.tsx (849 lines, exported HealthPanel) lived under
   dashboard/widgets/ but was only consumed by the /health page;
   the dashboard itself uses a separate HealthSummaryCard. File
   moved to components/health/HealthPanel.tsx; HealthPage import +
   test mock updated. No behavior change.

3. health/mod.rs split (crates/atomic-core/src/health/)
   mod.rs was 1120 lines mixing types, score aggregation, and
   orchestration. Extracted:
     - types.rs (350L): all public health data types, config types,
       and tag-proposal enums.
     - score.rs (74L): CHECK_WEIGHTS + aggregate_score().
   mod.rs is now 736 lines focused on compute_health / run_fix /
   compute_single_check. Public API preserved via re-exports so
   `use crate::health::HealthReport` still works.

4. Root-level audit docs relocated
   docs/health-review-queue-audit.md →
     docs/plans/2026-05-01-health-review-queue-audit/audit.md
   docs/plans/frontend-health-audit.md →
     docs/plans/2026-05-01-frontend-health-audit/audit.md
   matches existing dated-plan-directory convention.

Verification:
  cargo check -p atomic-core -p atomic-server         clean
  cargo check -p atomic-core --features postgres      clean
  cargo test  -p atomic-core                          337 passed
  npx tsc --noEmit                                    clean
  npx vitest run                                      115 passed
Minor quality-of-life pass on the health modules. No behavior change.

 - custom.rs: std::iter::repeat(...).take(n) → std::iter::repeat_n(...)
                 in five SQL-placeholder builders.
 - llm_fixes.rs: Some(&[id.clone()]) → Some(std::slice::from_ref(&id))
                 at four log_fix call sites (avoids cloning the id
                 per fix record).
 - link_resolution.rs: starts_with + slice → strip_prefix; trailing `.md`
                 matcher → strip_suffix; rename unused `close` capture to
                 `_close` (still acts as a guard that `)` exists).
 - mod.rs run_fix: collapse `if A { if B { ... } }` → single `&&` guard.
 - types.rs: remove unused `FixTier::from_str` (no callers; avoided clash
                 with the std::str::FromStr trait name).
 - audit.rs: `#[allow(clippy::too_many_arguments)]` on `log_fix` (10
                 params, 25 call sites; refactor to builder deferred).
 - storage/sqlite/health.rs: drop two genuinely-dead let-bindings
                 (`_direct`, `_no_ext`) that were shadowed immediately.

Verification:
  cargo check  -p atomic-core -p atomic-server                 clean
  cargo check  -p atomic-core -p atomic-server --features postgres  clean
  cargo test   -p atomic-core -p atomic-server                 445 passed (337 core + 108 server)
  cargo clippy -p atomic-core --lib --no-deps                  only pre-existing warnings
  npx tsc --noEmit                                             clean
  npx vitest run                                               115 passed
bk-ty added 21 commits May 3, 2026 12:07
Organizational polish pass. Pure relocation — no behavior change,
public API unchanged.

health/mod.rs (732 → 69 lines)
  → compute.rs (522): compute_health, compute_single_check, store_report,
                      compute_link_check, apply_dismissals, BrokenLinkItem
  → run_fix.rs (173): run_fix orchestrator
  mod.rs now holds module decls, re-exports, and the tiny pair_key helper.

health/custom.rs (1347 lines) → health/custom/ directory
  → mod.rs (183):     orchestrator (run_all, preview_rule), evaluate
                      dispatch, finalize, module re-exports
  → types.rs (160):   CustomRule, DomainMatchMode, CustomCheck,
                      PreviewResult, RawOutcome, FlaggedAtom
  → helpers.rs (158): shared preview/word_count/count_citations/host_of/
                      host_matches/for_each_atom/push_flag/load_candidates
  → rules.rs (472):   all 11 eval_* fns (one per CustomRule variant)
  → tests.rs (412):   existing #[cfg(test)] block moved verbatim

HealthPanel.tsx (849 → 534 lines)
  → healthPanel.helpers.ts (328): pure helpers + constants + filter types
                                      (HealthReport, CHECK_LABELS,
                                      CHECK_DESCRIPTIONS, FIX_ACTION_LABELS,
                                      STATUS_COLORS, CHECK_ORDER,
                                      DEFAULT_FILTER, SeverityFilter/
                                      FixableFilter/SortOrder, FilterState,
                                      pendingActions, manualOnlyCategories,
                                      extractExamples, extractCount,
                                      reviewItems, getSeverityBadge,
                                      getVisibleChecks)
  → ScoreBar.tsx (15): small presentational score meter
  HealthPanel.tsx now holds state, effects, keyboard shortcuts, and the
  JSX for the main panel; no more pure-helper soup at the top.

Deferred (not in this pass):
  - llm_fixes.rs (1255 lines): split by action kind (merge/contradictions/
    overlap/tagging/tag-tree/boilerplate/links). Cohesive theme, already
    banner-organized, fine-grained public API; risky to split in a polish
    commit. Candidate for a v2 pass.
  - Crate-root audit: the only health-PR addition at atomic-core/src/
    root is boilerplate.rs, which is a cross-cutting content-fingerprinting
    utility shared by embedding, chunks, storage, and health. Crate root
    is the correct location.

Verification:
  cargo check  -p atomic-core -p atomic-server                       clean
  cargo check  -p atomic-core -p atomic-server --features postgres   clean
  cargo test   -p atomic-core -p atomic-server                       483 passed
  npx tsc --noEmit                                                   clean
  npx vitest run                                                     115 / 115
Three health LLM fixes had settings keys registered in DEFAULT_SETTINGS
but still hardcoded their prompts inline. Thread the keys through so users
can tune them per-DB (empty = fall back to the builtin default):

  - merge_duplicate_pair         → health.merge_duplicates_prompt
  - contradiction_summary        → health.contradiction_detection_prompt
  - strip_boilerplate_atom       → health.strip_boilerplate_prompt  (new)

Design: split each prompt into an *instruction* (user-tunable) and a
*data block* (hardcoded format with interpolated atom content). Override
replaces only the instruction; placeholder names stay internal so a
misspelled setting can't break the fix. Shared resolver handles the
empty-string-means-default case.
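
The instruction/data-block split described above could be sketched as follows. Everything named here is hypothetical: the default instruction text, the `resolve_instruction` and `build_merge_prompt` names, and the data-block layout are illustrative, and treating whitespace-only overrides as empty is an assumption.

```rust
/// Hypothetical builtin instruction; the real default text is not shown here.
const DEFAULT_MERGE_INSTRUCTION: &str = "Merge the two atoms into one.";

/// An absent, empty, or whitespace-only override falls back to the default.
fn resolve_instruction<'a>(override_value: Option<&'a str>, default: &'a str) -> &'a str {
    match override_value.map(str::trim) {
        Some(s) if !s.is_empty() => s,
        _ => default,
    }
}

/// The data block format stays hardcoded; only the instruction is replaceable,
/// so a user-supplied string never has to contain placeholder names.
fn build_merge_prompt(instruction: &str, atom_a: &str, atom_b: &str) -> String {
    format!("{instruction}\n\n--- ATOM A ---\n{atom_a}\n\n--- ATOM B ---\n{atom_b}\n")
}
```

Because the atom content is interpolated by the code and never by the user's text, a typo in the setting can change the wording of the instruction but cannot drop the data from the prompt.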

Also remove three keys that were registered without an implementation:

  - health.split_long_atom_prompt
  - health.enrich_stub_atom_prompt
  - health.add_structure_prompt

Those referenced fixes that never shipped; leaving them in the settings
table pollutes the settings UI with options that silently do nothing.
Re-add when the corresponding fixes land.

Verification:
  cargo check  -p atomic-core -p atomic-server                       clean
  cargo check  -p atomic-core -p atomic-server --features postgres   clean
  cargo test   -p atomic-core                                        337 passed
  cargo test   -p atomic-server                                      63 passed
  cargo test   -p atomic-core --tests                                397 passed
The backend wired three health.*_prompt settings keys in the previous
commit, but the Settings > Prompts tab only rendered the 5 existing
overrides (wiki gen/update, briefing, chat, tagging). Add the three new
ones under a "Health Check Prompts" section so users can actually tune
them without editing the SQLite settings row by hand:

  - Merge Duplicates Prompt         (health.merge_duplicates_prompt)
  - Contradiction Detection Prompt  (health.contradiction_detection_prompt)
  - Strip Boilerplate Prompt        (health.strip_boilerplate_prompt)

Each follows the same shape as the other prompt fields:
  - useState + settings-load hook
  - textarea with placeholder showing the builtin default
  - autoSave on blur
  - Reset to default button when value non-empty
  - OverrideControls for per-DB override visibility

Section is visually separated with a top border + subheading so the
health group is legible from the global (wiki/briefing/chat/tagging)
group above it.

Verified: npx tsc --noEmit clean.
Move hardcoded health-check detection thresholds from inline SQL/Rust
constants into a `HealthThresholds` struct persisted under the existing
per-DB `health_config` setting. Every threshold has a baked-in default
via `Default` so existing DBs continue to work without migration.

Thresholds now tunable:
- boilerplate_pollution: similarity (0.99), min_clones (2)
- contradiction_detection: similarity_min (0.80), similarity_max (0.92),
  shared_tags_min (1)
- content_overlap: similarity_min (0.55), similarity_max (0.85),
  shared_tags_min (2)
- content_quality: short_chars (100), long_chars (15000)
- wiki_coverage: min_atoms_per_tag (5)
- tag_health: single_atom_tag_threshold (3)
- semantic_graph_freshness: warning_window (20 atoms)

Changes:
- types.rs: add HealthThresholds + per-field serde defaults
- storage/sqlite/health.rs: thread thresholds into SQL via bind params
- checks.rs: semantic_graph_freshness / tag_health / boilerplate_pollution
  now take &HealthThresholds; run_all accepts them too
- compute.rs + run_fix.rs: resolve config once, forward thresholds
- HealthConfigTab.tsx: new Detection Thresholds panel with per-field
  number input, placeholder default, and per-field reset button

No API surface changes: HealthConfig round-trips the extra field
through the existing get_health_config / set_health_config endpoints.
Save button previously rendered above the thresholds panel, so users
editing thresholds had to scroll up to save. Move the Reset/Save row
into a sticky footer below Thresholds + WikiExclusion so it is always
reachable from the currently-edited field.
Reject payloads where the math would misbehave:
- similarities outside [0,1] or NaN
- inverted contradiction window (min >= max)
- content_overlap min > max (min == max still allowed; inclusive bounds)
- negative integer counts
- wiki_min_atoms_per_tag < 1 (would mark every tag wiki-eligible)
- content_quality_short_chars >= long_chars (every atom flagged as both)

Rules live on HealthThresholds::validate() and return a Vec<String> of
human-readable errors. set_health_config() concatenates them into a
single AtomicCoreError::Validation, which the server already maps to
HTTP 400 with a structured { error } body.
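
A reduced sketch of the collect-all-errors validation style described above. The struct below is a hypothetical subset: `content_quality_short_chars` and `wiki_min_atoms_per_tag` are named in this commit, while the other field names and all error strings are assumptions. The real `HealthThresholds::validate()` covers more fields.

```rust
/// Hypothetical subset of the threshold fields, for illustration only.
struct ThresholdsSketch {
    contradiction_similarity_min: f64,
    contradiction_similarity_max: f64,
    content_quality_short_chars: i64,
    content_quality_long_chars: i64,
    wiki_min_atoms_per_tag: i64,
}

impl ThresholdsSketch {
    /// Collects every violation instead of stopping at the first one.
    fn validate(&self) -> Vec<String> {
        let mut errors = Vec::new();
        let sims = [
            ("contradiction_similarity_min", self.contradiction_similarity_min),
            ("contradiction_similarity_max", self.contradiction_similarity_max),
        ];
        for (name, value) in sims {
            // NaN fails this test because every comparison with NaN is false.
            if !(0.0..=1.0).contains(&value) {
                errors.push(format!("{name} must be within [0, 1]"));
            }
        }
        if self.contradiction_similarity_min >= self.contradiction_similarity_max {
            errors.push("contradiction window is inverted (min >= max)".to_string());
        }
        if self.content_quality_short_chars >= self.content_quality_long_chars {
            errors.push("content_quality_short_chars must be below long_chars".to_string());
        }
        if self.wiki_min_atoms_per_tag < 1 {
            errors.push("wiki_min_atoms_per_tag must be at least 1".to_string());
        }
        errors
    }
}
```

Returning a list lets the caller join all messages into one validation error, so a user fixing a payload sees every problem in a single response.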

Adds:
- 7 HealthThresholds::validate() unit tests
- 5 PUT /api/health/config integration tests (defaults, round-trip,
  similarity-out-of-range, inverted window, partial payload forward-compat)
Hermetic integration test: seed 3 atoms of lengths 50 / 150 / 50_000,
persist HealthConfig with strict (short<200, long>40_000) thresholds
and assert content_quality flags >= 3, then relax to (short<20,
long>100_000) and assert flagged count drops below the strict count.

Proves the thresholds reach the SQL via bind params rather than being
silently ignored.
…holds"

Previous sticky-footer Save looked detached and competed visually with
the two adjacent section-scoped saves ("Save exclusions" in the wiki
panel, "Save changes" in custom checks). Move it back into the Check
Weights header row next to the total-effective-weight line, drop the
sticky wrapper, and relabel to "Save checks & thresholds" so its scope
is obvious relative to the other two saves.
Drops the explicit Save button from the Check Weights / Thresholds
section and replaces it with a debounced autosave + inline status pill
(Autosaves / Saving… / Saved / Save failed). Matches the pattern used in
other settings panels. The two sibling panels (WikiExclusion,
CustomChecks) keep their own explicit saves because they have
preview/apply workflows.

Mechanics:
- 600ms debounce on any draft change (weight, enabled, threshold value)
- Dedupes via JSON-string comparison with last successful payload
- Skips autosave for the initial load draft
- Errors surface both in the pill and as a toast with retry
- "Saved" state fades back to idle after 1.5s to reduce visual noise
- Reset button still works: it writes defaults into draft, debounce then
  autosaves
Root cause: HealthConfigTab is mounted conditionally inside SettingsModal
(`activeTab === 'health' && <HealthConfigTab />`). Switching to another
settings tab unmounts the component, which cancels the 600ms autosave
debounce timer before it ever fires, so any edit made within 600ms of
a tab switch (or modal close) was silently discarded.

Fix: add a mount-time cleanup effect that on unmount serializes the
current draft, compares it against the last saved payload, and fires a
best-effort PUT /api/health/config if different. Dedupe keeps the PUT
from re-sending an already-saved payload. Errors still reach the toast
store; we skip the status pill because the component is going away.

Also: strip the debug console.log added earlier.
Pill was only rendered in the check-weights footer row, directly above
the ThresholdsPanel. Users editing fields further down never saw the
"Saving… / Saved" feedback; it was scrolled off-screen. Surface a
second pill in the thresholds header so feedback stays visible no
matter which section the user is editing.

Also temporarily log autosave ticks to console.debug so we can confirm
whether threshold edits are firing PUT requests in the field.
Two pills (one in check-weights footer, one in thresholds header) were
redundant and visually confusing: same state rendered twice.

Collapse to a single sticky header at the top of the panel with:
  - one SaveStatusPill covering both check-weight and threshold edits
  - one Reset-to-defaults button that reverts both sections
  - a short explainer that the form autosaves

Header sticks to the modal scroll container so feedback stays visible
regardless of which section is being edited. Removes the saveStatus prop
from ThresholdsPanel and demotes the "Total effective weight" line
to a single caption row (no longer competing with pill/button).
Three bugs in the thresholds panel autosave, all fixed in this commit:

1. Phantom "saved" pill on every mount.
   lastSavedRef was seeded with JSON.stringify(server response). But the
   server serializes HealthThresholds fields in alphabetical order while
   the client fromDraft() emits them in DEFAULT_THRESHOLDS declaration
   order. Different key order → different JSON string → dedupe thought
   the form had unsaved edits on mount, fired an immediate autosave,
   and showed "Saving… → Saved" before the user touched anything.
   Fix: seed lastSavedRef with the result of running the SAME fromDraft
   pipeline autosave will use, so round-trip identity is guaranteed.

2. "Saving…" pill could hang forever.
   If the invoke() promise never resolved (network dropped, server hung,
   WS-proxy stalled), the pill was stuck on "Saving…" and the user had
   no signal the save had failed. Add a 15s watchdog timer that flips
   the pill to an error state; invoke resolution clears it.

3. Overlapping saves could commit stale values.
   A fast editor could trigger a new autosave while the previous PUT
   was still in flight; both PUTs race and whichever lands second wins,
   which might be the older payload. Serialize with an in-flight flag
   and a one-slot trailing queue: during a save, further edits overwrite
   the trailing slot, and the trailing payload fires once the in-flight
   save settles.
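
The in-flight flag plus one-slot trailing queue from item 3 is a small state machine. The component itself lives in the frontend; the sketch below expresses the same idea in Rust purely for illustration, with hypothetical names (`SaveQueue`, `submit`, `settled`) and the payload modeled as a `String`.

```rust
/// Serializes saves: at most one in flight, at most one queued behind it.
#[derive(Default)]
struct SaveQueue {
    in_flight: bool,
    trailing: Option<String>,
}

impl SaveQueue {
    /// Called on each debounced edit. Returns the payload to send now, if any.
    fn submit(&mut self, payload: String) -> Option<String> {
        if self.in_flight {
            // Newer edits overwrite the single trailing slot.
            self.trailing = Some(payload);
            None
        } else {
            self.in_flight = true;
            Some(payload)
        }
    }

    /// Called when the in-flight save settles (success or failure).
    /// Returns the queued payload to send next, if one is waiting.
    fn settled(&mut self) -> Option<String> {
        match self.trailing.take() {
            // Stay in flight and send the most recent payload.
            Some(next) => Some(next),
            None => {
                self.in_flight = false;
                None
            }
        }
    }
}
```

Since only the latest edit survives in the trailing slot, intermediate payloads are never sent and the last write is always the newest draft.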

Also: strip the temporary debug console.log.

Verified end-to-end with curl against a local atomic-server:
  PUT /api/health/config with boilerplate_similarity=0.77 → 204
  GET /api/health/config → boilerplate_similarity: 0.77 (persisted)
So the backend is correct. The previous "value reverted on reload"
symptom was almost certainly the phantom mount-save above clobbering
the user's edit with stale values.
A prior rebase renumbered migrations and let some DBs tick user_version
past the V15→V16 step without ever running the ALTER that adds
tags.autotag_description. Any query joining that column then fails at
runtime with "no such column: t.autotag_description", seen in the
wild when previewing health fixes on an affected DB.

Fix: after the versioned migration pass, run an idempotent column
check that restores any expected column that is missing, regardless
of the recorded schema version. Table-driven so future columns can be
added with one line. Logs a warning when it has to heal, which makes
migration drift visible in sidecar stderr.

Already repaired the user DB out-of-band via a direct ALTER; this
commit prevents the same drift from hurting other users on reopen.
The sticky header with autosave pill + reset button was obstructing
the checks table and added no information the user needed on every
edit. Autosave is reliable now, so the happy-path pill was noise.

Changes:
- Remove sticky header, pill, saveStatus state, and saved->idle fade effect.
- Move "Reset to defaults" to a muted link at the bottom of the panel.
- Keep watchdog + error paths; failures now surface via toast only
  (already the retry channel; no UX regression).
- Drop SaveStatusPill, SaveStatus type, and unused Check/AlertCircle imports.

Net: cleaner config tab, same persistence guarantees.
The contradiction detector was surfacing pairs whose embeddings were
dominated by shared template text (same runtime-env section, same
health-endpoint table, same outage-alert table, different app). The
92% embedding similarity was real; the "contradiction" was not. User
saw PITVR vs Roster Download flagged as a contradiction candidate
when they are simply two atoms built from the same template.

Two layered filters, both applied in
storage/sqlite/health.rs::collect_raw_for_health:

  (1) Boilerplate-exclusion. If either atom in a candidate pair is
      already in raw.boilerplate_affected_atoms (computed earlier in
      the same pass), the pair is dropped. Boilerplate-polluted atoms
      cannot produce trustworthy semantic signals until their unique
      content is stripped + re-embedded.

  (2) Token-Jaccard cap. Real contradictions express *different*
      claims, so their token sets differ. Pairs whose unique-token
      overlap >= thresholds.contradiction_max_content_jaccard are
      treated as template clones and dropped. Default 0.70, tunable
      via Settings → Health → Thresholds.

Implementation details:
- New threshold field HealthThresholds.contradiction_max_content_jaccard
  (serde default 0.70, validated to [0.0, 1.0] with the other sims).
- New helper content_token_jaccard(a, b): lowercased alphanumeric
  runs of length >= 3, |A ∩ B| / |A ∪ B|.
- SQL LIMIT raised from 20 to 200 so the post-filter still has enough
  candidates to fill the 20-pair result cap. raw.contradiction_pairs_checked
  still reports the full pre-filter edge count so the UI can show "of
  N candidates".
- UI threshold exposed in HealthConfigTab under Contradiction detection.
- 7 unit tests covering identity, disjoint, punctuation, short-token
  drop, real-world template-clone (PITVR / Roster), and real-world
  differing-claim cases.
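
A minimal sketch of the token-Jaccard measure as described (lowercased alphanumeric runs of length >= 3, intersection over union). The function name follows the commit message; the `content_tokens` helper and the choice to return 0.0 when neither text has any tokens are assumptions.

```rust
use std::collections::HashSet;

/// Lowercased alphanumeric runs of at least three characters.
fn content_tokens(text: &str) -> HashSet<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|t| t.chars().count() >= 3)
        .map(|t| t.to_lowercase())
        .collect()
}

/// |A ∩ B| / |A ∪ B| over the two token sets.
fn content_token_jaccard(a: &str, b: &str) -> f64 {
    let (ta, tb) = (content_tokens(a), content_tokens(b));
    let union = ta.union(&tb).count();
    if union == 0 {
        // Assumption: two token-free texts are treated as non-overlapping.
        return 0.0;
    }
    ta.intersection(&tb).count() as f64 / union as f64
}
```

A pair would then be dropped as a template clone when this value is at or above `contradiction_max_content_jaccard` (default 0.70).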
…lters

Real-world PITVR vs Roster Download still surfaced after the first two
filters shipped. Root cause analysis:

- Content token-Jaccard was 0.46 (default cap 0.70), so it passes.
- Neither atom was in boilerplate_affected_atoms (user set
  boilerplate_similarity=0.87, boilerplate_min_clones=2; PITVR has
  1 edge at that similarity, one short of the clone count).
- Embedding sim 0.92, well inside the contradiction window.

Two additional filters:

  (3) Title-token overlap. A pair that shares zero *informative* title
      tokens is almost certainly about different entities, even when
      their vectors land close (template / boilerplate pollution).
      "PITVR" ∩ "Roster Download" = ∅. Robust where content-level
      Jaccard is noisy, because H1/title text is short and template-
      free. Stopwords ("the", "and", "for", ...) do not count; tokens
      must be len >= 3.

  (4) Boilerplate-zone ceiling. If the user says
      "thresholds.boilerplate_similarity" is the template-clone line,
      honor that as an upper bound for contradictions. Real
      contradictions live below the template plateau; anything at or
      above it is boilerplate noise regardless of tag overlap. This
      directly catches the PITVR pair (sim 0.92 >= 0.87).

Implementation:
- titles_share_token(a, b) helper + 5 unit tests (distinct-entity,
  same-entity, stopwords, punctuation, empty input).
- Relaxed the template-clone Jaccard assertion to 0.40 to reflect
  the measured value on real runbook content (0.47).

UI: no new setting. Filter (3) is always-on heuristic; filter (4)
uses the existing boilerplate_similarity threshold, so adjusting that
one knob tightens both checks together.
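
Filters (3) and (4) can be sketched together. The stopword list here is an illustrative stand-in (only "the", "and", "for" are named in the commit), `title_tokens` and `below_boilerplate_ceiling` are hypothetical helper names, and returning false for empty titles is an assumption about behavior the commit does not spell out.

```rust
use std::collections::HashSet;

// Illustrative stopword list; the real list is not shown in this PR.
const STOPWORDS: &[&str] = &["the", "and", "for", "with", "from"];

/// Informative title tokens: lowercased, length >= 3, not a stopword.
fn title_tokens(title: &str) -> HashSet<String> {
    title
        .split(|c: char| !c.is_alphanumeric())
        .map(|t| t.to_lowercase())
        .filter(|t| t.chars().count() >= 3 && !STOPWORDS.contains(&t.as_str()))
        .collect()
}

/// Filter (3): keep a pair only if the titles share an informative token.
fn titles_share_token(a: &str, b: &str) -> bool {
    let ta = title_tokens(a);
    title_tokens(b).iter().any(|t| ta.contains(t))
}

/// Filter (4): pairs at or above the boilerplate line are dropped.
fn below_boilerplate_ceiling(similarity: f64, boilerplate_similarity: f64) -> bool {
    similarity < boilerplate_similarity
}
```

For the pair from the report, "PITVR" and "Roster Download" share no token, and a similarity of 0.92 is not below a ceiling of 0.87, so either filter alone would drop it.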
Two user-facing bugs plus one data-loss hazard:

1. **Auto-fix (LLM) always concluded "no matches"** and quietly stripped
   every broken link. Root cause: auto_resolve_broken_link passed the
   entire raw link string ("[Glossary](glossary.md)") to the LIKE-based
   fuzzy search, which only matches on titles/source_url. It never hit
   anything. Then the no-candidate branch silently removed the link.
   Data loss, surfaced only via a toast that said "Skipped".

   Fix:
   - New extract_link_target() helper pulls the href or wikilink name.
   - suggest_link_targets() is called on the extracted target; on empty
     result it falls back to the display text (H1 often matches even
     when the path is stale).
   - The no-candidate branch now **skips** instead of removing. Users
     saw "auto-fix concluded with no matches" and found links had been
     destroyed. The explicit Remove button remains the escape hatch.

2. **"Link to..." picker often showed "No matches"** on links like
   `../processes/foo-bar.md#section`. Backend source_urls don't carry
   fragments and the LIKE search doesn't know about slug<->title shape.

   Fix: seed the picker query with a filename stem, not the raw target:
   - strip `#fragment` / `?query`
   - drop directory path + `.md`/`.markdown`/`.mdx` extension
   - replace `-` / `_` with spaces

   `../processes/custom-application-stewardship.md#x`
     → `custom application stewardship`

   Users can still type anything in the input; this just gives the
   picker a useful default.

3. **Sibling-file markdown links were wrongly flagged broken.** A bare
   `glossary.md` written next to `onboarding.md` resolved only to
   `<vault>/glossary.md`, missing the real sibling target at
   `<vault>/references/glossary.md`. Obsidian's default is current
   directory first, vault root as fallback; we now generate both
   candidates. The resolver also strips `#fragment` before candidate
   generation so `[x](./foo.md#sec)` resolves to `./foo.md`.
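
The three fixes above can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: only the `extract_link_target` name appears in the commit text, while `picker_seed`, `normalize`, `candidate_paths`, and all signatures are hypothetical. The picker seeding is shown in Rust for consistency with atomic-core, even though it may well live in the frontend.

```rust
use std::path::{Component, Path, PathBuf};

/// Fix 1: pull the target out of a raw link string.
/// `[[Name|alias]]` yields `Name`; `[text](href)` yields `href`.
fn extract_link_target(raw: &str) -> Option<&str> {
    let raw = raw.trim();
    if let Some(inner) = raw.strip_prefix("[[").and_then(|r| r.strip_suffix("]]")) {
        let name = inner.split('|').next().unwrap_or(inner).trim();
        return (!name.is_empty()).then_some(name);
    }
    // Markdown link: the href sits between the last "](" and the final ")".
    let (_, href) = raw.strip_suffix(')')?.rsplit_once("](")?;
    let href = href.trim();
    (!href.is_empty()).then_some(href)
}

/// Fix 2: turn a raw target into a filename-stem search seed.
fn picker_seed(target: &str) -> String {
    // Strip `#fragment` / `?query`, then the directory path.
    let path = target.split(|c: char| c == '#' || c == '?').next().unwrap_or("");
    let file = path.rsplit('/').next().unwrap_or(path);
    // Drop a markdown extension if present.
    let stem = [".markdown", ".mdx", ".md"]
        .iter()
        .find_map(|ext| file.strip_suffix(*ext))
        .unwrap_or(file);
    // Replace `-` / `_` with spaces.
    stem.replace(|c: char| c == '-' || c == '_', " ")
}

/// Collapse `.` and `..` lexically; a leading `..` is clamped at the vault root.
fn normalize(path: &Path) -> PathBuf {
    let mut out = PathBuf::new();
    for comp in path.components() {
        match comp {
            Component::CurDir => {}
            Component::ParentDir => {
                out.pop();
            }
            other => out.push(other.as_os_str()),
        }
    }
    out
}

/// Fix 3: vault-relative candidates for a markdown href, in priority order:
/// the source file's directory first, then the vault root.
fn candidate_paths(source_rel: &str, href: &str) -> Vec<PathBuf> {
    // Strip `#fragment` before generating candidates.
    let href = href.split('#').next().unwrap_or("");
    if href.is_empty() {
        return Vec::new();
    }
    let source_dir = Path::new(source_rel).parent().unwrap_or(Path::new(""));
    let mut out = vec![normalize(&source_dir.join(href))];
    let from_root = normalize(Path::new(href));
    if !out.contains(&from_root) {
        out.push(from_root);
    }
    out
}
```

For a source at `references/onboarding.md` and the href `glossary.md`, `candidate_paths` yields `references/glossary.md` followed by `glossary.md`, which is the current-directory-first order described in item 3.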

Tests:
- 2 new link-resolution unit tests (bare-href dir-relative, fragment
  stripping). 17 total link_resolution tests pass.
- 358 atomic-core lib tests all green.

Rebuilt debug sidecar; restart the Tauri dev app to pick up changes.
Health dashboard rendered "Contradictions 4 → red" next to "0 atom
pairs on the same topic" — a row that argues with itself. Same class
of bug for content_overlap and boilerplate_pollution: apply_dismissals
updated the pair/atom arrays + user-facing count fields but left
HealthCheckResult.score frozen at whatever it was pre-dismissal. The
summary row reads both fields, so once all pairs were dismissed the
score stayed in warning/red territory while the description correctly
said zero.

Fix: after filtering, mirror the per-check score formula from
checks.rs:

  content_overlap        score = max(0, 100 - 8*new_count)
  contradiction_detection score = max(0, 100 - 8*new_count)
  boilerplate_pollution  score = max(50, 100 - 3*new_count)

When the post-dismissal count is zero, also flip status back to "ok"
alongside the existing requires_review=false. This keeps the row
consistent with a fresh compute_health() that started with zero pairs.
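
As a rough sketch, the post-dismissal recompute could look like this. The formulas are the ones listed above; the function names, the string-keyed dispatch, and the integer score type are assumptions made for illustration.

```rust
/// Recompute a check's score from its post-dismissal count.
/// Returns None for checks this sketch does not rescore.
fn rescore_after_dismissal(check: &str, new_count: u32) -> Option<u32> {
    match check {
        // score = max(0, 100 - 8 * new_count)
        "content_overlap" | "contradiction_detection" => {
            Some(100u32.saturating_sub(new_count.saturating_mul(8)))
        }
        // score = max(50, 100 - 3 * new_count)
        "boilerplate_pollution" => {
            Some(100u32.saturating_sub(new_count.saturating_mul(3)).max(50))
        }
        _ => None,
    }
}

/// Flip the status back to "ok" once nothing is left to review;
/// otherwise keep whatever status the check already had.
fn status_after_dismissal<'a>(new_count: u32, previous: &'a str) -> &'a str {
    if new_count == 0 {
        "ok"
    } else {
        previous
    }
}
```

Saturating arithmetic stands in for the `max(0, ...)` clamp, so large counts bottom out at 0 (or 50 for boilerplate_pollution) instead of underflowing.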

New regression test asserts the contradiction case: a result that
starts at score 4 with 2 pairs ends at score 100, status ok, once both
pairs are dismissed. Co-exists with the existing apply_dismissals
tests (6 pass, full lib suite 358 pass).

Sidecar rebuilt via npm run build:server so the Tauri dev app picks up
the new binary on next launch (Cmd-Q + reopen).
Modal counts bug
================
Section header, tab chip, summary card, and visible row list all read
different sources. Dismissing an item only updated the section-local
'removed' Set; tabs[].count and the summary card kept reading the
unchanged report prop. 18 divergent display sites per dashboard.

Fix: single source of truth. Every section's onResolved now flows
through a resolveItem(check, id) reducer in HealthReviewModal that
prunes the affected item from report.checks[check].data and decrements
the matching count field. Section-local Sets are gone. The callback
signature changed from () => void to (id: string) => void across
BoilerplateSection, ContradictionSection, ContentQualitySection,
TagHealthSection, and BrokenLinksSection.

Broken-link resolver scope
==========================
Markdown links like [x](glossary.md) in a source inside references/
were flagged broken even when a sibling atom existed under shared/.
Wikilinks already did a vault-wide LIKE search; markdown links did not.

Add markdown_stem_fallback(href) to link_resolution.rs and wire it into
both compute.rs (detection) and fixes.rs (auto-fix) so a missed exact
match falls back to find_atom_by_wikilink_name_sync with the filename
stem. Mirrors the wikilink path and reuses tested infra.
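
The fallback order can be sketched as below. Only the `markdown_stem_fallback(href)` name comes from the commit text; its return type, the generic `resolve_markdown_link` wrapper, and the two lookup closures (stand-ins for the exact match and for find_atom_by_wikilink_name_sync) are assumptions.

```rust
/// Filename stem of a markdown href, or None when nothing is left to search for.
/// Assumed shape; the real helper in link_resolution.rs may differ.
fn markdown_stem_fallback(href: &str) -> Option<String> {
    let path = href.split(|c: char| c == '#' || c == '?').next()?;
    let file = path.rsplit('/').next()?;
    // Drop the last extension, if any.
    let stem = file.rsplit_once('.').map_or(file, |(s, _)| s);
    (!stem.is_empty()).then(|| stem.to_string())
}

/// Try the exact match first; on a miss, retry a vault-wide name lookup
/// with the filename stem.
fn resolve_markdown_link<Id>(
    href: &str,
    exact: impl Fn(&str) -> Option<Id>,
    by_name: impl Fn(&str) -> Option<Id>,
) -> Option<Id> {
    exact(href).or_else(|| markdown_stem_fallback(href).and_then(|stem| by_name(&stem)))
}
```

Passing the lookups in as closures keeps the sketch independent of the database layer; detection and auto-fix would call the same wrapper with their own lookups.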

Tests
=====
- 3 new unit tests for markdown_stem_fallback.
- 2 new integration tests (broken_link_scope_tests.rs): sibling-subdir
  resolve succeeds; truly missing target still flagged.
- All 424 atomic-core tests pass; 10 HealthReviewModal tests pass;
  tsc --noEmit clean.
@bk-ty bk-ty closed this May 5, 2026