fix: batch embedding generation + dev tooling (SLOPSCAN 9000) #287

aaronsb · 2026-02-04T02:25:33Z

Summary

Fix silent embedding loss in batch operations — batch_service.py checked result.get("success") but the embedding worker never returns that key, silently discarding every generated embedding
Fix dead test mock targets — batch route constructed BatchService directly instead of using get_batch_service(), so test patches had no effect
Add graceful shutdown — JobQueue.shutdown() on the ABC with PostgreSQLJobQueue implementation that stops the ThreadPoolExecutor and closes the connection pool; stream handler detachment prevents I/O errors during test teardown
New: TODO inventory scanner — make todos scans for TODO/FIXME/HACK/XXX with git blame age, JSON output, and filtering
New: SLOPSCAN 9000 — make slopscan detects agent antipatterns (truncated output, stubs, attribution noise, dummy creds, rubber-stamp approvals)
Add --json to docstring coverage — machine-readable output for pipeline use with jq, claude-code, etc.

Bug details

The embedding worker (embedding_worker.py) returns:

{"embedding": [...], "model": "...", "provider": "...", "dimensions": 384, "tokens": N}

But batch_service.py checked result.get("success") — a key that never existed. This meant every batch embedding generation silently reported failure while the embedding was actually computed and thrown away.

The route at graph.py:76 constructed BatchService(client) inline, so tests patching get_batch_service() were patching dead code. The mock fixture also used the wrong return format ({"success": True, ...}), which mirrored the production bug — tests passed even though production was broken.

Code review findings addressed

Finding	Severity	Resolution
`shutdown()` not on `JobQueue` ABC	Important	Added with no-op default
Unused `timeout` param	Nit	Removed
Missing `unknown` status in JSON	Important	Added to staleness output
Unused `asdict` import	Nit	Removed
Mixed type annotation style	Nit	Consistent `typing` imports
Blame range assumes sorted items	Nit	Explicit sort added

Test plan

make test — 978 passed, 23 skipped, 0 failures
make todos — 42 markers found
make slopscan — agent antipattern scan runs clean
docstring_coverage.py --json | jq .staleness — includes all status categories
Verify batch ingestion generates embeddings in production

…eanup The embedding worker returns {"embedding": [...], "model": ...} but batch_service checked result.get("success"), which was always None — silently discarding every generated embedding. Fix both call sites (concept embeddings and ontology embeddings) to check for the actual "embedding" key. The batch route constructed BatchService directly instead of using the get_batch_service() factory, so test mocks patching that factory had no effect. Fix route to use the factory, and align mock fixtures to match the real embedding worker return contract. Add shutdown() to the JobQueue ABC and implement it on PostgreSQLJobQueue to cleanly stop the ThreadPoolExecutor and close the connection pool. Call it during app shutdown and detach stream handlers afterward so draining worker threads don't crash on closed stderr during test teardown. Fix stale coverage paths in pytest.ini (src -> api/app).

…ctor New tool at scripts/development/lint/todo_inventory.py scans the project for TODO/FIXME/HACK/XXX markers with optional git blame age, JSON output, and marker filtering. The --slop flag enables SLOPSCAN 9000: detection of agent antipatterns including truncated output (...rest of the code), placeholder stubs, AI attribution comments, dummy credentials, and rubber-stamp approvals. Add --json output to docstring_coverage.py for pipeline use with jq and other tools. JSON mode auto-includes staleness data with all status categories (current, stale, unverified, unknown). New Makefile targets: todos, todos-verbose, todos-age, slopscan.

Two-tier severity for staged code files only: - Hard fail (blocks commit): TRUNCATED, STUB - Soft warn (prints, allows): ATTRIB, DUMMY, LGTM Install: git config core.hooksPath .githooks

aaronsb added 3 commits February 3, 2026 20:25

feat: add pre-commit hook for SLOPSCAN 9000

67e226e

Two-tier severity for staged code files only: - Hard fail (blocks commit): TRUNCATED, STUB - Soft warn (prints, allows): ATTRIB, DUMMY, LGTM Install: git config core.hooksPath .githooks

aaronsb merged commit 2ab6fc6 into main Feb 4, 2026
3 checks passed

aaronsb deleted the fix/batch-test-embedding-failure branch February 4, 2026 02:30

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: batch embedding generation + dev tooling (SLOPSCAN 9000) #287

fix: batch embedding generation + dev tooling (SLOPSCAN 9000) #287

Uh oh!

aaronsb commented Feb 4, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

fix: batch embedding generation + dev tooling (SLOPSCAN 9000) #287

fix: batch embedding generation + dev tooling (SLOPSCAN 9000) #287

Uh oh!

Conversation

aaronsb commented Feb 4, 2026

Summary

Bug details

Code review findings addressed

Test plan

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants