node -e "const m=require('./backend/agents/manager.cjs'); m.autoRoute({brief:'semantic search over workspace about TruthLens'}).then(console.log)"
node -e "const m=require('./backend/agents/manager.cjs'); m.run('jina',{query:'truthlens'}).then(console.log)"
node -e "const m=require('./backend/agents/manager.cjs'); m.run('infranodus',{topic:'content gap demo',texts:['a','b']}).then(console.log)"
node -e "const m=require('./backend/agents/manager.cjs'); m.run('scraper',{url:'https://bbc.co.uk/'}).then(console.log)"
---
## Where things live
* **Research** → `workspace/research/<id>-<slug>/`
* **Specs (generated)** → `backend/.agent-os/specs/<ts>-*/spec.md`
* **Index** → `workspace/data/index.json`
* **Knowledge & rules**
  * `workspace/knowledge/TruthLens.md`
  * `workspace/knowledge/UK-Online-Compliance.md`
  * `workspace/knowledge/README.md`
* **Compliance policy files**
  * `backend/services/policy/uk-online-marketplace.yaml`
  * `backend/services/policy/config/whitelist.yaml` (allowed commands: `echo|ls|cat|head` etc.)
* **InfraNodus**
  * exports in → `workspace/data/infranodus/exports/`
  * processed/incoming → `workspace/data/infranodus/incoming/`
* **Scraper**
  * allowlist → `backend/services/scraper/config/allowlist.yaml`
  * results → `workspace/data/scrapes/*.jsonl`
---
## Plan — Bright Data + Unified Search (Next Sessions)
- Goal: broaden discovery and enrich research while staying policy‑first (TruthLens).
- Sources: Bright Data templates (Google SERP/News, eBay, Amazon, Etsy, selected media), plus APIs (Twitter/X, Reddit, Google Trends) routed through InfraNodus where appropriate.
- Tiered allowlist active at: `backend/services/policy/config/whitelist.yaml`
  - **gov**: gov.uk, legislation.gov.uk, ico.org.uk, fca.org.uk, ons.gov.uk
  - **marketplaces**: ebay.co.uk, amazon.co.uk, etsy.com
  - **media**: bbc.co.uk, theguardian.com, ft.com
  - **vendor_docs**: stripe.com, shopify.dev, vercel.com, developers.google.com
  - **templates**: `bd:google/serp`, `bd:google/news`, `bd:ebay/search`, `bd:ebay/product`, `bd:amazon/search`
  - **wildcards** (discovery only): `*.google.*`, `*.ebay.*`, `*.amazon.*`
  - **rules**: respect robots; no login walls; no PII; **purpose tags required**; UA + gentle rate limit.
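A minimal sketch of how a tiered domain check with a mandatory purpose tag might look. The `TIERS` object, `tierFor`, and `checkRequest` are hypothetical names used for illustration only; the real rules live in the YAML file above, whose schema is not shown here. Wildcard and template entries are left out of the sketch.

```javascript
// Hypothetical in-memory copy of the domain tiers listed above.
// The actual source of truth is the YAML policy file.
const TIERS = {
  gov: ['gov.uk', 'legislation.gov.uk', 'ico.org.uk', 'fca.org.uk', 'ons.gov.uk'],
  marketplaces: ['ebay.co.uk', 'amazon.co.uk', 'etsy.com'],
  media: ['bbc.co.uk', 'theguardian.com', 'ft.com'],
  vendor_docs: ['stripe.com', 'shopify.dev', 'vercel.com', 'developers.google.com'],
};

// Return the first tier whose domain list covers the URL's hostname, or null.
// A host matches a domain if it equals it or is a subdomain of it.
function tierFor(url) {
  let host;
  try {
    host = new URL(url).hostname.toLowerCase();
  } catch {
    return null; // not a parseable absolute URL
  }
  for (const [tier, domains] of Object.entries(TIERS)) {
    if (domains.some((d) => host === d || host.endsWith('.' + d))) return tier;
  }
  return null;
}

// Combine the domain check with the "purpose tags required" rule.
function checkRequest({ url, purpose }) {
  const tier = tierFor(url);
  if (!tier) return { ok: false, reason: 'domain not allowlisted' };
  if (!purpose) return { ok: false, reason: 'missing purpose tag' };
  return { ok: true, tier };
}

console.log(checkRequest({ url: 'https://www.bbc.co.uk/news', purpose: 'demo' }));
```

Matching on the exact host or a dot-prefixed suffix keeps look-alike domains such as `evilbbc.co.uk` out of the `media` tier.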
Artifacts
- Discovery → `workspace/data/search/*.jsonl`
- Fetches (BD/APIs) → `workspace/data/scrapes/*.jsonl`
- All artifacts are saved with provenance `{provider, template, purpose, ts}` and indexed via `sf index`.
TruthLens checkpoints validate provenance, content integrity, compliance, safe-commands gating, and analytics sanity **before** execution.
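As an illustration of the provenance requirement only, here is a sketch that refuses to serialize a JSONL record unless all four provenance fields are present. It assumes the fields sit under a `provenance` key and that `ts` is an ISO 8601 string; both are assumptions, and `missingProvenance` and `toJsonlLine` are hypothetical helpers, not part of the existing codebase.

```javascript
// The four provenance fields named above.
const REQUIRED = ['provider', 'template', 'purpose', 'ts'];

// Return the names of provenance fields that are missing or empty.
function missingProvenance(record) {
  const p = (record && record.provenance) || {};
  return REQUIRED.filter((k) => p[k] === undefined || p[k] === null || p[k] === '');
}

// Serialize one record as a single JSONL line; throw if provenance is incomplete.
function toJsonlLine(record) {
  const missing = missingProvenance(record);
  if (missing.length > 0) {
    throw new Error('incomplete provenance: ' + missing.join(', '));
  }
  return JSON.stringify(record) + '\n';
}

// Example record; the provider and purpose values are placeholders.
const line = toJsonlLine({
  url: 'https://www.bbc.co.uk/news',
  provenance: {
    provider: 'brightdata',
    template: 'bd:google/news',
    purpose: 'demo',
    ts: new Date().toISOString(),
  },
});
process.stdout.write(line);
```

Running the check before the write means an incomplete record never reaches `workspace/data/`, which matches the "validate before execution" ordering described above.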
Tasks