SCCE 2.0

The Sourced-Citation Cognitive Engine (SCCE) is a production-grade, offline-first intelligence system built for environments where trust matters more than stylistic fluency.

Executive Summary

SCCE is a local-first question-answering system designed for high-trust environments.

It answers from your ingested corpus, not from remote model calls.
It combines lexical, graph, and spectral retrieval before synthesis.
It exposes provenance as part of every answering workflow.
It runs as a server plus worker, with observable status and job-control APIs.
It is built for teams that need auditable behavior under privacy, regulatory, or mission constraints.

Key characteristics:

Evidence is not optional. Response quality is tied to retrievable source material.
Runtime is local-first. Core answering paths do not depend on cloud LLM calls.
Provenance is product behavior, not a dashboard extra.
Operational behavior is inspectable: jobs, status, ingestion, and model state are exposed through APIs.

Why This Matters

SCCE is designed for teams that cannot outsource reasoning to opaque cloud models and cannot accept answers without traceable evidence. It ingests your corpus, builds local structure over it, and answers questions through retrieval, reasoning, and constrained synthesis, with provenance as a first-class output.

If your use case includes regulated workflows, private data estates, air-gapped infrastructure, or high-cost decisions, SCCE is built for that reality.

What SCCE Does

SCCE combines five capabilities into one deployable system:

Corpus ingestion across mixed sources (documents, spreadsheets, code, wiki-style corpora).
Knowledge structuring via entities, relations, and spectral projections.
Multi-channel retrieval (lexical, graph, spectral) with diversity-aware fusion.
Planner-driven reasoning loop that tests and refines candidate claims.
Local synthesis with quality gates, provenance checks, and uncertainty signaling.
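
As an illustration of the fusion step only, the sketch below merges ranked lists from the three retrieval channels using reciprocal rank fusion (RRF). RRF is a common rank-combination technique chosen here for brevity; this README does not say which fusion method SCCE uses, and the sketch does not model the diversity-aware part. The fuseRankings name, the chunk IDs, and the constant k = 60 are assumptions.

```typescript
// Hypothetical shape: each channel returns chunk IDs ordered best-first.
type RankedList = string[];

/**
 * Reciprocal rank fusion: score(id) = sum over channels of 1 / (k + rank),
 * where rank starts at 1. IDs missing from a channel contribute nothing.
 */
function fuseRankings(channels: RankedList[], k: number = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of channels) {
    ranking.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first; ties are broken by ID for determinism.
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .map(([id]) => id);
}
```

A chunk that ranks well in several channels rises above one that ranks first in a single channel, which is the usual reason to fuse by rank rather than by raw score.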

End-to-End Pipeline

At a high level:

Ingest files into documents/spans/chunks.
Correlate entities and relations.
Build and refresh spectral basis/projections.
Train and load local n-gram models.
Resolve queries through perception, retrieval, planning, verification, and synthesis.
Return response text plus source-linked context.

This is implemented as a stable server runtime with background jobs and API visibility for each operational phase.
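
To make the n-gram step concrete, here is a minimal sketch of a bigram (2-gram) model: it counts adjacent token pairs and turns the counts into conditional probabilities. It illustrates the general technique only. The order, tokenization, smoothing, and storage format of SCCE's actual models are not described in this README, and trainBigrams and nextTokenProbability are hypothetical names.

```typescript
// Maps a token to the counts of tokens observed immediately after it.
type BigramModel = Map<string, Map<string, number>>;

/** Count every adjacent (previous, next) token pair in the sequence. */
function trainBigrams(tokens: string[]): BigramModel {
  const model: BigramModel = new Map();
  for (let i = 0; i + 1 < tokens.length; i++) {
    const prev = tokens[i];
    const next = tokens[i + 1];
    const followers = model.get(prev) ?? new Map<string, number>();
    followers.set(next, (followers.get(next) ?? 0) + 1);
    model.set(prev, followers);
  }
  return model;
}

/** Maximum-likelihood estimate of P(next | prev); 0 if prev was never seen. */
function nextTokenProbability(
  model: BigramModel,
  prev: string,
  next: string
): number {
  const followers = model.get(prev);
  if (!followers) return 0;
  let total = 0;
  followers.forEach((count) => {
    total += count;
  });
  return (followers.get(next) ?? 0) / total;
}
```

This estimate is unsmoothed, so any pair absent from the training tokens gets probability zero.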

Production Posture

SCCE is structured for real operations, not just demos.

Stateful service with explicit DB + model dependencies
Startup migration safety and controlled shutdown persistence
Async chat mode with SSE streaming and status events
Job queue control for indexing/training/spectral refresh
Operational endpoints for status, topology, activity, and audit export
Runbook coverage for backups, restore, incidents, and handoff

See full operating details in docs/OPERATIONS.md and docs/PRODUCTION_HANDOFF.md.

Architecture at a Glance

apps/server: Fastify API, startup/shutdown lifecycle, routes, worker orchestration
apps/web: React UI for chat, vault, training, artifacts, and system monitoring
packages/core: ingestion, correlation, retrieval, planner, synthesis, spectral logic
packages/db: PostgreSQL access and migration layer
packages/types: shared TypeScript types and contracts
packages/compute: parallel pipeline and compute dispatch utilities
packages/security: policy and audit support
packages/plugins: renderer and webapp template infrastructure
packages/sketches: probabilistic structures used by supporting workflows
data: local models, uploads, corpora, artifacts, and runtime state

Prerequisites

Node.js >= 20
pnpm >= 8 (via corepack)
PostgreSQL >= 14

Quick Start (Local)

Install dependencies.

corepack enable
corepack pnpm install

Set the database URL for the server process (PowerShell syntax shown; in a POSIX shell, use export SCCE_DB_URL=<url> instead).

$env:SCCE_DB_URL="postgres://scce_app:scce_app@localhost:5432/scce"

Build all packages.

corepack pnpm -r build

Start server and web app in separate terminals.

corepack pnpm dev:server
corepack pnpm dev:web

Verify runtime health.

curl http://127.0.0.1:3000/health
curl http://127.0.0.1:3000/api/system/status

Fast Production Bootstrap

For a full local bootstrap (DB path, demo seeding, ingest, training triggers, and validation request):

corepack pnpm tsx scripts/setup-complete-system.ts

First API Interaction

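Synchronous chat (no attachments). The backtick line continuations are PowerShell syntax; in Windows PowerShell 5.1, where curl is an alias for Invoke-WebRequest, call curl.exe instead:

curl -X POST http://127.0.0.1:3000/api/chat `
  -H "Content-Type: application/json" `
  -d '{"message":"What is in the vault?","conversationId":null,"attachments":[]}'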

Asynchronous chat pattern (attachments -> SSE):

POST /api/chat with attachments.
Read conversationId from response.
Stream events from GET /api/events/:conversationId.

See detailed contracts and payload shapes in docs/API_REFERENCE.md.
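
For the streaming half of the asynchronous pattern, a client has to split the event stream into individual events. The sketch below parses the standard Server-Sent Events text framing (blank-line-separated blocks of event: and data: fields) and builds the events URL from a conversationId. The function names and the "status" event name used in the comments are assumptions; the actual event names and payloads are defined in docs/API_REFERENCE.md.

```typescript
interface SseEvent {
  event: string; // "message" when the block has no event: field
  data: string;  // data: lines joined with "\n"
}

/** Build the stream URL for a conversation (route taken from the steps above). */
function eventsUrl(baseUrl: string, conversationId: string): string {
  return `${baseUrl}/api/events/${encodeURIComponent(conversationId)}`;
}

/**
 * Parse complete events out of buffered stream text. Whatever follows the
 * last blank line is returned as `rest` so the caller can prepend it to
 * the next chunk read from the network.
 */
function parseSse(buffer: string): { events: SseEvent[]; rest: string } {
  const blocks = buffer.split(/\r?\n\r?\n/);
  const rest = blocks.pop() ?? "";
  const events: SseEvent[] = [];
  for (const block of blocks) {
    let event = "message";
    const data: string[] = [];
    for (const line of block.split(/\r?\n/)) {
      if (line.startsWith(":")) continue; // comment or keep-alive line
      const colon = line.indexOf(":");
      const field = colon === -1 ? line : line.slice(0, colon);
      let value = colon === -1 ? "" : line.slice(colon + 1);
      if (value.startsWith(" ")) value = value.slice(1);
      if (field === "event") event = value; // e.g. a hypothetical "status"
      else if (field === "data") data.push(value);
    }
    if (data.length > 0) events.push({ event, data: data.join("\n") });
  }
  return { events, rest };
}
```

In use, a client would decode each network chunk to text, append it to the previous rest, and call parseSse again until the stream closes.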

Core Scripts

corepack pnpm db-setup: create/apply database schema
corepack pnpm smoke-test: validate key runtime paths
corepack pnpm seed: seed demo corpus
corepack pnpm status: status script
corepack pnpm ingest:wiki: run wiki ingestion/training pipeline
corepack pnpm quality:check: headers + architecture checks
corepack pnpm quality:deep: quality checks + hostile audit suite

Security and Trust Model

SCCE's trust posture is layered:

credentials are environment-supplied, not hard-coded
CORS policy is constrained to localhost development origins and rejects null origin
upload/ingest paths are validated before filesystem operations
duplicate controls reduce accidental corpus bloat and replay noise
provenance verification is part of answer quality handling
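
One common way to validate an upload or ingest path before touching the filesystem is to resolve it against a fixed root and reject anything that lands outside that root. The sketch below shows that check with Node's path module. It is an assumption about the general approach, not SCCE's actual validation code, and resolveInsideRoot and the example root are hypothetical.

```typescript
import * as path from "node:path";

/**
 * Resolve a user-supplied path against an upload root and throw if the
 * result would fall outside that root (for example via "../" segments
 * or an absolute path). Returns the resolved absolute path otherwise.
 */
function resolveInsideRoot(root: string, userPath: string): string {
  const absoluteRoot = path.resolve(root);
  const candidate = path.resolve(absoluteRoot, userPath);
  const relative = path.relative(absoluteRoot, candidate);
  const escapes =
    relative === ".." ||
    relative.startsWith(".." + path.sep) ||
    path.isAbsolute(relative); // e.g. a different drive on Windows
  if (escapes) {
    throw new Error(`Path escapes upload root: ${userPath}`);
  }
  return candidate;
}
```

This check is purely lexical: it does not follow symbolic links, so a deployment that allows symlinks inside the root would need an additional check on the real path.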

Operating SCCE in Production

Operational priorities:

keep DB and model backups current
monitor chat error and timeout rates
watch training/job queue health
track ingestion growth and duplicate trends
validate release upgrades against migration path

Use these docs as your source of truth:

docs/OPERATIONS.md
docs/PRODUCTION_HANDOFF.md

Contributing and Engineering Standards

SCCE expects disciplined, auditable changes.

keep changes scoped and reversible
preserve API contracts or document intentional changes
keep SQL parameterized and input validation explicit
update docs alongside behavior changes
validate with build/smoke/quality scripts before merge
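
As a sketch of what "parameterized SQL with explicit input validation" can look like, the function below validates its inputs and returns SQL text with PostgreSQL positional placeholders ($1, $2) plus a separate values array, so user input never becomes part of the SQL string. The documents table, its columns, and buildDocumentSearch are hypothetical. The { text, values } shape matches what clients such as node-postgres accept, but which client packages/db uses is not stated in this README.

```typescript
// Hypothetical query description: SQL text plus positional parameter values.
interface QueryConfig {
  text: string;
  values: Array<string | number>;
}

/** Validate inputs explicitly, then bind them as parameters. */
function buildDocumentSearch(title: unknown, limit: unknown): QueryConfig {
  if (typeof title !== "string" || title.trim().length === 0) {
    throw new TypeError("title must be a non-empty string");
  }
  if (
    typeof limit !== "number" ||
    !Number.isInteger(limit) ||
    limit < 1 ||
    limit > 100
  ) {
    throw new RangeError("limit must be an integer between 1 and 100");
  }
  return {
    // The SQL text is a constant; only the values array varies per call.
    text: "SELECT id, title FROM documents WHERE title = $1 ORDER BY id LIMIT $2",
    values: [title, limit],
  };
}
```

Because the SQL text is constant, a hostile title travels to the database only as a bound value and cannot change the statement's structure.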

Contributor workflow references:

docs/DEVELOPMENT.md
docs/ONBOARDING.md

Documentation Index

docs/ARCHITECTURE.md: full system architecture and pipeline internals
docs/DEVELOPMENT.md: development workflow and package boundaries
docs/ONBOARDING.md: first-contribution and first-PR path
docs/OPERATIONS.md: startup, ingest, training, backup, troubleshooting
docs/PRODUCTION_HANDOFF.md: SLOs, monitoring, incidents, ownership transfer
docs/API_REFERENCE.md: endpoint reference and payload examples
docs/MATH_OVERVIEW.md: code-grounded equations, scoring functions, and thresholds
docs/AI_SKILLS.md: repository-specific assistant guidance and guardrails

License

Proprietary. See LICENSE for terms.
