Releases · appcuarium/synapptic

25 Mar 09:20

appcuarium

v0.1.0b5

16f4f07

v0.1.0b5 Latest

Latest

Bug fixes

Test suite: hardcoded last_seen dates in fixtures caused time-based decay to apply unexpectedly as time passed — replaced with dynamic _today() so decay tests are stable regardless of when they run
Test assertions: test_redacts_sk_key and test_redacts_aiza_key had wrong expected prefix lengths — corrected to match actual regex behavior

This is a test-only patch; no production code changed.

Assets 2

25 Mar 09:24

appcuarium

v0.1.0b4

22fbfac

v0.1.0b4 Pre-release

Pre-release

What's new

Relay + session browser

See every conversation you've ever had with Claude Code — entirely on your machine.

pip install synapptic[relay]
synapptic relay enable
synapptic run claude    # launch Claude through the relay

Three-panel dashboard: projects, conversation, session list
Full markdown rendering, collapsible tool calls, token usage per turn
Live streaming via WebSocket — active sessions glow green, clear instantly on close
Per-session token stats and estimated cost on every session click
synapptic index for fast full-text search across 4000+ sessions

Benchmark improvements

Per-provider/model rate limiting with dynamic batch sizing
Groq added as first-class provider (5 models)
503 retry and daily limit detection
Confirmation prompt before large benchmark runs

Security hardening

Centralized secret redaction (sk-, gsk_, AIza, Bearer)
urlparse().hostname loopback check — subdomain lookalikes blocked
file:// scheme rejected in all provider calls
XML envelopes with "REFERENCE ONLY" guard in extraction and synthesis
Per-run cryptographic nonce on benchmark envelope markers
project_slug allowlist ([a-zA-Z0-9_-]) blocks path traversal on all --project flags
Atomic fsync+rename on all 9 output writers and save_profile

Bug fixes

Stable profile key uses full observation text (was truncated at 120 chars)
Proportional truncation floor raised so cut_point is always ≥ 1
Rate limit double-multiplier removed; budget floor removed
Division-by-zero guard in chunk size estimation

Tests

~350 tests across 14 files (was 163 in b3). New: test_cli, test_benchmark_results, test_config, test_patterns, test_state, test_outputs, test_extract, test_integrate, test_synthesize, test_providers, test_scrub.

Assets 2

20 Mar 02:13

appcuarium

v0.1.0b3

2c1f03f

v0.1.0b3

Command rename

synapptic update → synapptic ingest - more expressive name for the full pipeline (extract → merge → synthesize → integrate)

Benchmark redesign: Regex → LLM-as-Judge

Complete rewrite of the benchmark scoring system, driven by 3 rounds of adversarial review.

Scoring: LLM-as-judge replaces regex:

judge_response() evaluates compliance via structured COMPLY/VIOLATE verdicts
Failed judge calls return UNKNOWN (excluded from scoring, not silently counted as PASS)
Judge reasoning stored per test for auditability
--judge-provider / --judge-model flags for separate judge model (avoids self-evaluation bias)
Warning when judge is the same model as respondent

Correct experimental design:

Guard in archetype: WITH = full archetype, WITHOUT = archetype minus guard (ablation)
Guard not in archetype: WITH = archetype + guard appended, WITHOUT = archetype as-is (additive test)
Isolates each guard's individual contribution regardless of whether synthesis included it
Guard removal handles multi-line entries (removes continuation lines at deeper indentation)
Only "guards" dimension benchmarked (ai_failures are incident descriptions, not individually testable)

Statistical rigor:

Default --runs 3 with majority vote (was 1)
Ties on even valid_runs → "unclear" (not silent FAIL)
Wilson score 95% confidence intervals on pass rates (suppressed at n<5)
CI caveat: "assumes independent tests (guards may be correlated)"
Two-directional control tests: COMPLY control (expect redundant) + VIOLATE control (expect ineffective) — detects judge bias in either direction

Temperature support:

--temperature flag (default 0.1) for response generation
Passed through to all providers: anthropic, ollama, openai, gemini, lmstudio, custom
claude-cli: warns that temperature is unsupported, lists providers that support it

Guard quality:

Test-to-guard fidelity validation (SequenceMatcher, drops hallucinated guards)
Near-duplicate guard deduplication (Jaccard token similarity, O(n·k))
Guards with weight < 0.3 filtered with visible count
Balanced order randomization: forces at least 1 with-first + 1 without-first per test

Cache & storage:

Cache format v4 with guard-list hash — profile changes auto-invalidate stale caches
Single benchmarks/ directory (removed duplicate benchmark_results/ dir)
Result filenames include all params: {project}_{provider}_{model}_seed{seed}_t{temp}_{timestamp}.json
Judge and storage truncation aligned at 4000 chars
Response failures tracked separately from judge failures

Output improvements:

"Guard compliance" / "Baseline compliance" / "Guard impact" (was "Archetype compliance")
Net impact with gross breakdown: +10% net (3 improved, 1 regressed)
Judge health line: failure count, failure rate, control status
Per-test: vote counts, judge errors, response errors

results compare fixed:

Was completely broken (referenced non-existent keys)
Now shows guard compliance, impact delta, effective/backfire counts, winner

Integration fix

Claude Code MEMORY.md archetype placement:

Archetype reference (user_archetype.md) now inserted near the top of MEMORY.md instead of appended at the bottom
Claude Code truncates MEMORY.md after line 200 — previous behavior appended at the end, causing the archetype to be invisible in projects with large MEMORY.md files
Existing projects where the reference is past line 200 are automatically fixed on next synapptic ingest

CI and testing

GitHub Actions workflow: runs pytest on Python 3.10, 3.11, 3.12 on push/PR to master, develop, release branches
Tests badge added to README
82 tests: guard selection (15), judge response parsing (15), majority vote (6), test fidelity (5), guard dedup (5), guard removal (12), confidence intervals (5), + filter/profile tests

Hook improvements

Only processes explicitly closed sessions (filters by reason=prompt_input_exit|clear|logout)
Derives project from transcript_path in the hook JSON input (no find needed)
Synthesizes only global + the affected project (not all 15 projects)
PID file guard replaces pgrep (which falsely matched bash wrapper command strings)
sleep 2 before checking transcript ensures last-prompt record is written

New provider:

Google Gemini (gemini-3.1-flash-lite-preview) added as LLM provider

Provider system:

No default provider - if unconfigured, tells user to run synapptic init
No silent fallback between providers
Model defaults reset when switching providers

Extraction:

Transcript wrapped in <transcript> tags with injection guard (prevents LLM from following instructions inside the transcript)

Assets 2

16 Mar 00:39

appcuarium

v0.1.0b2

d9b465d

v0.1.0b2 - Calibration Release Pre-release

Pre-release

Calibration release focused on extraction quality after real-world testing across 30+ sessions.

Highlights:

Sessions processed biggest-first, skipped ones don't count against --limit
Synthesis prompt produces scope-limited guards (prevents over-investigation)
Hook uses pgrep to avoid conflicts with manual runs
Better error messages when synthesis is skipped

Benchmark (same 3 test prompts, before/after):

Summary suppression: failed -> passed
Read-before-claiming: 26 tool calls -> 2 searches
Pattern matching: 36 tool calls, 91K tokens -> 5 reads + clarifying question

pip install --upgrade synapptic

Full changelog: CHANGELOG.md

Assets 2

15 Mar 19:43

appcuarium

v0.1.0-beta

7cf0745

v0.1.0-beta Pre-release

Pre-release

Initial beta release of synapptic - the missing synapse between you and your AI agents.

Install:

pip install git+https://github.com/appcuarium/synapptic.git
synapptic init
synapptic install
synapptic update

Features:

Multi-provider LLM backend (Claude CLI, Anthropic, OpenAI, Ollama, LM Studio)
Multi-platform output (Claude Code, Cursor, Copilot, Gemini)
Per-project + global profiles with cross-project promotion
9 observation dimensions (user + AI failure profiling)
Custom extraction patterns
Automatic background processing via SessionEnd hook
Clean install/uninstall

Assets 2

Releases: appcuarium/synapptic

v0.1.0b5

Bug fixes

Uh oh!

v0.1.0b4

What's new

Relay + session browser

Benchmark improvements

Security hardening

Bug fixes

Tests

Uh oh!

v0.1.0b3

Command rename

Benchmark redesign: Regex → LLM-as-Judge

Integration fix

CI and testing

Hook improvements

Uh oh!

v0.1.0b2 - Calibration Release

Uh oh!

v0.1.0-beta

Uh oh!