Skip to content

Releases: appcuarium/synapptic

v0.1.0b5

25 Mar 09:20

Choose a tag to compare

Bug fixes

  • Test suite: hardcoded last_seen dates in fixtures caused time-based decay to apply unexpectedly as time passed — replaced with dynamic _today() so decay tests are stable regardless of when they run
  • Test assertions: test_redacts_sk_key and test_redacts_aiza_key had wrong expected prefix lengths — corrected to match actual regex behavior

This is a test-only patch; no production code changed.

v0.1.0b4

25 Mar 09:24

Choose a tag to compare

v0.1.0b4 Pre-release
Pre-release

What's new

Relay + session browser

See every conversation you've ever had with Claude Code — entirely on your machine.

pip install synapptic[relay]
synapptic relay enable
synapptic run claude    # launch Claude through the relay
  • Three-panel dashboard: projects, conversation, session list
  • Full markdown rendering, collapsible tool calls, token usage per turn
  • Live streaming via WebSocket — active sessions glow green, clear instantly on close
  • Per-session token stats and estimated cost on every session click
  • synapptic index for fast full-text search across 4000+ sessions

Benchmark improvements

  • Per-provider/model rate limiting with dynamic batch sizing
  • Groq added as first-class provider (5 models)
  • 503 retry and daily limit detection
  • Confirmation prompt before large benchmark runs

Security hardening

  • Centralized secret redaction (sk-, gsk_, AIza, Bearer)
  • urlparse().hostname loopback check — subdomain lookalikes blocked
  • file:// scheme rejected in all provider calls
  • XML envelopes with "REFERENCE ONLY" guard in extraction and synthesis
  • Per-run cryptographic nonce on benchmark envelope markers
  • project_slug allowlist ([a-zA-Z0-9_-]) blocks path traversal on all --project flags
  • Atomic fsync+rename on all 9 output writers and save_profile

Bug fixes

  • Stable profile key uses full observation text (was truncated at 120 chars)
  • Proportional truncation floor raised so cut_point is always ≥ 1
  • Rate limit double-multiplier removed; budget floor removed
  • Division-by-zero guard in chunk size estimation

Tests

~350 tests across 14 files (was 163 in b3). New: test_cli, test_benchmark_results, test_config, test_patterns, test_state, test_outputs, test_extract, test_integrate, test_synthesize, test_providers, test_scrub.

v0.1.0b3

20 Mar 02:13

Choose a tag to compare

Command rename

  • synapptic updatesynapptic ingest - more expressive name for the full pipeline (extract → merge → synthesize → integrate)

Benchmark redesign: Regex → LLM-as-Judge

Complete rewrite of the benchmark scoring system, driven by 3 rounds of adversarial review.

Scoring: LLM-as-judge replaces regex:

  • judge_response() evaluates compliance via structured COMPLY/VIOLATE verdicts
  • Failed judge calls return UNKNOWN (excluded from scoring, not silently counted as PASS)
  • Judge reasoning stored per test for auditability
  • --judge-provider / --judge-model flags for separate judge model (avoids self-evaluation bias)
  • Warning when judge is the same model as respondent

Correct experimental design:

  • Guard in archetype: WITH = full archetype, WITHOUT = archetype minus guard (ablation)
  • Guard not in archetype: WITH = archetype + guard appended, WITHOUT = archetype as-is (additive test)
  • Isolates each guard's individual contribution regardless of whether synthesis included it
  • Guard removal handles multi-line entries (removes continuation lines at deeper indentation)
  • Only "guards" dimension benchmarked (ai_failures are incident descriptions, not individually testable)

Statistical rigor:

  • Default --runs 3 with majority vote (was 1)
  • Ties on even valid_runs → "unclear" (not silent FAIL)
  • Wilson score 95% confidence intervals on pass rates (suppressed at n<5)
  • CI caveat: "assumes independent tests (guards may be correlated)"
  • Two-directional control tests: COMPLY control (expect redundant) + VIOLATE control (expect ineffective) — detects judge bias in either direction

Temperature support:

  • --temperature flag (default 0.1) for response generation
  • Passed through to all providers: anthropic, ollama, openai, gemini, lmstudio, custom
  • claude-cli: warns that temperature is unsupported, lists providers that support it

Guard quality:

  • Test-to-guard fidelity validation (SequenceMatcher, drops hallucinated guards)
  • Near-duplicate guard deduplication (Jaccard token similarity, O(n·k))
  • Guards with weight < 0.3 filtered with visible count
  • Balanced order randomization: forces at least 1 with-first + 1 without-first per test

Cache & storage:

  • Cache format v4 with guard-list hash — profile changes auto-invalidate stale caches
  • Single benchmarks/ directory (removed duplicate benchmark_results/ dir)
  • Result filenames include all params: {project}_{provider}_{model}_seed{seed}_t{temp}_{timestamp}.json
  • Judge and storage truncation aligned at 4000 chars
  • Response failures tracked separately from judge failures

Output improvements:

  • "Guard compliance" / "Baseline compliance" / "Guard impact" (was "Archetype compliance")
  • Net impact with gross breakdown: +10% net (3 improved, 1 regressed)
  • Judge health line: failure count, failure rate, control status
  • Per-test: vote counts, judge errors, response errors

results compare fixed:

  • Was completely broken (referenced non-existent keys)
  • Now shows guard compliance, impact delta, effective/backfire counts, winner

Integration fix

Claude Code MEMORY.md archetype placement:

  • Archetype reference (user_archetype.md) now inserted near the top of MEMORY.md instead of appended at the bottom
  • Claude Code truncates MEMORY.md after line 200 — previous behavior appended at the end, causing the archetype to be invisible in projects with large MEMORY.md files
  • Existing projects where the reference is past line 200 are automatically fixed on next synapptic ingest

CI and testing

  • GitHub Actions workflow: runs pytest on Python 3.10, 3.11, 3.12 on push/PR to master, develop, release branches
  • Tests badge added to README
  • 82 tests: guard selection (15), judge response parsing (15), majority vote (6), test fidelity (5), guard dedup (5), guard removal (12), confidence intervals (5), + filter/profile tests

Hook improvements

  • Only processes explicitly closed sessions (filters by reason=prompt_input_exit|clear|logout)
  • Derives project from transcript_path in the hook JSON input (no find needed)
  • Synthesizes only global + the affected project (not all 15 projects)
  • PID file guard replaces pgrep (which falsely matched bash wrapper command strings)
  • sleep 2 before checking transcript ensures last-prompt record is written

New provider:

  • Google Gemini (gemini-3.1-flash-lite-preview) added as LLM provider

Provider system:

  • No default provider - if unconfigured, tells user to run synapptic init
  • No silent fallback between providers
  • Model defaults reset when switching providers

Extraction:

  • Transcript wrapped in <transcript> tags with injection guard (prevents LLM from following instructions inside the transcript)

v0.1.0b2 - Calibration Release

16 Mar 00:39

Choose a tag to compare

Pre-release

Calibration release focused on extraction quality after real-world testing across 30+ sessions.

Highlights:

  • Sessions processed biggest-first, skipped ones don't count against --limit
  • Synthesis prompt produces scope-limited guards (prevents over-investigation)
  • Hook uses pgrep to avoid conflicts with manual runs
  • Better error messages when synthesis is skipped

Benchmark (same 3 test prompts, before/after):

  • Summary suppression: failed -> passed
  • Read-before-claiming: 26 tool calls -> 2 searches
  • Pattern matching: 36 tool calls, 91K tokens -> 5 reads + clarifying question
pip install --upgrade synapptic

Full changelog: CHANGELOG.md

v0.1.0-beta

15 Mar 19:43

Choose a tag to compare

v0.1.0-beta Pre-release
Pre-release

Initial beta release of synapptic - the missing synapse between you and your AI agents.

Install:

pip install git+https://github.com/appcuarium/synapptic.git
synapptic init
synapptic install
synapptic update

Features:

  • Multi-provider LLM backend (Claude CLI, Anthropic, OpenAI, Ollama, LM Studio)
  • Multi-platform output (Claude Code, Cursor, Copilot, Gemini)
  • Per-project + global profiles with cross-project promotion
  • 9 observation dimensions (user + AI failure profiling)
  • Custom extraction patterns
  • Automatic background processing via SessionEnd hook
  • Clean install/uninstall