Restructure: Trust → Structure → Depth → Polish → CI → Portfolio (v1.4.0) by sohan-shingade · Pull Request #33 · sohan-shingade/flint

sohan-shingade · 2026-04-25T19:13:39Z

Restructure: Trust → Structure → Depth → Polish → CI → Portfolio

Closes the entire 6-phase restructure plan, the bulk of the deferred
backlog, and all 5 D-6.4-replay slices end-to-end. 41 commits this
session, +8000 / -2400 LOC, 1861 → 2070 tests, ruff hard-fail clean,
vite build clean, 30/30 cargo tests, 91/91 across the full replay
surface (event log + replay primitive + snapshots + engine writer
hooks + REST + MCP + auto-compaction + E2E parity + Rust ledger
parity).

Bumps version 1.3.1 → 1.4.0.

Highlights by phase

Phase 1 — Trust & correctness (shipped)

Phase 1 work landed earlier in the session as commits d7e80e5 →
35cbf24. Recap: parity reports, PIT audit, custom data ingest,
sandbox subprocess isolation, 26 PIT_METADATA blocks, force-close
correctness fix in Rust + Python, cross-market terminals.

Phase 2 — Structural cleanup (shipped, full close)

D-2.1.b: BacktestContext god class → 7 manager classes

Every piece of mutable state now lives in one of seven dedicated
owners under flint/execution/:

| Manager | Owns |
|---|---|
| `PositionManager` | open + closed-trade dicts |
| `CashManager` | cash, allocator, fees / tx / funding counters |
| `FillRecorder` | recorded fills + diagnostic log |
| `OrderQueue` | pending limit/stop/TP queue + this-bar market queue |
| `FundingLedger` | per-market + per-venue funding history |
| `BorrowLedger` | Jupiter borrow rates + paid-borrow ledger |
| `MarketDataFeed` | cross-market candles + orderbook + OI |

All call sites are migrated: the bodies of _apply_fill, apply_funding,
check_liquidations, process_pending_orders, process_market_orders,
close_all_positions, set_candle, account, positions, and
pending_orders now route through the appropriate manager. Legacy
property aliases are retained for tests; new code uses the managers directly.
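
The manager-plus-legacy-alias pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code in flint/execution/: `Position`, `BacktestContextSketch`, and the exact method bodies are assumptions; only the `PositionManager` name, the `_pm` attribute, and the `_positions` alias come from this PR.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Key = Tuple[str, str]  # (venue, market)


@dataclass
class Position:  # hypothetical stand-in for the real position model
    size: float
    entry_price: float


class PositionManager:
    """Single owner of open positions and the closed-trade ledger (sketch)."""

    def __init__(self) -> None:
        self._open: Dict[Key, Position] = {}
        self._closed: List[dict] = []

    def get(self, key: Key) -> Optional[Position]:
        return self._open.get(key)

    def set(self, key: Key, pos: Position) -> None:
        self._open[key] = pos

    def delete(self, key: Key) -> None:
        self._open.pop(key, None)

    def record_close(self, record: dict) -> None:
        self._closed.append(record)

    @property
    def positions(self) -> Dict[Key, Position]:
        # Direct access to the underlying dict, for legacy callers only.
        return self._open

    @property
    def closed(self) -> Tuple[dict, ...]:
        # Return a copy so callers cannot mutate the ledger.
        return tuple(self._closed)


class BacktestContextSketch:
    """Hypothetical context: state lives in the manager, aliases stay for tests."""

    def __init__(self) -> None:
        self._pm = PositionManager()

    @property
    def _positions(self) -> Dict[Key, Position]:
        # Legacy alias: old code doing self._positions[key] = pos still
        # mutates the manager-owned dict.
        return self._pm.positions
```

Because the alias returns the manager's own dict, old-style writes and new-style `set`/`delete` calls observe the same state, which is what lets call sites migrate one at a time.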

D-4.7-full: flint/services/{strategies,backtest,journal,data,paper}.py
pulls work-doing code out of FastAPI routes into a service layer that
MCP, scripts, and notebooks call directly. Strategy-template registry
has one source of truth (was duplicated across 3 places).

D-2.2-internal: every store mutation routes through _sql_*
wrappers that hold the lock; routes never touch store._conn directly.
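
As a rough sketch of the lock-holding wrapper idea: the wrapper names `_sql_exec`, `_sql_read_all`, and `_sql_read_one` appear in this PR, but everything else here is an assumption. In particular, `sqlite3` stands in for the real store backend, and `StoreSketch`, `record_funding`, and the table layout are hypothetical.

```python
import sqlite3
import threading
from typing import Any, List, Optional, Sequence, Tuple


class StoreSketch:
    """Every SQL statement goes through a wrapper that holds the lock."""

    def __init__(self, path: str = ":memory:") -> None:
        # check_same_thread=False because access is serialized by our own lock.
        self._conn = sqlite3.connect(path, check_same_thread=False)
        self._lock = threading.Lock()
        self._sql_exec(
            "CREATE TABLE IF NOT EXISTS funding_rates (market TEXT, rate REAL)"
        )

    # Private wrappers: the only places that touch self._conn.
    def _sql_exec(self, sql: str, params: Sequence[Any] = ()) -> None:
        with self._lock:
            self._conn.execute(sql, params)
            self._conn.commit()

    def _sql_read_all(self, sql: str, params: Sequence[Any] = ()) -> List[Tuple]:
        with self._lock:
            return self._conn.execute(sql, params).fetchall()

    def _sql_read_one(self, sql: str, params: Sequence[Any] = ()) -> Optional[Tuple]:
        with self._lock:
            return self._conn.execute(sql, params).fetchone()

    # Public API: routes call these and never see _conn or _lock.
    def record_funding(self, market: str, rate: float) -> None:
        self._sql_exec("INSERT INTO funding_rates VALUES (?, ?)", (market, rate))

    def count_funding_rates(self, market: str) -> int:
        row = self._sql_read_one(
            "SELECT COUNT(*) FROM funding_rates WHERE market = ?", (market,)
        )
        return int(row[0]) if row else 0
```

Keeping the lock inside the wrappers means a public method cannot forget to take it, and a grep for `_conn` outside the store class becomes a cheap lint.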

Phase 3 — Depth on wedge (shipped)

D-3.4-rust + D-3.1-rust: Rust ports of TxCostModel and
OrderbookFiller (PyO3, 2.24× and 3.52× speedups, 1e-9 parity tests).

D-3.3-maker-detection: FillResult.is_maker flag wires through
the Rust fill pipeline; resting-limit fills tag maker; Drift
(-2 bps rebate) and Hyperliquid (1 bp) maker rates verified through
end-to-end PyO3 tests.
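
To make the maker/taker arithmetic concrete, here is a small Python sketch. The rates are the ones quoted in this PR (Drift 10 bps taker / -2 bps maker, Hyperliquid 3.5 bps taker / 1 bp maker); the function mirrors the name `compute_fee_with_role` but its signature and the `RATES_BPS` table are assumptions, not the Rust or Python API.

```python
# Fee rates in basis points, as quoted in this PR. A negative rate is a rebate.
RATES_BPS = {
    "drift": {"taker": 10.0, "maker": -2.0},
    "hyperliquid": {"taker": 3.5, "maker": 1.0},
}


def compute_fee_with_role(venue: str, notional: float, is_maker: bool) -> float:
    """Return the fee in quote currency; negative means the trader is paid."""
    role = "maker" if is_maker else "taker"
    return notional * RATES_BPS[venue][role] / 10_000.0


# A resting limit order filled on Drift for 10,000 notional earns a rebate,
# while the same size taken as a market order pays the taker fee.
rebate = compute_fee_with_role("drift", 10_000.0, is_maker=True)
taker_fee = compute_fee_with_role("drift", 10_000.0, is_maker=False)
```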

D-3.5-orchestrator: flint/risk/portfolio_orchestrator.py:PortfolioMarginEngine
composes MarginEngine + VenueAllocator + PortfolioRiskEngine into one
pre-trade check facade. BacktestContext.market_order consults the
orchestrator; rejection comes back tagged MARGIN/ALLOCATOR/PORTFOLIO
so the warn-line names which engine vetoed.
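
A facade that runs several engines in order and tags the first veto might look like the sketch below. Only the MARGIN / ALLOCATOR / PORTFOLIO tags come from this PR; `PreTradeFacade`, `CheckResult`, and the three stand-in check functions are hypothetical and much simpler than the real engines.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

# A check returns a reason string to reject, or None to approve.
Check = Callable[[dict], Optional[str]]


@dataclass(frozen=True)
class CheckResult:
    approved: bool
    vetoed_by: Optional[str] = None
    reason: str = ""


class PreTradeFacade:
    """Runs checks in order and reports which engine vetoed first."""

    def __init__(self, checks: Sequence[Tuple[str, Check]]) -> None:
        self._checks: List[Tuple[str, Check]] = list(checks)

    def check_order(self, order: dict) -> CheckResult:
        for tag, check in self._checks:
            reason = check(order)
            if reason is not None:
                return CheckResult(False, vetoed_by=tag, reason=reason)
        return CheckResult(True)


# Stand-in checks with made-up limits, for illustration only.
def margin_check(order: dict) -> Optional[str]:
    return "insufficient margin" if order["notional"] > 5_000 else None


def allocator_check(order: dict) -> Optional[str]:
    return "venue budget exhausted" if order["venue"] == "frozen" else None


def portfolio_check(order: dict) -> Optional[str]:
    return "gross exposure cap" if order["notional"] > 2_000 else None


facade = PreTradeFacade(
    [("MARGIN", margin_check), ("ALLOCATOR", allocator_check), ("PORTFOLIO", portfolio_check)]
)
```

Returning the tag alongside the reason is what allows a warn-line to say which engine rejected the order rather than only that it was rejected.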

Phase 4 — Product polish (shipped)

D-4.3-websocket end-to-end:

- Per-session routes /ws/paper/{id} and /ws/live/{id}
- ConnectionManager with monotonic per-channel seq + 500-deep replay
  ring buffer (?since=<seq> opt-in) + ping(channel) heartbeat
- useWebSocket<T> hook with 1→2→5→10→30s reconnect backoff +
  30s heartbeat-stale detection
- PaperTradingEngine emits {type: tick} per bar and {type: trade}
  per closed trade
- LiveExecutionContext emits {type: fill} from _handle_fill
  (fire-and-forget via ensure_future)
- PaperTrading.tsx + LiveMonitor.tsx pages bound: live equity / trades /
  fills overlay polled state, with WS LIVE / CONNECTING / OFFLINE
  indicator dot
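
The per-channel sequence number, bounded replay buffer, and reconnect schedule listed above can be sketched in a few lines of Python. This is illustrative only: `ChannelBuffer` and `reconnect_delay` are hypothetical names, and treating `?since=<seq>` as "strictly greater than seq" is an assumption about the semantics. The real client-side backoff lives in the TypeScript useWebSocket hook.

```python
from collections import deque
from typing import Any, Deque, List, Tuple


class ChannelBuffer:
    """Monotonic seq per channel plus a bounded ring buffer for replay."""

    def __init__(self, depth: int = 500) -> None:
        self._seq = 0
        # deque(maxlen=...) silently drops the oldest entry when full.
        self._ring: Deque[Tuple[int, Any]] = deque(maxlen=depth)

    def publish(self, payload: Any) -> int:
        self._seq += 1
        self._ring.append((self._seq, payload))
        return self._seq

    def replay_since(self, since: int) -> List[Tuple[int, Any]]:
        # Assumed semantics: return messages the client has not seen yet.
        return [(seq, payload) for seq, payload in self._ring if seq > since]


# Reconnect schedule from the PR description: 1 -> 2 -> 5 -> 10 -> 30 seconds.
RECONNECT_DELAYS_S = (1, 2, 5, 10, 30)


def reconnect_delay(attempt: int) -> int:
    """Delay before the given zero-based retry; stays at the cap afterwards."""
    return RECONNECT_DELAYS_S[min(attempt, len(RECONNECT_DELAYS_S) - 1)]
```

A client that reconnects after a short drop can pass its last seen seq and catch up from the buffer; one that was away for more than 500 messages only gets the most recent 500 and has to fall back to polled state.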

D-4.2-backoff-full: useBackoffPoll<T> + 3-hook migration shipped.

D-1.4-ui: paper reconciliation upload (multipart CSV) + UI panel.

Phase 5 — CI (shipped)

D-5.1-ruff: ruff configured to F-class only (real bugs); 315
auto-fixes + 26 manual; CI flipped from soft to hard fail.

Phase 6 — Portfolio (foundations shipped)

D-6.1-unified: flint/portfolio/shared_engine.py:SharedCapitalPortfolioEngine
runs N strategies on one shared BacktestContext, so cash, fees,
funding, borrow, and the orchestrator's pre-trade margin gauntlet
all see the actual book. Per-strategy _TaggedContextProxy tags
order_ids with strategy_name:; closed-trade exit_order_id lets PnL
flow back to the actual closer (was even-split-by-market in the
foundation slice).
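
One way to picture the tagging scheme: if every order id carries a `strategy_name:` prefix, the closing order's id is enough to attribute realized PnL. The sketch below assumes closed trades are plain dicts with `exit_order_id` and `pnl` keys; `tag_order_id` and `pnl_by_strategy` are hypothetical helpers, not functions from shared_engine.py.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping


def tag_order_id(strategy_name: str, order_id: str) -> str:
    """Prefix an order id with its owning strategy."""
    return f"{strategy_name}:{order_id}"


def pnl_by_strategy(closed_trades: Iterable[Mapping]) -> Dict[str, float]:
    """Attribute each closed trade's PnL to the strategy that closed it."""
    totals: Dict[str, float] = defaultdict(float)
    for trade in closed_trades:
        # Everything before the first ':' is the strategy name.
        owner, _, _ = trade["exit_order_id"].partition(":")
        totals[owner] += trade["pnl"]
    return dict(totals)


trades = [
    {"exit_order_id": tag_order_id("momentum", "o-17"), "pnl": 12.5},
    {"exit_order_id": tag_order_id("funding_arb", "o-18"), "pnl": -3.0},
    {"exit_order_id": tag_order_id("momentum", "o-21"), "pnl": 1.5},
]
attribution = pnl_by_strategy(trades)
```

Compared with an even split by market, this credits a gain or loss to whichever strategy actually sent the closing order.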

D-6.4-replay (closed end-to-end, 5/5 slices):

- portfolio_events(session_id, seq, ts, kind, payload) table +
  EventLogWriter (thread-safe, monotonic per-session seq)
- BookState + fold(events, initial_capital, seed=) + replay()
  primitive
- portfolio_snapshots + SnapshotStore for compaction; replay
  fast-forwards via latest_before(target_ts) → read_after_seq →
  fold(seed=snapshot)
- BacktestContext._emit(kind, payload) writer hooks: zero overhead
  when event_log_writer + event_session_id not set; otherwise emits
  on every order submit/cancel, fill, funding, liquidation, borrow
- REST: GET /api/v1/replay/{id}/{events,state,summary}
- MCP: replay_summary, replay_state, list_replay_events
- UI: /replay page with session loader, real timeline slider
  (range bounded to first/last event ts), step controls
  (← PREV / NEXT → / ⏮ START / END ⏭), state cards, positions
  table, color-coded event-tail panel (50 most recent folded events)
- Auto-compaction: BacktestContext's snapshot_every ctor kwarg
  (default 10_000) drives _emit to fold + persist a fresh
  BookState every N events. Compaction stays disabled unless the
  caller wires a SnapshotStore, so there is no overhead by default.
- Rust ledger ports: flint_core.FundingLedger + flint_core.BorrowLedger
  with PyO3 bindings (add/latest/recent/by_venue,
  record/record_payment/add_paid/cumulative_at). 7 cargo +
  7 Python↔Rust parity tests pinned to 1e-9.
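
The fold-with-seed idea behind snapshot fast-forward can be sketched as below. The names `BookState` and `fold(events, initial_capital, seed=)` follow this PR, but the fields, the `cash_delta` payload key, and `apply_event` are assumptions chosen to keep the example small; the real state tracks far more than cash.

```python
from dataclasses import dataclass, replace
from typing import Iterable, Mapping, Optional


@dataclass(frozen=True)
class BookState:
    cash: float
    last_seq: int = 0  # seq of the last event folded into this state


def apply_event(state: BookState, event: Mapping) -> BookState:
    # Hypothetical convention: cash-affecting events carry a signed cash_delta.
    delta = float(event["payload"].get("cash_delta", 0.0))
    return replace(state, cash=state.cash + delta, last_seq=event["seq"])


def fold(
    events: Iterable[Mapping],
    initial_capital: float,
    seed: Optional[BookState] = None,
) -> BookState:
    """Reduce an ordered event log to a state, optionally resuming from a snapshot."""
    state = seed if seed is not None else BookState(cash=initial_capital)
    for event in events:
        if event["seq"] <= state.last_seq:
            continue  # already reflected in the seed snapshot
        state = apply_event(state, event)
    return state


events = [
    {"seq": 1, "kind": "fill", "payload": {"cash_delta": -0.5}},
    {"seq": 2, "kind": "funding", "payload": {"cash_delta": 0.25}},
    {"seq": 3, "kind": "order_submit", "payload": {}},
    {"seq": 4, "kind": "borrow", "payload": {"cash_delta": -0.125}},
]
full = fold(events, 1000.0)
snapshot = fold(events[:2], 1000.0)             # what compaction would persist
resumed = fold(events, 1000.0, seed=snapshot)   # fast-forward from the snapshot
```

Because the snapshot is itself the result of folding a prefix, resuming from it applies the same operations in the same order as a full replay, which is the property the parity tests pin down.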

Load-bearing parity tests:

- tests/test_event_log_engine_hooks.py::TestEndToEndReplayParity —
  replay over the live-emitted log reproduces BacktestContext.account.cash
  byte-for-byte.
- tests/test_replay_e2e_backtest.py — same parity over a real
  MACrossoverStrategy run with auto-compaction enabled.
- tests/test_auto_compaction.py::TestSnapshotPreservesReplayCorrectness —
  snapshot fast-forward replay never produces a divergent state.

Test sweep

2070 passed · 7 skipped · 0 failed

(Skipped suites are missing optional deps — ccxt, eth_account,
solders — none are code regressions.)

UI: 133 vitest · vite build clean. Rust: 30 cargo tests.

Files reorganized

New modules under flint/:

execution/{position_manager,cash_manager,fill_recorder,order_queue,funding_ledger,borrow_ledger,market_data_feed}.py
services/{strategies,backtest,journal,data,paper}.py
risk/portfolio_orchestrator.py
portfolio/{shared_engine,event_log,replay,snapshots}.py
api/routes/replay.py
3 new MCP tools

New UI pages + hooks:

ui/src/pages/Replay.tsx
ui/src/hooks/{useWebSocket,useReplay}.ts

New Rust modules (PyO3-exposed):

rust/src/engine/{tx_costs,orderbook_fill,funding_ledger,borrow_ledger}.rs
PyO3 classes: flint_core.TxCostModel, flint_core.OrderbookFiller,
flint_core.FundingLedger, flint_core.BorrowLedger
supports_tx_costs, supports_orderbook_walk,
supports_maker_taker_fees capability flags flipped to true

Migration notes for users

- pip install -U flint-trading (1.4.0)
- No breaking API changes: every old method still works through the
  legacy property aliases. New code should read state via
  ctx._pm.values(), ctx._cm.cash, ctx.account etc. directly.
- New event_log_writer + event_session_id ctor kwargs on
  BacktestContext are opt-in; without them, behavior is identical
  to 1.3.1.
- New portfolio_risk ctor kwarg routes book-level checks through
  the new PortfolioMarginEngine.

Follow-on work

D-2.1.c (live context merge) — needs testnet secrets
D-2.1.d (paper context split) — needs deliberate API design pass
D-6.5-api (live deploy two-step) — needs testnet secrets
D-6.6-proof (funding-arb proof notebook) — needs D-6.5-api
D-6.7-jito (real Jito bundle integration) — needs D-6.5-api

WAVE_STATUS.md tracks per-item state; ROADMAP.md tracks phase-level.

Full multi-phase restructure plan. 6 phase specs under docs/specs/ with exit criteria + dependency graph + task breakdowns, rooted in the 2026-04-23 audit findings.

- IMPLEMENTATION_PLAN.md — master plan, sequencing, quick wins, rules of engagement
- ROADMAP.md — rewritten as short index pointing to plan + phase specs
- TRUST_ARTIFACTS.md — live status board for Phase 1 items
- DEFERRED.md — sibling-PR tracker with owners, prerequisites, effort estimates
- docs/specs/phase-1-trust-correctness.md
- docs/specs/phase-2-structural-cleanup.md
- docs/specs/phase-3-depth-on-wedge.md
- docs/specs/phase-4-product-polish.md
- docs/specs/phase-5-ci-testing.md
- docs/specs/phase-6-portfolio-cross-venue.md

Wedge: "best local backtester + paper-trading lab for Drift + Hyperliquid perp strategies." Phase ordering is load-bearing: trust → structure → depth → portfolio → live.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…pts)

Full docs reorganization following https://diataxis.fr/:

- docs/tutorials/ — linear walkthroughs (01-06)
- docs/how-to/ — task-oriented recipes
- docs/reference/ — exhaustive catalogs (REST API, CLI, SDK, metrics, etc)
- docs/concepts/ — explanation-oriented (architecture, fill pipeline, risk model, regimes, margin/capital, backtests-vs-reality)
- docs/README.md — unified doc index

Existing guides trimmed + cross-linked to new structure. MEV content moved out of product surface into docs/mev-*.md as research artifacts.

- scripts/build_docs.py extended to scan new sections + regenerate UI docs content
- ui/src/data/docs-content.ts regenerated from updated markdown sources

Rename: "4-tier fill pipeline" → "3-stage pipeline with 4 impact models" (latency → impact → partial; impact stage has 4 models). Misnomer flagged in the 2026-04-23 audit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Fixes every Rust/Python parity bug flagged in the 2026-04-23 audit + ships scaffolding for proof artifacts firms need before buying features.

Engine + parity (T1.1):
- Force-close equity: append terminal point instead of overwriting last-bar mark-to-market (both Python backtest/engine.py AND Rust runner.rs)
- Cross-market cursor: strict `<` instead of `<=` so cross-market history never includes same-ts bars (simultaneity assumption now explicit)
- LatencyStage RNG: deterministic default seed (was system-time); seed=-1 opts into unseeded
- MonteCarlo seed: parameter threaded through run_monte_carlo(seed=)
- Rust VenueFiller RNG: seed threaded from Python (was hardcoded 42), per-venue offset so per-venue RNGs are independent but reproducible
- BacktestEngine(seed=): `_resolve_seed` derives deterministic-but-strategy-local seed when None

tx_cost wiring (T1.1.b):
- FillPipeline w/ tx_cost_model gates Rust off → Python path populates BacktestResult.total_tx_costs correctly
- Fixes test_tx_cost_deducted regression (ROADMAP §1.6 blocker)

Note: engine.py diff spans multiple phases (T1.1 base + T3.2 rust_required + T3.2 fallback_reason tagging + T4.5 cancel_check). Kept in one commit because non-interactive staging can't slice a file across phases.

PIT audit (T1.3):
- scripts/audit_pit.py scans flint/providers/* for PIT_METADATA
- 3 flagship providers declared (drift_candles, hyperliquid_candles, funding_rates); remaining 22 tracked as D-1.3-providers
- artifacts/pit/initial-scan.md: first report

Determinism (T1.1-parity / T1.3.c):
- tests/test_rust_python_parity.py (5 tests)
- tests/test_determinism.py (5 tests)

Custom data (T1.6):
- flint/providers/custom.py: CustomCSVProvider + CustomParquetProvider
- SHA-256 source_hash provenance; strict OHLCV / monotonic / resolution validation; custom:* namespace enforcement
- docs/reference/custom-data-schema.md canonical schema
- tests/test_custom_provider.py (12 tests)

Parity report pipeline (T1.2):
- scripts/run_parity_report.py catalogs 6 strategies, emits markdown artifact with 5-metric gate (PnL divergence, fill MAE, timing match, trade count, equity correlation), exits non-zero on breach

Reconciliation tooling (T1.4):
- scripts/reconcile_fills.py matches engine fills vs venue CSV export
- Nearest-ts within window on (market, venue, side, size)
- Stats: price-bps p50/p95/p99 + ts-delta p50/p95 + orphan rate
- tests/test_reconcile_fills.py (14 tests)

Proof notebooks (T1.5):
- notebooks/{funding_arb,basis_trade,momentum_breakout}.py (jupytext)
- Each pins candle sha256, runs backtest + parity, emits artifact, CI-gated exit code
- notebooks/README.md

CLI cosmetics (T1.1.f):
- cli.py: "Candles" label → "Equity Points" (accurate after force-close changes the curve length relative to candle count)

Test asserts updated for force-close terminal point: test_backtest.py, test_multi_market.py, test_example_strategies_v2.py, test_pyth_backtest_integration.py, test_latency.py (default-seed determinism replaces stale unseeded-varies test).

TRUST_ARTIFACTS.md updated: 5/7 🟢 shipped, 2/7 🟡 partial (CI gate for parity → Phase 5.3; reconciliation API → Phase 4).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…onfig

Closes out CLAUDE.md rule violations flagged in audit + ships user-strategy isolation + unified config entry point. Invasive ExecutionContext breakup (D-2.1.b/c/d) deferred to sibling PRs — contract locked in by tests/test_context_portability.py.

Store encapsulation (T2.2):
- 10 new FlintStore methods — list_live_sessions, list_markets_with_data, list_venues_for_market, delete_market_data, count_funding_rates, count_orderbook_snapshots, count_open_interest, count_venue_candles, mark_running_live_sessions_interrupted (+ existing get_live_equity_history)
- Every method wraps execute in `with self._lock:` (CLAUDE.md rule)
- api/routes/live.py, api/routes/data.py, api/main.py lifespan, mcp_server.py migrated — grep 'store\._conn\|store\._lock' in API surface now returns zero hits
- tests/test_store_encapsulation.py lint enforces the rule going forward
- tests/test_mcp_server.py mock updated for new store API

User-strategy sandbox (T2.4):
- flint/strategy/sandbox.py — multiprocessing.Process with `spawn` start method; RLIMIT_AS (best-effort — Linux enforced, macOS advisory); configurable wall-clock timeout with SIGTERM then SIGKILL escalation
- Typed exceptions: StrategyTimeoutError, StrategyMemoryError, StrategyExecutionError
- tests/test_sandbox_escape.py — 12 hostile payloads (os.system, subprocess, open, eval, exec, __import__, socket, nested import, .unlink, infinite loop, memory bomb, no-Strategy-class)
- Route wiring into /api/v1/backtest/run for user-uploaded strategies deferred to D-2.4.b (Phase 4 UI work)

Unified config (T2.3):
- flint/backtest/config.py — BacktestConfig dataclass + nested FillConfig / MarginConfig / AllocatorConfig / VenueConfig
- Stable to_json_str + sha256 checksum() — load-bearing for proof notebook provenance
- from_dict / from_yaml / from_legacy_kwargs (one-release deprecation window for existing kwargs)
- BacktestEngine.from_config(cfg, strategy, **overrides) classmethod

ExecutionContext conformance (T2.1.a/e):
- tests/test_context_portability.py walks the subclass tree, enforces every concrete ExecutionContext implements every abstract method, checks signature alignment on market_order
- God-class breakup (T2.1.b: BacktestContext → PositionManager / OrderQueue / FundingTracker / BorrowTracker / MarketDataFeed), live-context merge (T2.1.c), PaperContext separation (T2.1.d) = sibling PRs D-2.1.b/c/d

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
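
The "stable to_json_str + sha256 checksum()" item in the unified-config commit above can be illustrated with a short sketch. The method names follow the commit message; the class name `BacktestConfigSketch`, its fields, and the `FillConfig` defaults are hypothetical, and the real BacktestConfig has more nested sections.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class FillConfig:  # hypothetical fields for illustration
    impact_model: str = "sqrt"
    latency_ms: int = 0


@dataclass(frozen=True)
class BacktestConfigSketch:
    market: str
    initial_capital: float
    fill: FillConfig = field(default_factory=FillConfig)

    def to_json_str(self) -> str:
        # sort_keys + fixed separators make the output independent of
        # field declaration order and of whitespace defaults.
        return json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))

    def checksum(self) -> str:
        return hashlib.sha256(self.to_json_str().encode("utf-8")).hexdigest()


a = BacktestConfigSketch(market="SOL-PERP", initial_capital=10_000.0)
b = BacktestConfigSketch(market="SOL-PERP", initial_capital=10_000.0)
c = BacktestConfigSketch(
    market="SOL-PERP", initial_capital=10_000.0, fill=FillConfig(latency_ms=50)
)
```

Two configs with equal contents hash identically, so a notebook can pin the checksum and fail loudly if any nested setting drifts.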

…tion

Executes docs/specs/execution-upgrade-v0.3.md Python-side. Rust ports of orderbook fills + partial+latency stage + multi-venue orchestrator are sibling PRs (D-3.1-rust / D-3.4-rust / D-3.5-orchestrator) — tracked in DEFERRED.md with prereqs.

Rust capability matrix (T3.2):
- Rust `capabilities()` PyO3 function exposes fill_models, fee_models, supports_{partial_fills,latency_stage,tx_costs,orderbook_walk,cross_market,multi_venue_margin,borrow_snapshots,maker_taker_fees}, engine + engine_version
- flint/backtest/rust_capabilities.py — Python discovery helper with @lru_cache, safe stub when flint_core missing, python_capabilities() superset
- BacktestResult.engine_used + fallback_reason fields (flint/models.py)
- BacktestEngine(rust_required=True) + RustRequiredError — hard errors instead of silent fallback
- Named fallback_reasons list — every Rust-gate says exactly why
- tests/test_rust_capabilities.py (8 tests)

Maker/taker fee trait (T3.3):
- Rust FeeModel::{MakerTaker, Drift, Hyperliquid} variants + compute_fee_with_role(is_maker) — Drift 10bps taker / -2bps maker, HL 3.5bps taker / 1bp maker. 5 cargo tests in rust/src/engine/fees.rs
- Python MakerTakerFeeModel + HyperliquidFeeModel (flint/execution/fee_models.py)
- tests/test_fee_model_parity.py (7 tests, 1e-9 tolerance)
- Maker-vs-taker detection at fill time in Rust pipeline = D-3.3-maker-detection (needs D-3.4-rust fill pipeline first)

Orderbook-walk fills Python hardening (T3.1):
- OrderbookFillModel: reject_on_insufficient_depth=True default rejects size > aggregate book depth (was silent underfill); per-fill impact_bps attribution = (vwap − mid) / mid × 10_000, signed positive = fill worse than mid for taker
- _book_mid helper; fallback preserved when no book or market mismatch
- tests/test_orderbook_fill.py (9 tests)
- Rust port = D-3.1-rust

Slippage calibration reports (T3.6):
- scripts/calibrate.py — fits power-law + sqrt impact via 5-fold CV; picks best model by CV R²; drift detection vs stored impact_coefficient (15% threshold); emits artifacts/calibration/{market}-{venue}-{date}.md
- --write-yaml round-trips coefficient into flint.yaml, gated by --force when drift exceeds threshold
- tests/test_calibrate_script.py (11 tests)

Multi-venue margin primitives (T3.5):
- tests/test_multi_venue_margin_integration.py (6 tests) validates existing VenueAllocator cross-venue transfer latency, multi-transfer-in-flight, PnL attribution, MarginEngine per-venue MMR, fragmentation metrics
- Unified PortfolioMarginEngine facade in BacktestContext = D-3.5-orchestrator (blocked on D-2.1.b god-class breakup)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
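
The orderbook-walk item in the commit above (reject when size exceeds book depth, impact_bps = (vwap − mid) / mid × 10_000) can be worked through with a small buy-side sketch. `walk_asks` and `InsufficientDepth` are hypothetical names; only the `reject_on_insufficient_depth` flag and the impact formula come from the commit message. Sell-side fills would walk bids and are left out here.

```python
from typing import Sequence, Tuple


class InsufficientDepth(Exception):
    """Raised when the order is larger than the visible book."""


def walk_asks(
    asks: Sequence[Tuple[float, float]],  # (price, qty), best price first
    size: float,
    mid: float,
    reject_on_insufficient_depth: bool = True,
) -> Tuple[float, float, float]:
    """Fill a taker buy against the asks; return (filled, vwap, impact_bps)."""
    if size <= 0 or mid <= 0:
        raise ValueError("size and mid must be positive")
    remaining = size
    cost = 0.0
    for price, qty in asks:
        take = min(remaining, qty)
        cost += take * price
        remaining -= take
        if remaining <= 0:
            break
    if remaining > 0 and reject_on_insufficient_depth:
        raise InsufficientDepth(f"book is short by {remaining}")
    filled = size - remaining
    if filled == 0:
        raise InsufficientDepth("empty book")
    vwap = cost / filled
    # Positive impact means the buy filled above mid, i.e. worse for the taker.
    impact_bps = (vwap - mid) / mid * 10_000
    return filled, vwap, impact_bps


book = [(101.0, 1.0), (102.0, 1.0)]
filled, vwap, impact_bps = walk_asks(book, size=2.0, mid=100.0)
```

With this book, buying 2 units costs 101 + 102 = 203, so the VWAP is 101.5 and the impact is 1.5 / 100 × 10_000 = 150 bps. Asking for 3 units raises instead of silently underfilling.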

Phase 4 — honest positioning, cancel wiring, UI error surface, capability route. Full WebSocket + MCP in-process = sibling PRs.

- /api/v1/capabilities + /api/v1/system/capabilities alias (T4.6) — reports version, api_version, engine (rust_available + rust_capabilities dict from Phase 3), features bool flags (UI hides surface when false), limits (max_concurrent_backtests, backtest_timeout_s). 5 tests.
- Backtest cancellation engine-side (T4.5) — BacktestCancelled exception; BacktestEngine(cancel_check=Callable[[], bool]) polled every 100 bars alongside timeout check; Rust gate added to fallback_reasons; worker thread in /api/v1/backtest/run passes closure over _entries[run_id].status == "cancelled"; BacktestCancelled except branch frees slot. 4 tests. UI wiring = D-4.5-ui sibling PR.
- Lazy-load Monaco (T4.4) — React.lazy wraps @monaco-editor/react with styled Suspense fallback; Dashboard + non-editor pages don't pay ~1MB bundle cost.
- ConnectionBanner (T4.2) — polls /api/v1/health every 10s, degraded after 1 failure, offline after 3, Retry button. Mounted in App.tsx. Root status-probe silent catch replaced with console.warn. Remaining 9 silent-catch sites + per-hook backoff = D-4.2-backoff.
- README auto-counts (T4.1) — scripts/update_readme_counts.py injects live counts between  markers; --check for CI drift gate. "4-tier fill pipeline" renamed to "3-stage pipeline with 4 impact models" across README.
- Full editorial wedge rewrite (split comparison tables, refocus hero, notebooks-not-examples in "Try It") = D-4.1-wedge.
- MCP in-process service layer = D-4.7-mcp-inprocess.
- WebSocket paper/live streams = D-4.3-websocket.

Phase 5 — matrix CI, Rust job, parity + smoke workflows.

.github/workflows/ci.yml — full rewrite:
- test matrix: ubuntu × macos × py3.10/3.11/3.12 (6 jobs)
- rust matrix: ubuntu + macos; maturin develop + cargo test --release + Rust/Python parity tests (parity, engine, caps, determinism, fees)
- sandbox matrix: ubuntu + macos; tests/test_sandbox_escape.py with 120s timeout (memory-bomb test skips on macOS per platform limits)
- counts job: scripts/update_readme_counts.py --check gate
- lint job: ruff check + format (soft-fail pass one, tightens under D-5.1-ruff-fixes) + import sanity
- ui-build preserved
- Dropped `|| pip install -e .` silent fallback; pytest-timeout=60 hard cap
- ~13 parallel jobs per push (was 3)

.github/workflows/parity.yml (T5.3) — workflow_dispatch + weekly cron Mon 06:00 UTC. Runs scripts/run_parity_report.py, uploads markdown artifact, fails on threshold breach. Accepts strategy/market/lookback_days inputs.

.github/workflows/live-smoke.yml (T5.6) — workflow_dispatch only (never on push). venue chooser {drift, hyperliquid, both}. Runs tests/integration/test_live_smoke.py gated on FLINT_LIVE_SMOKE=1 + DRIFT_DEVNET_KEYPAIR / HL_TESTNET_KEY secrets. Guard rails in test: max 0.01 SOL notional, refuse wallet balance > $20, auto-cancel in 1s, testnet/devnet only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
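
The cooperative-cancellation item in the commit above (a `cancel_check` callable polled every 100 bars) follows a common pattern, sketched here. `run_bars` and its `poll_every` parameter are hypothetical; only the `BacktestCancelled` name, the `cancel_check` callable, and the 100-bar cadence come from the commit message.

```python
from typing import Callable, Iterable, Optional


class BacktestCancelled(Exception):
    """Raised inside the bar loop when the caller asked to stop."""


def run_bars(
    bars: Iterable,
    on_bar: Callable[[object], None],
    cancel_check: Optional[Callable[[], bool]] = None,
    poll_every: int = 100,
) -> None:
    for i, bar in enumerate(bars):
        # Poll only every N bars so the check stays cheap on the hot path.
        if cancel_check is not None and i % poll_every == 0 and cancel_check():
            raise BacktestCancelled(f"cancelled at bar {i}")
        on_bar(bar)


# Usage: the worker passes a closure over shared run status.
status = {"value": "running"}
processed = []


def on_bar(bar):
    processed.append(bar)
    if len(processed) == 150:          # simulate a cancel request mid-run
        status["value"] = "cancelled"


try:
    run_bars(range(1000), on_bar, cancel_check=lambda: status["value"] == "cancelled")
    cancelled = False
except BacktestCancelled:
    cancelled = True
```

The trade-off is latency: a cancel requested at bar 150 is only noticed at the next poll, bar 200, so the engine does up to `poll_every - 1` bars of extra work.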

…slocation

Python-side Phase 6 scaffolding. Shared-capital orchestration + live deploy API + Jito integration + replay = sibling PRs (D-6.1-unified, D-6.5-api, D-6.7-jito, D-6.4-replay) — gated on Phase 2-3 sibling prereqs.

PortfolioRiskEngine (T6.2):
- flint/risk/portfolio.py — RiskLimits dataclass (gross / net exposure, per-venue / per-market concentration, correlation-cluster cap, drawdown kill-switch, 95%/99% historical-sim VaR)
- OrderLite / PositionLite transport types (tests don't need full BacktestContext plumbing)
- check_order → RiskCheckResult(approved, reason) with first-failing check named
- check_kill_switch(equity) tracks peak + triggers on drawdown
- Correlation clusters via union-find over user-supplied corr matrix
- tests/test_portfolio_risk.py (13 tests)

Correlation-aware Optuna objective (T6.3):
- flint/optimization/portfolio_objective.py — pairwise_correlations(equity_curves) returns mean pairwise correlation of per-strategy per-bar returns (handles zero-variance curves)
- portfolio_objective(trial, strategies_fn, runner, penalty_lambda) — maximizes portfolio_sharpe − penalty_lambda × max(0, corr − floor)
- score(...) standalone helper for post-hoc portfolio ranking
- tests/test_portfolio_objective.py (9 tests)

Funding Dislocation Arb reference (T6.6):
- flint/strategy/funding_dislocation_arb.py — evolution of FundingArbStrategy with z-score entry filter (spread ≥ N stddevs above trailing mean), Kelly-lite position sizing, mean-reversion exit
- parameters() surface ready for Optuna
- Proof notebook + mainnet checklist = D-6.6-proof (blocked on D-1.4-api reconciliation UI + D-6.5-api live deploy)
- tests/test_funding_dislocation_arb.py (5 tests)

Multi-strategy portfolio scaffold (T6.1):
- tests/test_portfolio_engine_multi_strategy.py (5 tests) validates existing flint.portfolio.engine.PortfolioEngine end-to-end
- Shared-capital pool (one strategy's loss drains another's budget) + pre-trade PortfolioRiskEngine gate on every order = D-6.1-unified (blocked on D-2.1.b BacktestContext breakup)

Final sweep: 1854 passed / 7 skipped / 0 failures across all phases.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
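
The "correlation clusters via union-find" item in the commit above can be sketched as follows. This is a generic union-find grouping, not the code in flint/risk/portfolio.py: the function name, the pair-keyed correlation dict, and the rule "join two markets when their correlation is at or above a threshold" are all assumptions.

```python
from typing import Dict, List, Mapping, Sequence, Set, Tuple


def correlation_clusters(
    names: Sequence[str],
    corr: Mapping[Tuple[str, str], float],
    threshold: float,
) -> List[Set[str]]:
    """Group names whose pairwise correlation meets the threshold (transitively)."""
    parent: Dict[str, str] = {n: n for n in names}

    def find(x: str) -> str:
        # Path halving keeps the trees shallow.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), rho in corr.items():
        if rho >= threshold:
            parent[find(a)] = find(b)  # union the two clusters

    clusters: Dict[str, Set[str]] = {}
    for n in names:
        clusters.setdefault(find(n), set()).add(n)
    return sorted(clusters.values(), key=min)


groups = correlation_clusters(
    ["BTC", "ETH", "SOL", "DOGE"],
    {("BTC", "ETH"): 0.9, ("ETH", "SOL"): 0.85, ("SOL", "DOGE"): 0.1},
    threshold=0.8,
)
```

Here BTC, ETH, and SOL end up in one cluster even though BTC and SOL were never compared directly, which is the transitive behavior a cluster-level exposure cap needs.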

Original sprint plan for the Phase 3 "depth on wedge" work. Tracks the four features transforming the engine from single-venue candle-based simulation to multi-venue orderbook-aware margin-tracked execution. Phase 3 commit (c11acad) implements the Python-side of items 1, 3 (primitives), and the calibration support surface. Rust-side items and unified orchestration are tracked in DEFERRED.md (D-3.1-rust, D-3.4-rust, D-3.5-orchestrator). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Full end-to-end smoke test of branch restructure surfaced 4 bugs; all four fixed + captured by regression tests in tests/test_smoke_regressions.py. See BUG_REPORT_2026-04-24.md for full writeup with symptoms + root causes + red evidence.

BUG-1 — /api/v1/live/sessions DuckDB Binder Error
flint/store.py:list_live_sessions selected non-existent column 'strategy' — actual column is 'strategy_name'. Renamed in SELECT list; Python dict key unchanged. Introduced by Phase 2 T2.2 (commit 46c7a15).

BUG-2 — engine_used + fallback_reason dropped from API response
flint/api/routes/backtest.py:~687 built the response from tearsheet.to_dict() without the telemetry fields added in Phase 3 T3.2. Added two explicit lines after data_quality block.

BUG-3 — Four-way version mismatch (pyproject 1.3.1 / API 0.1.0 / UI 0.3.0)
flint/api/routes/system.py:_get_version queried distribution 'flint' (wrong name — pyproject says 'flint-trading') and found stale phantom egg-info at 0.1.0/0.2.0. New preference order: pyproject.toml → flint-trading → flint → 0.0.0 fallback. ui/src/App.tsx footer was hardcoded 'FLINT v0.3.0'; now fetches from /api/v1/capabilities on mount and falls back to '?.?.?' on probe fail.

BUG-4 — Monaco editor pulled from cdn.jsdelivr.net
Violated README + home-page "local-first, nothing leaves your machine" promise. @monaco-editor/react default loader fetched 14 files from jsdelivr on every /backtest page load. ui/src/components/CodeEditor.tsx now imports monaco-editor + calls loader.config({ monaco }) so Vite bundles Monaco into the local JS output. Verified via Playwright: cdnCount dropped from ~14 to 0.

Regression tests (tests/test_smoke_regressions.py, 4 classes):
- TestLiveSessionsColumnNameFix
- TestEngineUsedTelemetryInAPIResponse
- TestVersionConsistency
- TestMonacoLoadsLocallyNotFromCDN

Each class docstring carries pre-fix symptom + root cause + fix pointer. Red confirmed by reverting each fix individually (for BUG-1/2/3) and seeing the documented error surface; BUG-4 verified via live browser network trace.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Closes the fast-lane DEFERRED.md items. Scope matrix + remaining items in DEFERRED.md (rewritten with closed/open split).

Phase 1 tail — all shipped:
- D-1.1.b: Rust close_all() takes FeeModel, charges exit fill (was fee=0.0 — Rust/Python force-close divergence closed)
- D-1.2-CI: parity.yml already shipped in Phase 5.3 commit, verified
- D-1.3-providers: 22 providers declared PIT_METADATA via batch script; scripts/audit_pit.py reports 26/26 ✓
- D-1.4-api: GET /api/v1/paper/{session_id}/reconciliation returns engine-side fill summary (UI panel + POST variant = D-1.4-ui sibling)
- D-1.5-data-pins: docs/how-to/pin-notebook-fixtures.md explains lightweight (hash-only) vs heavy (parquet-commit) fixture workflows
- D-1.6-byo-fills: CustomCSVProvider(table="fills") parses user fill logs with shared schema; CustomDataImport.fills returns records for reconcile + calibrate paths

Phase 2 tail — partial:
- D-2.2-internal: flint/journal/storage.py + flint/paper/session_store.py migrated off raw _store._conn / _store._lock via new FlintStore._sql_{exec,read_all,read_one,...} wrappers; grep now finds only doc comments
- D-2.4.b: /api/v1/backtest/run routes user code (req.code / user:*) through flint.strategy.sandbox.run_strategy_in_sandbox in sandbox-compatible configs; multi-market/margin/orderbook configs fall back to in-process with log line

Phase 4 tail — all shipped:
- D-4.1-wedge: README hero rewritten; comparison table split into "vs DeFi-native tools" + "vs general crypto bots"; examples/ → notebooks/ in Try It; CEX live honestly "Planned"
- D-4.2-backoff: 18 .catch(() => {}) sites across 10 UI files replaced with structured console.warn (full exponential backoff still D-4.2-backoff-full sibling)
- D-4.5-ui: useBacktest exposes cancel() + auto-POSTs /cancel on unmount when status === 'running'; BacktestLab shows CANCEL button while running
- D-4.7-mcp-inprocess (MVP): MCP HTTP base URL configurable via FLINT_API_URL env var (was hardcoded 127.0.0.1:8000 in 7 sites). Full service-layer extraction = D-4.7-full

Still deferred (16 items, full scoping in DEFERRED.md):
- D-1.4-ui, D-2.1.{b,c,d}, D-3.{1,3,4,5}-rust, D-4.{2,3}-full, D-4.7-full, D-5.1-ruff, D-6.{1,4,5,6,7}
- All blocked on dedicated multi-day work (god-class breakup, Rust ports, live-deploy API, event sourcing, Jito bundle integration)

New files:
- docs/how-to/pin-notebook-fixtures.md

Modified:
- 22 providers (PIT_METADATA added)
- flint/store.py (+5 _sql_* wrapper methods)
- flint/journal/storage.py + flint/paper/session_store.py (migrated)
- flint/api/routes/{backtest,paper}.py (+reconciliation route, sandbox gate)
- flint/providers/custom.py (+fills table parser)
- flint/mcp_server.py (FLINT_API_URL env override)
- rust/src/engine/positions.rs + runner.rs (FeeModel on close_all)
- ui/src/hooks/useBacktest.ts + pages/BacktestLab.tsx (cancel UI)
- ui/src/{hooks,pages,components}/* (10 files — silent-catch cleanup)
- README.md (wedge rewrite)
- DEFERRED.md (rewritten with closed/open split)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…nals D-1.1.b made Rust close_all charge exit fees, which combined with Phase 1 T1.1.f's force-close-appends-terminal behavior produces per-strategy equity curves of length N+1 (N candles + one terminal after force-close). portfolio/engine.py used to iterate `for i in range(n_candles)`, silently dropping the terminal. Result: sum(per_strategy[-1]) == sum of terminals but combined[-1] == sum at bar N (pre-close). Test test_per_strategy_pnl_sums_align caught the divergence. Fix: use max curve length across strategies; extend shorter curves with their final value so the combined time series doesn't collapse tail entries to zero. Asserts through the 5-test portfolio suite. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
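
The fix described in the commit above (use the longest curve and carry each shorter curve's final value forward) can be sketched like this. `combine_equity_curves` is a hypothetical helper, not the code in portfolio/engine.py, and it assumes every curve has at least one point.

```python
from typing import Dict, List, Sequence


def combine_equity_curves(curves: Dict[str, Sequence[float]]) -> List[float]:
    """Sum per-strategy equity curves of unequal length, point by point."""
    n = max(len(c) for c in curves.values())
    padded = [
        # Carry the last value forward so a finished strategy keeps
        # contributing its terminal equity instead of dropping to zero.
        list(c) + [c[-1]] * (n - len(c))
        for c in curves.values()
    ]
    return [sum(points) for points in zip(*padded)]


per_strategy = {
    "momentum": [100.0, 110.0, 120.0],  # N bars + terminal force-close point
    "funding_arb": [100.0, 90.0],       # shorter curve
}
combined = combine_equity_curves(per_strategy)
```

With padding, the last combined point equals the sum of the per-strategy terminals (120 + 90 = 210), which is the invariant test_per_strategy_pnl_sums_align checks.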

docs/specs/deferred-execution-plan.md — full delivery roadmap with dependency graph, 5 waves, per-item action checklist, single-engineer vs 3-engineer parallel sequences, risk register.

Critical path: D-2.1.b (god-class breakup) → D-2.1.c (live-context merge) → D-6.5-api (live deploy) → D-6.7-jito (Jito bundles). Calendar: ~17 weeks single engineer, ~12 weeks 3 engineers.

Wave breakdown:
- Wave 1 (weeks 1-3, 6 items): D-2.1.b + D-3.4-rust unlock dependencies; D-1.4-ui + D-4.2-backoff-full + D-4.7-full + D-5.1-ruff in parallel
- Wave 2 (weeks 4-5, 5 items): D-2.1.d, D-3.5-orchestrator, D-2.1.c, D-3.1-rust, D-3.3-maker-detection
- Wave 3 (weeks 6-9, 3 items): D-6.5-api (XL), D-4.3-websocket, D-6.6-proof
- Wave 4 (weeks 10-14, 2 items): D-6.7-jito + D-6.1-unified
- Wave 5 (weeks 14-17, 1 item): D-6.4-replay

Cross-linked from IMPLEMENTATION_PLAN.md and DEFERRED.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

D-5.1-ruff (Wave 1): flip CI from soft `|| echo` to hard-fail on the curated rule set. pyproject.toml selects only F401/F811/F821/F841 (real bugs: unused imports, redefined names, undefined names, unused vars) and ignores the style noise (E402 from PIT_METADATA pattern, E501 long lines, E701/E702 colon/semicolon multi-statements, E741 single-letter names) until an editorial sweep happens.

ruff --fix landed 315 mechanical cleanups across 100+ files. Remaining 26 manual fixes:
- flint/store.py: TYPE_CHECKING block for lazy-imported model names
- flint/{analytics,providers,strategy}: drop dead unused-var assignments
- scripts/{backtest_funding_arb,populate_db}: drop unused vars
- tests/test_*: drop debug result/missing/has_liq_warning assignments

Targeted regression sweep on 8 modified-file test modules: 82/82 pass. WAVE_STATUS.md updated, D-5.1-ruff marked 🟢 shipped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

D-4.7-full (Wave 1): factor the work-doing code out of FastAPI route handlers and the MCP server into a callable service layer that doesn't care whether a request came from HTTP, an MCP tool, or a script.

New modules under flint/services/:
- strategies — single source of truth for the 20-template builder map (was duplicated across backtest.py route and paper.py route)
- backtest — run_backtest_sync(req, store) → tearsheet dict; the synchronous one-shot path for MCP and notebooks. Out of scope: progress callbacks, sandbox routing, multi-market aggregate, MC gating (those still go through the HTTP route).
- journal — list/get/delete/compare wrappers over JournalStorage
- data — OHLCV / funding / borrow / market metadata reads
- paper — read-side helpers around PaperTradingEngine + a store-only fallback (`list_sessions_from_store`) for MCP queries when no daemon is running

MCP refactor (flint/mcp_server.py):
- run_backtest tool now calls run_backtest_sync directly — no HTTP
- list_journal_runs / compare_runs use the journal service
- get_paper_sessions falls back to the store fallback when HTTP fails
- Paper start/stop still legitimately go over HTTP (need the asyncio daemon owning the live session loop)

Routes thinned:
- flint/api/routes/journal.py — pure adapter, ~25 lines
- flint/api/routes/paper.py — _BUILDERS dict deleted (now in service); _build_strategy delegates to services.strategies.build_strategy
- flint/api/routes/backtest.py — _build_strategy delegates to service; drops the 100-line builder map and 20 strategy-class top-level imports

Tests:
- tests/test_mcp_standalone.py — 12 tests covering each service module, plus three acceptance checks that mcp_server.py imports the service layer for the core tools (the spec's "no HTTP for backtest / journal / data" requirement)
- 105/105 regression tests green across journal/paper/data/walk-forward paths

WAVE_STATUS.md: D-5.1-ruff and D-4.7-full both 🟢. Next: D-2.1.b step 1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Step 1 of 7 in breaking up BacktestContext (current: 971 lines, owning positions, cash, fills, fees, funding, borrow, orderbook, OI, margin, allocator, capital transfers — the textbook god class). This commit pulls position-state ownership out into `flint/execution/position_manager.py:PositionManager` so the next slices (CashManager, FillRecorder, Risk surface, Funding/Borrow ledgers) have a stable boundary to compose against.

Approach: BacktestContext keeps `self._pm = PositionManager()` and exposes `self._positions` / `self._closed_positions` as `@property` aliases that return the manager's underlying mutable dict/list. That lets the 100+-line `_apply_fill` body and the rest of the call sites (apply_funding, check_liquidations, close_all_positions) keep working unchanged — they're still doing `self._positions[key] = pos` and `del self._positions[key]`. Steps 2–7 migrate those call sites to explicit `self._pm.set/delete/record_close` calls so the aliases can go away.

PositionManager surface (kept narrow on purpose):
- get / set / delete on (venue, market) keys
- dict-like __contains__ / __iter__ / __len__ / keys / values / items
- update_pnl_for_market(market, price) — bulk PnL refresh
- record_close(record), closed property — closed-trade ledger
- positions property — direct dict access for the margin engine and the future Rust-side adapter (commented as "treat as private")

Tests:
- tests/test_position_manager.py — 8 unit tests covering single-position ops, dict-like surface, PnL update for one-vs-many markets, closed ledger immutability, BacktestContext integration (legacy alias routes through manager)
- 90/90 regression tests pass on tests touching BacktestContext (test_backtest_v2, test_context_portability, test_fill_pipeline, test_jupiter_backtest, test_multi_market, test_multi_venue_margin_integration, test_venue_fill_dispatch)

WAVE_STATUS.md: D-2.1.b → 🟡 (1/7 shipped). Steps 2–7 stay scoped as follow-up work in the deferred backlog. Wave 1 next: D-1.4-ui.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
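
The alias approach in the step 1 commit can be illustrated with a minimal Python sketch. The class bodies here are hypothetical, trimmed-down stand-ins (only the names `PositionManager`, `BacktestContext`, `_pm`, and `_positions` come from the commit message); the point is that a read-only property returning the manager's live dict is enough for in-place legacy writes to land in the manager.

```python
from typing import Dict, Optional, Tuple

Key = Tuple[str, str]  # (venue, market)


class PositionManager:
    """Hypothetical, trimmed-down stand-in for the real manager."""

    def __init__(self) -> None:
        self._positions: Dict[Key, dict] = {}

    @property
    def positions(self) -> Dict[Key, dict]:
        # Returns the live dict, not a copy, so in-place writes are shared.
        return self._positions

    def get(self, key: Key) -> Optional[dict]:
        return self._positions.get(key)


class BacktestContext:
    def __init__(self) -> None:
        self._pm = PositionManager()

    @property
    def _positions(self) -> Dict[Key, dict]:
        # Read-only alias: no setter is needed while callers mutate in place.
        return self._pm.positions


ctx = BacktestContext()
key = ("drift", "SOL-PERP")
ctx._positions[key] = {"size": 1.0}   # legacy call-site idiom
assert ctx._pm.get(key) == {"size": 1.0}
del ctx._positions[key]               # legacy delete idiom
assert ctx._pm.get(key) is None
```

A read-only alias stops working as soon as a caller rebinds the attribute (`self._positions = {...}`), which is presumably why later steps add setters where reassignment occurs.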

D-1.4-ui (Wave 1): POST /api/v1/paper/{session_id}/reconciliation accepts a multipart CSV of venue-reported fills, matches against the engine fills already persisted for that session, and returns the same counts + p50/p95/p99 bps/ts deltas the reconcile_fills CLI emits. Backend: - scripts/reconcile_fills.py — extract _parse_venue_fills_reader so both the path-based load_venue_fills_csv (CLI) and the new parse_venue_fills_csv_text (HTTP route) share validation. New CSVSchemaError lets the API return 400 instead of SystemExit-on-CLI vs 500-on-HTTP. - flint/api/routes/paper.py — POST handler caps uploads at 10 MB, rejects non-UTF-8 with 400, decodes the CSV, calls reconcile() over store.get_live_fills(session_id) + the venue fills. UI: - ui/src/pages/PaperTrading.tsx — hidden <input type="file"> + "RECONCILE FILLS" button next to the existing PARITY TEST button. Result panel mirrors the parity-report layout: counts grid (matched / engine-only / venue-only / venue-total) + bps/ts percentile grid, color-coded on the p95-bps > 10 threshold. - Schema/upload errors show a [RECONCILE] inline banner. - Same-file uploads work twice in a row (input.value cleared after). Tests: - tests/test_reconciliation_endpoint.py — 6 cases covering empty engine fills, schema-error → 400, oversized upload → 400, non-UTF-8 → 400, plus end-to-end reconcile() unit checks for ts-window match and out-of-window mismatch. - 20/20 paper-route regression tests still pass; ruff clean; UI vite build green. WAVE_STATUS.md: D-1.4-ui → 🟢. Wave 1 progress: 4/6 first-cuts shipped (D-5.1-ruff, D-4.7-full, D-2.1.b step 1, D-1.4-ui). Remaining: D-3.4-rust, D-4.2-backoff-full. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

D-4.2-backoff-full (Wave 1): replace ad-hoc setInterval/setTimeout loops in the polling hooks with one shared primitive that handles the boring boilerplate everyone gets wrong: - 1 s → 2 s → 5 s → 10 s → 30 s backoff schedule on consecutive errors - AbortController per request so an unmounting component never resolves a stale fetch and sets state on a dead hook - errorCount / lastError / nextRetryIn surfaced in the return so ConnectionBanner can show "retry in 5s" instead of pretending everything is fine Migrations in this commit: - useLiveMonitor: equity + fills polled together via Promise.all, friendly "is flint serve running?" string still applied - usePaperPortfolio: same friendly-error map, errorCount/nextRetryIn exposed for the dashboard banner - useSessionStatus: simplest case — drops directly to the new hook Skipped on purpose: - useBacktest, useOptimize — these are job-completion-driven (poll until status==complete then stop forever). They have their own poll cancellation against an explicit run id; useBackoffPoll's steady-state mental model is wrong for them and forcing a migration would regress the cancellation UX. Documented in WAVE_STATUS.md. Tests: - src/test/hooks/useBackoffPoll.test.ts — 5 cases covering happy path, enabled=false skip, errorCount escalation, recovery on later success, and abort-on-unmount. Real timers (RTL + fake timers don't compose cleanly). - 127/127 vitest pass; vite build clean. WAVE_STATUS.md: D-4.2-backoff-full → 🟢. Wave 1 progress: 5/6 first cuts shipped. Only D-3.4-rust (Rust port, requires cargo + PyO3 expertise) remains as a sibling-PR follow-up. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
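
The backoff schedule itself is language-agnostic, so here is a small Python sketch of the delay lookup (the real hook is TypeScript and is not reproduced here). The function name `next_retry_in_s` and the choice to stay at 30 s after the fifth consecutive error are assumptions for illustration; the commit only states the 1 s → 2 s → 5 s → 10 s → 30 s sequence.

```python
BACKOFF_SCHEDULE_S = (1, 2, 5, 10, 30)


def next_retry_in_s(error_count: int) -> int:
    """Seconds to wait after `error_count` consecutive errors (must be >= 1)."""
    if error_count < 1:
        raise ValueError("error_count must be >= 1")
    # Assumption: once the schedule is exhausted, keep using the last delay.
    idx = min(error_count, len(BACKOFF_SCHEDULE_S)) - 1
    return BACKOFF_SCHEDULE_S[idx]


delays = [next_retry_in_s(n) for n in range(1, 8)]
assert delays == [1, 2, 5, 10, 30, 30, 30]
```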

D-3.4-rust (Wave 1): the last Wave 1 item. Moves the three venue cost models (Solana/Drift, Hyperliquid, CEX) into Rust so the hot fill loop doesn't round-trip through Python for every trade's cost breakdown.

Rust side:
- rust/src/engine/tx_costs.rs — CostEstimate struct (mirrors the Python dataclass field-for-field) + TxCostModel enum with three variants. Factory `for_venue()` replicates `get_tx_cost_model()` including the "unknown venue → CEX" fallback and case-insensitive matching. Historical p50/p90 lamport fees collapse from the Python dict into two `Option<u64>` fields to avoid a HashMap on the hot path.
- rust/src/lib.rs — PyO3 class `TxCostModel` with four static ctors (`for_venue`, `solana`, `hyperliquid`, `cex`) and an `estimate(market, size, price, urgency) → dict` method returning the exact shape of `CostEstimate.to_dict()`.
- capabilities(): `supports_tx_costs` flipped from false to true.

Tests:
- 6 cargo tests in engine/tx_costs.rs covering default/urgent/fallback
- tests/test_rust_tx_cost_parity.py — 13 Python↔Rust parity tests pinned to 1e-9 tolerance across every field. Covers default cases, urgent-vs-normal p90 selection, custom fee bps, unknown-venue fallback, case-insensitive venue lookup, and two edge cases (zero size still charges network fee; $1B notional stays inside tolerance).
- Micro-benchmark: 200k iterations on the tight path → 137 ms Python vs 61 ms Rust (2.24×). FFI overhead dominates the single-call measurement — the real win is that the Rust engine's internal fill loop can call Rust-to-Rust without crossing the Python boundary at all, which unblocks the D-1.1.b/D-3.1 Rust fill-pipeline work.

47/47 Rust-suite tests green. ruff clean.

WAVE_STATUS.md: D-3.4-rust → 🟢. Wave 1 complete (6/6 first cuts shipped).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

D-3.3-maker-detection (Wave 2): tag FillResult.is_maker so passive limit fills pay the maker rate (or rebate) rather than the taker rate. Wired into the existing `FeeModel::Drift`/`Hyperliquid`/ `MakerTaker` variants that were shipped in Phase 3 T3.3 but unused. Rust side: - types.rs — FillResult gains `is_maker: bool`. All 12 constructors in engine/{fills,orders,venue_fills,positions,fees}.rs default it to false. - engine/orders.rs — `process_pending_orders` sets `is_maker: true` on Limit fills (resting orders filled by a later bar's range). Both `process_pending_orders` and `process_market_orders` now call `compute_fee_with_role(fill, fill.is_maker)` instead of the role-blind `compute_fee`. Market orders stay `is_maker=false`. PyO3: - `RustEngine.__init__` gains `fee_model` + `maker_bps` + `taker_bps` kwargs. `"flat"` (default) preserves the old flat-bps behavior. `"drift"` / `"hyperliquid"` pick the venue-specific schedules; `"maker_taker"` takes explicit bps. Capability flags `supports_maker_taker_fees` and the `fee_models` list are updated to reflect the new surface. Tests: - tests/test_rust_maker_detection.py — 6 behavioral cases: * Drift rebate nets ~$0.079 total (resting long at 95 rebates $0.019, market close at 98 pays $0.098 taker). Under a bug where maker wasn't detected, the total would double. * Hyperliquid (1 bp maker / 3.5 bp taker) lands in [0.03, 0.05]. * Flat fee path unchanged (maker tag set but not observable). * Explicit MakerTaker with maker=-5, taker=20 matches expected. * Capability flag assertions. - 163/163 Rust-suite regression green. ruff clean. WAVE_STATUS.md: D-3.3-maker-detection → 🟢 (Wave 2). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
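
The Drift figures quoted in the maker-detection test can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: a position size of 1 unit, a maker rate of -2 bps, and a taker rate of 10 bps. Those rates are not stated in the commit; they are back-solved from the quoted $0.019 rebate on a $95 fill and the $0.098 fee on a $98 fill.

```python
def fee_usd(notional: float, bps: float) -> float:
    """Fee in dollars; a negative bps value is a rebate."""
    return notional * bps / 10_000.0


# Assumed rates, inferred from the dollar amounts in the commit message.
MAKER_BPS = -2.0
TAKER_BPS = 10.0
SIZE = 1.0  # assumed position size in units

entry_fee = fee_usd(95.0 * SIZE, MAKER_BPS)  # resting limit fill, maker role
exit_fee = fee_usd(98.0 * SIZE, TAKER_BPS)   # market close, taker role
total = entry_fee + exit_fee

assert abs(entry_fee - (-0.019)) < 1e-12
assert abs(exit_fee - 0.098) < 1e-12
assert abs(total - 0.079) < 1e-12
```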

D-3.1-rust (Wave 2): the hottest path on the orderbook-aware fill pipeline now runs in Rust. One `flint_core.OrderbookFiller` call replaces the per-fill Python loop over book levels. Rust side: - rust/src/engine/orderbook_fill.rs — `BookSnapshot` { bids, asks }, `OrderbookFiller { reject_on_insufficient_depth }`, `walk_market(side, size, book) → Option<OrderbookWalk>` where `OrderbookWalk { price, size, impact_bps, is_partial }`. Algorithm mirrors `OrderbookFillModel._walk_book` line-for-line: * pick asks for Long, bids for Short * reject when order_size > total depth on the taking side (when `reject_on_insufficient_depth` is true — the default) * walk levels in order, accumulate size × price, compute VWAP * signed impact_bps = (vwap - mid) / mid × 10_000, positive = taker-unfavorable; mid falls back to the one-sided price when only one side of the book has levels. - PyO3: `flint_core.OrderbookFiller` class with `walk_market(side, size, bids, asks) → dict | None`. Callers pre-sort their book snapshots (bids descending, asks ascending) just like the Python `OrderbookSnapshot` dataclass enforces today. - Capability flag `supports_orderbook_walk` flipped to true. Tests: - 9 cargo tests in engine/orderbook_fill.rs (long/short VWAP, reject, partial, empty, impact_bps sign on both sides, one-sided mid, empty mid) - tests/test_rust_orderbook_parity.py — 13 Python↔Rust parity cases across varied order sizes, rejection vs partial behavior, impact sign verification, and capability flag. - Micro-benchmark (20 levels/side, 100k iterations): Python 280 ms vs Rust 80 ms ≈ 3.52× speedup. Unlike the TxCost port's FFI-bound result, this one gets real gains because each walk allocates and iterates per fill. 176/176 Rust-suite regression green. ruff clean. WAVE_STATUS.md: D-3.1-rust → 🟢. Wave 2 progress: 2/5 items shipped (D-3.3-maker-detection + D-3.1-rust); remaining 3 are blocked on later D-2.1.b steps or on testnet secrets. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
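
The book-walk steps listed in the D-3.1-rust commit can be sketched in Python as follows. This is not the project's `OrderbookFillModel._walk_book` or the Rust port; it is an illustrative reimplementation of the described steps. Two details are assumptions: an empty taking side returns `None`, and the short-side impact is sign-flipped so that a positive value is taker-unfavorable on both sides.

```python
from typing import List, Optional, Tuple

Level = Tuple[float, float]  # (price, size)


def walk_market(side: str, size: float, bids: List[Level], asks: List[Level],
                reject_on_insufficient_depth: bool = True) -> Optional[dict]:
    # Longs take from asks (ascending); shorts take from bids (descending).
    levels = asks if side == "long" else bids
    if size <= 0 or not levels:
        return None  # assumption: nothing to walk means no fill
    depth = sum(sz for _, sz in levels)
    if size > depth and reject_on_insufficient_depth:
        return None

    remaining, cost = size, 0.0
    for price, avail in levels:
        take = min(remaining, avail)
        cost += take * price
        remaining -= take
        if remaining <= 1e-12:
            remaining = 0.0
            break
    filled = size - remaining
    if filled <= 0:
        return None
    vwap = cost / filled

    # Mid falls back to the one-sided best price when the other side is empty.
    mid = (bids[0][0] + asks[0][0]) / 2.0 if bids and asks else levels[0][0]
    raw_bps = (vwap - mid) / mid * 10_000.0
    # Assumption: flip the sign for shorts so positive means unfavorable.
    impact_bps = raw_bps if side == "long" else -raw_bps
    return {"price": vwap, "size": filled,
            "impact_bps": impact_bps, "is_partial": remaining > 0}


bids = [(99.0, 1.0), (98.0, 2.0)]    # pre-sorted descending
asks = [(101.0, 1.0), (102.0, 2.0)]  # pre-sorted ascending

long_fill = walk_market("long", 2.0, bids, asks)
assert abs(long_fill["price"] - 101.5) < 1e-9
assert abs(long_fill["impact_bps"] - 150.0) < 1e-6

assert walk_market("long", 5.0, bids, asks) is None  # exceeds depth of 3
partial = walk_market("long", 5.0, bids, asks, reject_on_insufficient_depth=False)
assert partial["is_partial"] and partial["size"] == 3.0
```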

Step 2 of 7 in breaking up BacktestContext. After Step 1 pulled position state into PositionManager, this slice does the same for cash + running counters. flint/execution/cash_manager.py: - Single-ledger `CashManager` owning `cash`, optional `allocator`, and the three running counters (`total_fees`, `total_tx_costs`, `total_funding`). `__slots__` keeps it lean. - `debit(amount, venue)` and `credit(amount, venue)` mirror the pre-extraction `_debit_cash` / `_credit_cash` helpers exactly, including the allocator's `track_pnl` call on credit so per-venue PnL ledgers stay in sync. - `available()` and `balances()` cover the venue-balance helpers the BacktestContext exposed for tests. BacktestContext changes: - `__init__` constructs `self._cm = CashManager(initial_capital, allocator=capital_allocator)` instead of holding `self._cash`, `self._allocator`, and the three counters as plain attributes. - Property aliases `_cash` (read+write), `_allocator` (read-only), `_total_fees` / `_total_tx_costs` / `_total_funding` (read+write) keep the 20+ existing call sites working: `self._cash -= x` and `self._total_funding += p` route through the property setters back into the manager. Steps 3–7 migrate those to explicit `self._cm.debit/credit/add_*` calls. Tests: - tests/test_cash_manager.py — 16 cases covering init (with and without allocator), debit/credit (allocator-aware rejection, per-venue update, track_pnl), counter accumulation, compound assignments, balance helpers, and BacktestContext integration (legacy alias routes through manager, no class-level state pollution between instances). - 134/134 regression tests on BacktestContext-using paths still pass (test_backtest_v2, test_fill_pipeline, test_jupiter_backtest, test_multi_market, test_multi_venue_margin_integration, test_venue_fill_dispatch, test_position_manager, test_tx_costs, test_funding_arb, test_funding_dislocation_arb, test_midnight_gardener, test_safety_integration). - ruff clean. 
WAVE_STATUS.md: D-2.1.b → 2/7 shipped (still 🟡). Steps 3–7 (FillRecorder, FundingLedger, BorrowLedger, OrderbookCache, Risk surface) remain as follow-up. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Step 3 of 7 in breaking up BacktestContext. After PositionManager (step 1) and CashManager (step 2), this slice pulls the two append-only ledgers (recorded fills + diagnostic log messages) into their own owner. flint/execution/fill_recorder.py: - `FillRecorder` with `__slots__` for the two lists. Surface: * `fills` (mutable list, used by hot-path call sites) * `record(fill)` and `all_fills()` for new code * `logs` (mutable list) * `log(msg)` and `messages()` for new code BacktestContext changes: - `__init__` constructs `self._fr = FillRecorder()` instead of holding `self._fills` / `self._log_messages` as plain lists. - Read-only property aliases `_fills` and `_log_messages` return the manager's underlying mutable lists, so existing call sites (`self._fills.append(...)` in _apply_fill, `self._log_messages .append(...)` in market_order's reduce_only / margin paths, log() helper, check_liquidations) keep working unchanged. - Public `all_fills` and `log_messages` properties read through the manager. Tests: - tests/test_fill_recorder.py — 10 cases covering record/log ordering, copy semantics on snapshots, legacy-append-via-property, BacktestContext integration (recorder ownership, alias routing, public-property reads), and per-instance isolation. - 127/127 regression tests on BacktestContext-using paths still green. ruff clean. WAVE_STATUS.md: D-2.1.b → 3/7 shipped (still 🟡). Steps 4–7 (OrderQueue, FundingLedger, BorrowLedger, Risk surface) remain. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Step 4 of 7 in breaking up BacktestContext. Pulls the two order queues (resting limit/stop/TP orders + this-bar market orders) into a single owner. flint/execution/order_queue.py: - `OrderQueue { pending, market_queue, pending_cap }` with `__slots__`. Default cap of 100 mirrors the pre-extraction `len >= 100 → drop` behavior in `_check_order_cap`. - Surface: * `add_pending(order) → bool` — False at cap (caller drops) * `cancel(order_id) → bool` * `cancel_all(market=None) → int` — count removed * `pending` / `market_queue` properties read+write * `add_market(order)` and `drain_market()` (atomic swap so orders submitted during drain go to the fresh queue) * `snapshot()` for `BacktestContext.pending_orders` semantics BacktestContext changes: - `__init__` constructs `self._oq = OrderQueue()` instead of two plain `List[Order]`. - `_pending_orders` and `_market_orders_queue` become read+write property aliases. The setters are load-bearing — `cancel_all` and `process_pending_orders` rebuild via `self._pending_orders = [filtered list]`, and the setter routes that into `self._oq.pending = ...` so the manager keeps a single reference. Tests: - tests/test_order_queue.py — 16 cases covering append/cap, cancel by id, cancel_all (no-arg + by market), market-queue drain semantics, list reassignment via setters, BacktestContext integration (legacy append, legacy reassignment, public `pending_orders` property), and per-instance isolation. - 134/134 regression tests on BacktestContext-using paths still green. ruff clean. WAVE_STATUS.md: D-2.1.b → 4/7 shipped (still 🟡). Steps 5–7 (FundingLedger, BorrowLedger, Risk surface) remain. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
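
The two behaviors called out for `OrderQueue`, the pending cap and the atomic-swap drain, can be sketched as below. The method names follow the commit message; the bodies are an illustrative guess, not the code in flint/execution/order_queue.py, and orders are plain strings here for brevity.

```python
from typing import Any, List


class OrderQueue:
    __slots__ = ("pending", "market_queue", "pending_cap")

    def __init__(self, pending_cap: int = 100) -> None:
        self.pending: List[Any] = []
        self.market_queue: List[Any] = []
        self.pending_cap = pending_cap

    def add_pending(self, order: Any) -> bool:
        # At the cap the order is refused and the caller decides what to log.
        if len(self.pending) >= self.pending_cap:
            return False
        self.pending.append(order)
        return True

    def add_market(self, order: Any) -> None:
        self.market_queue.append(order)

    def drain_market(self) -> List[Any]:
        # Swap in a fresh list before handing back the old one, so orders
        # submitted while the caller iterates go to the next drain.
        drained, self.market_queue = self.market_queue, []
        return drained


q = OrderQueue(pending_cap=2)
assert q.add_pending("a") and q.add_pending("b")
assert q.add_pending("c") is False

q.add_market("m1")
for order in q.drain_market():
    q.add_market(order + "-followup")  # lands in the fresh queue
assert q.market_queue == ["m1-followup"]
```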

Step 5 of 7 in breaking up BacktestContext. Pulls the two funding dictionaries (flat per-market history + per-venue split) into a dedicated owner with the strategy-facing helpers built in. flint/execution/funding_ledger.py: - `FundingLedger` owns `_history: Dict[market, List[FundingRate]]` and `_venue_history: Dict[market, Dict[venue, List[FundingRate]]]` via `__slots__`. - Surface mirrors the pre-extraction get_funding_* API: * `add(fr)` — single source of truth for venue-tagged appends * `latest(market)` → Optional[float] * `recent(market, lookback)` → List[(ts, rate)] * `by_venue(market, lookback)` → Dict[venue, List[(ts, rate)]] * `venue_snapshots(market, lookback)` → full FundingRate objects (used by funding_arb / funding_dislocation_arb for mark/oracle prices, not just rate) BacktestContext changes: - `__init__` constructs `self._fl = FundingLedger()` instead of two bare dicts. - `add_funding_rate` is now a one-line `self._fl.add(fr)`. - `get_funding_rate`, `get_funding_rates`, `get_funding_by_venue`, `get_venue_snapshots` collapse to one-line ledger calls (the per-method market resolution stays for backward-compat). - Read-only property aliases `_funding_history` and `_venue_funding` return the ledger's underlying dicts so any existing test that peeks into internals still passes. Tests: - tests/test_funding_ledger.py — 13 cases covering empty/latest/ lookback truncation, venue grouping, full-snapshot access, and BacktestContext integration (add/get round-trips, legacy alias reads, per-instance isolation). - 125/125 regression tests green across funding-using paths (test_funding_arb, test_funding_dislocation_arb, test_paper_funding, test_paper_multi_venue, test_backtest_v2 etc.). ruff clean. WAVE_STATUS.md: D-2.1.b → 5/7 shipped (still 🟡). Steps 6–7 (BorrowLedger, Risk surface) remain. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Step 6 of 7 in breaking up BacktestContext. Owns the Jupiter Perps borrow-rate history plus the running paid-borrow counters and the per-trade payment ledger that `_apply_fill` writes into for tearsheet attribution. flint/execution/borrow_ledger.py: - `BorrowLedger { _history, _payments, total_paid }` with `__slots__`. - Surface: * `record(snapshot)` — single source of truth for rate appends * `record_payment(payment)` — per-trade attribution dict * `add_paid(amount)` — running counter helper * `latest(market)` → most recent rate_hourly * `recent(market, lookback)` → List[(ts, rate_hourly)] * `cumulative_at(market, ts)` — value at-or-before ts (linear scan because snapshots are append-ordered by arrival, same as the pre-extraction code) BacktestContext changes: - `__init__` constructs `self._bl = BorrowLedger()` instead of two separate fields + a list. - `add_borrow_rate`, `get_borrow_rate`, `get_borrow_rates`, and `get_borrow_cumulative_at` collapse to one-line delegations. - Read-only property aliases `_borrow_history` and `_borrow_payments` return the ledger's underlying containers; `_total_borrow_paid` is read+write (the load-bearing case is `_apply_fill`'s `self._total_borrow_paid += borrow_cost`). - Public `total_borrow_paid` property still reads from the ledger. Tests: - tests/test_borrow_ledger.py — 13 cases covering record/latest/ lookback/cumulative_at boundary behavior, payment append, total_paid accumulation, BacktestContext integration (compound assignment via alias, public property reads), per-instance isolation. - 98/98 regression tests green across borrow-using paths (test_jupiter_backtest, test_backtest_v2, the four step-1..5 ledger test files). ruff clean. WAVE_STATUS.md: D-2.1.b → 6/7 shipped (still 🟡). Step 7 (Risk surface) is the last slice; after it the BacktestContext is a thin orchestrator over six small components. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Step 7 of 7 in breaking up BacktestContext. Pulls the three market- data caches (cross-market candle histories, orderbook snapshots, open-interest snapshots) into one owner. After this commit, every piece of mutable state that BacktestContext used to own has moved into one of seven dedicated managers. flint/execution/market_data_feed.py: - `MarketDataFeed` owns: * `_market_histories: Dict[str, List[Candle]]` (multi-market candle access used by check_liquidations, set_candle's cross-market PnL refresh) * `_orderbook_history: Dict[str, List]` (per-market book snapshots used by the orderbook fill model) * `_oi_history: Dict[str, List]` (per-market OI snapshots) - Surface: `set_histories(d)`, `candles(market, lookback)`, `markets()`, `add_orderbook(snap)`, `latest_orderbook(market)`, `add_open_interest(oi)`, `latest_oi(market)` → `Optional[(long_oi, short_oi)]`, `oi_recent(market, lookback)`. - `market_histories` exposed as read+write because the engine occasionally reassigns the dict directly (e.g. `set_candle`'s fallback path on cross-market lookups). BacktestContext changes: - `__init__` constructs `self._mdf = MarketDataFeed()` instead of three plain dicts. - `set_market_histories`, `get_candles`, `markets`, `add_orderbook_snapshot`, `get_orderbook`, `add_open_interest`, `get_open_interest`, `get_open_interest_history` collapse to one-line delegations. - Property aliases `_market_histories` (read+write), `_orderbook_history`, `_oi_history` (read-only) preserve every legacy access pattern (`for mkt, hist in self._market_histories.items()` in check_liquidations etc. keeps working unchanged). Tests: - tests/test_market_data_feed.py — 12 cases covering empty/set/ iterate, lookback truncation, orderbook/OI round-trips, BacktestContext integration (legacy alias reassignment, public property reads), per-instance isolation. - 176/176 regression tests green across all BacktestContext-using paths. ruff clean. 
WAVE_STATUS: D-2.1.b shows 7/7 state-extraction steps shipped (still 🟡 because caller-site migration to explicit manager calls + the "BacktestContext < 300 LOC" reduction land in a follow-on PR — the manager surfaces are already in place to support it). This loop's progress: 5 D-2.1.b slices shipped (steps 3–7). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

D-3.5-orchestrator (Wave 2): single pre-trade check facade composing the three pre-existing engines (MarginEngine, VenueAllocator, PortfolioRiskEngine). BacktestContext.market_order now consults one object instead of inlining venue-margin checks. flint/risk/portfolio_orchestrator.py: - `PortfolioMarginEngine(margin=, allocator=, portfolio=)` — any sub-engine is optional; omitting it makes the corresponding check a no-op pass so existing backtest paths (margin-only, no allocator, no portfolio risk) stay behaviorally identical. - `check_order(order, cash, positions, price, equity) → PortfolioCheck(approved, reason, component)` runs the three checks in priority order: 1. Allocator — fastest reject (per-venue available cash) 2. MarginEngine — venue-level margin/leverage cap 3. PortfolioRiskEngine — book-level gross/net/concentration/VaR First failure short-circuits; `component` carries the source name so callers can name which engine vetoed. - `check_liquidations(positions, prices, ts)` and `check_kill_switch(equity)` delegate to the relevant sub-engine (or return [] / False when omitted). BacktestContext changes: - New `portfolio_risk=None` ctor kwarg threads a PortfolioRiskEngine into the orchestrator. - `__init__` now always builds `self._pme` (None-tolerant internally), so the pre-trade gauntlet has a stable callable. - `market_order` replaces the inline margin block with a single `self._pme.check_order(...)` call. The reduce_only short-circuit stays — closing exposure can't fail margin/book checks. - Reject log line tags the originating component (`MARGIN REJECTED` / `ALLOCATOR REJECTED` / `PORTFOLIO REJECTED`) instead of the previous flat `MARGIN REJECTED` regardless of cause. 
Tests: - tests/test_portfolio_orchestrator.py — 16 cases: * No-op pass when all three engines unset * PortfolioCheck.__bool__ / reason / component * Margin rejection (oversized notional) * Allocator short-circuits before margin engine even runs * Portfolio gross-exposure cap rejects * Priority order proven by stacking failures * BacktestContext integration: orchestrator constructed, market_order log line carries component tag, success path reaches the queue, portfolio_risk ctor kwarg threads through * reduce_only bypasses every check (closing exposure path) - 207/207 regression tests green across margin/portfolio/safety/ funding/multi-venue/jupiter paths. ruff clean. WAVE_STATUS: D-3.5-orchestrator → 🟢. Wave 2 progress so far in the autonomous loop: D-3.3-maker, D-3.1-rust, all 7 D-2.1.b state slices, D-3.5-orchestrator. The remaining Wave 2 items (D-2.1.c live-context merge, D-2.1.d paper-context split) are blocked on testnet secrets / a deliberate paper API design pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
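
The priority-ordered, short-circuiting check described for `check_order` can be sketched generically. Here each sub-engine is modeled as a hypothetical callable that returns `None` to pass or a reason string to reject; the real `PortfolioMarginEngine` takes engine objects and a richer argument list, so treat this as a shape illustration only.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class PortfolioCheck:
    approved: bool
    reason: str = ""
    component: str = ""

    def __bool__(self) -> bool:
        return self.approved


# A check returns None to pass, or a rejection reason.
Check = Callable[[dict], Optional[str]]


def check_order(order: dict,
                checks: Sequence[Tuple[str, Optional[Check]]]) -> PortfolioCheck:
    for component, check in checks:
        if check is None:
            continue  # omitted sub-engine behaves as a no-op pass
        reason = check(order)
        if reason is not None:
            # First failure wins; later checks never run.
            return PortfolioCheck(False, reason, component)
    return PortfolioCheck(True)


calls: List[str] = []


def allocator(order: dict) -> Optional[str]:
    calls.append("allocator")
    return "insufficient venue cash" if order["notional"] > 500 else None


def margin(order: dict) -> Optional[str]:
    calls.append("margin")
    return "leverage cap" if order["notional"] > 100 else None


checks = [("ALLOCATOR", allocator), ("MARGIN", margin), ("PORTFOLIO", None)]
result = check_order({"notional": 1_000}, checks)
assert not result and result.component == "ALLOCATOR"
assert calls == ["allocator"]  # margin was never consulted
```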

D-2.1.b caller-site migration (final slice). After all 7 state extractions landed last loop, this commit walks through every state-mutating call site in BacktestContext and routes it through the right manager directly — no more `self._cash -= x`, no more `self._positions[k] = ...`, no more `self._total_funding += p`. Migrations: - apply_funding: iterates `self._pm.items()` instead of the legacy dict alias; `self._cm.debit(payment, venue)` and `self._cm.add_funding(payment)` replace the `_debit_cash` helper + compound assignment. - check_liquidations: uses `self._pm.get/delete/record_close` for position state, `self._cm.credit/add_fee` for cash, the `self._oq.pending = filtered` setter for pending-order cleanup, and `self._fr.log` for the LIQUIDATED warn line. - process_pending_orders: reads `self._oq.pending`, writes back via the setter (replaces the legacy reassignment idiom). - process_market_orders: `self._oq.drain_market()` atomic swap (so fills queued during the loop go to the next bar), `self._fr.log`, and a new `_add_pending_or_warn` helper for GTC resting drains. - _apply_fill: factored two helpers — `_realize_jupiter_borrow_cost` (handles partial + full close + flip cases identically) and `_new_position` (Jupiter cum-borrow snapshot at entry). Main body is now ~80 lines of pure routing through `_fr.record / _cm.debit/credit/add_fee/add_tx_cost / _pm.set/get/ delete/record_close / _bl.add_paid/record_payment`. - close_all_positions: `self._pm.keys() / get((v, m))`. - set_candle: `self._pm.update_pnl_for_market` + `_mdf.history_for` (the cross-market PnL mark). - limit_order/stop_order/take_profit_order/cancel/cancel_all: route through `self._oq.add_pending/cancel/cancel_all`. - account/positions/pending_orders: manager-direct reads. - total_fees/total_funding/total_tx_costs/total_borrow_paid/ log_messages: manager-direct reads. - venue_balance/balances/transfer/process_transfers: `self._cm.available/balances/allocator`. 
- Internal `_debit_cash` and `_credit_cash` helpers deleted. API additions: - Public `borrow_payments` property so `flint/backtest/engine.py` no longer reaches into `ctx._borrow_payments` private state. Legacy property aliases (_cash/_positions/_fills/etc.) stay because existing tests deliberately exercise them; new code should not use them. Tests: - 262/262 regression tests green across: test_backtest_v2, test_multi_venue_margin_integration, test_jupiter_backtest, test_safety_integration, test_position_manager, test_cash_manager, test_fill_recorder, test_order_queue, test_funding_ledger, test_borrow_ledger, test_market_data_feed, test_portfolio_orchestrator, test_funding_arb, test_funding_dislocation_arb, test_multi_market, test_venue_fill_dispatch, test_context_portability, test_fill_pipeline, test_tx_costs, test_tx_cost_integration, test_paper, test_paper_funding, test_paper_multi_venue. - ruff clean. WAVE_STATUS: D-2.1.b → 🟢 (state extraction + caller migration). This closes the deepest D-2.1.b unlock; D-2.1.c (live-context merge) and D-2.1.d (paper-context split) remain blocked on testnet secrets / a deliberate paper-API design pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Wave 3 first-cut. The existing PortfolioEngine runs each strategy on its own slice of capital — that's a portfolio of independent runs, not a shared-capital book. This new engine puts every strategy on **one** BacktestContext so cash, fees, funding, borrow, and the orchestrator's pre-trade margin gauntlet all see the actual book exposure. Unblocked by D-2.1.b (state extraction + caller migration) and D-3.5-orchestrator (PortfolioMarginEngine). The seven managers inside BacktestContext now make composing one for many strategies straightforward — a single CashManager debits across all strategy fills; a single PositionManager nets long+short across strategies; a single PortfolioMarginEngine sees combined exposure for caps. flint/portfolio/shared_engine.py: - `_TaggedContextProxy` wraps the shared `BacktestContext` per strategy: * Forwards all reads (account, positions, funding/borrow/ orderbook queries, candles, log) — strategies see the whole book, by design. * Tags every order_id with `strategy_name:` so fills carry attribution back to the originating strategy. * `cancel_all(market=...)` only cancels orders owned by the calling strategy (matched by tag prefix). - `SharedCapitalPortfolioEngine.run(candles)` walks the bar loop: set_candle → process_pending → strategy callbacks (each via its proxy) → process_market_orders → record equity. End-of-run close_all_positions for clean attribution. - `SharedPortfolioResult` carries combined equity + per-strategy trade counts, fill streams, and PnL splits, plus warnings surfaced from the shared ctx.log_messages. 
Tests: - tests/test_shared_capital_portfolio.py — 8 cases: * Empty strategy list rejected * No-candles run returns initial capital * Two strategies firing once each both produce trades against the shared book * Fill streams are tagged per-strategy * Proxy forwarding (account, market_order id format) * `cancel_all` is per-strategy-scoped * Warnings list propagates from shared ctx - 270/270 regression tests green across backtest/multi-venue/ jupiter/funding/paper/portfolio paths. ruff clean. Out of scope (follow-on PR): - Per-strategy capital caps (would need an allocator with tagged sub-buckets). - Closed-trade attribution by trade-id rather than the current even-split-by-market heuristic. - Dollar-neutral rebalancing across strategies. WAVE_STATUS: D-6.1-unified → 🟡 (foundation; refinements deferred). Wave 3 progress: 1 of 4 items started. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
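
The tag-prefix scoping used by `_TaggedContextProxy` can be sketched with a toy shared book. `SharedBook` is a hypothetical stand-in for the shared `BacktestContext`, and the id format `strategy_name:<n>` is assumed from the commit's description of the `strategy_name:` prefix.

```python
from typing import List, Optional, Tuple


class SharedBook:
    """Hypothetical stand-in for the shared context's pending-order list."""

    def __init__(self) -> None:
        self.pending: List[Tuple[str, str]] = []  # (order_id, market)
        self._next = 0

    def limit_order(self, market: str, prefix: str = "") -> str:
        order_id = f"{prefix}{self._next}"
        self._next += 1
        self.pending.append((order_id, market))
        return order_id


class TaggedProxy:
    def __init__(self, book: SharedBook, strategy_name: str) -> None:
        self._book = book
        self._tag = f"{strategy_name}:"

    def limit_order(self, market: str) -> str:
        return self._book.limit_order(market, prefix=self._tag)

    def cancel_all(self, market: Optional[str] = None) -> int:
        # Keep orders owned by other strategies, and own orders on other
        # markets when a market filter is given.
        keep = [(oid, m) for oid, m in self._book.pending
                if not oid.startswith(self._tag)
                or (market is not None and m != market)]
        removed = len(self._book.pending) - len(keep)
        self._book.pending = keep
        return removed


book = SharedBook()
trend, arb = TaggedProxy(book, "trend"), TaggedProxy(book, "arb")
trend.limit_order("SOL-PERP")
trend.limit_order("BTC-PERP")
arb.limit_order("SOL-PERP")

assert trend.cancel_all(market="SOL-PERP") == 1
assert [oid for oid, _ in book.pending] == ["trend:1", "arb:2"]
```

Prefix matching assumes strategy names are unique and contain no `:`; otherwise one strategy's tag could match another's orders.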

Wave 5 first slice. Lays the storage primitives that future snapshot + time-travel-replay features will read. This commit ships only the **append + read** path; engine writer hooks, snapshot compaction, and the actual replay primitive land in follow-on slices once the writer is exercised by real engine runs. flint/portfolio/event_log.py: - New `portfolio_events(session_id, seq, ts, kind, payload)` table. Composite PK enforces monotonic seq per session without DuckDB AUTOINCREMENT (which can't reset per-group). Index on (session_id, ts) for the future `read_until(target_ts)` replay path. - `EventKind` constants: `order.submit`, `order.cancel`, `fill`, `liquidation`, `funding`, `borrow`. Payloads are JSON for forward compatibility — schema can grow new optional fields without a table migration. - `EventLogWriter` is thread-safe: a `threading.Lock` serializes all DB-touching ops; per-session `next_seq` cache backed by `MAX(seq)+1` query on cold start so process restarts pick up cleanly. Exposes `append(...)` + `append_many(...)` (bulk variant using DuckDB executemany). - `EventLogReader` provides `read_all`, `read_until(target_ts)` (filter by event-time, ordered by seq), `count`, `latest_seq`. - `PortfolioEvent` dataclass with `to_row()` / `from_row()` for the JSON ↔ Python round trip. Tests: - tests/test_event_log.py — 15 cases covering: * Schema idempotency (table created, double-init no-op) * Seq starts at 0 per session, monotonic within, independent across sessions * Payload JSON round-trips with nested dicts/lists * append_many assigns consecutive seqs and continues from individual appends * Cross-writer seq recovery (process-restart simulation) * read_all orders by seq (not ts), read_until filters by ts * Thread safety: 8 threads × 25 events each → dense 0..199 sequence with no duplicates or gaps * PortfolioEvent dataclass round-trip ruff clean. WAVE_STATUS: D-6.4-replay → 🟡 (foundation; snapshot + replay deferred). 
Wave 3 progress so far: D-6.1-unified foundation + D-6.4-replay foundation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
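The per-session seq assignment described above can be sketched as follows. This is a minimal illustration, not the code in `flint/portfolio/event_log.py`: the class name `EventLogWriterSketch` and the helper `read_seqs` are hypothetical, and stdlib `sqlite3` stands in for DuckDB so the block runs without extra dependencies.

```python
import json
import sqlite3
import threading


class EventLogWriterSketch:
    """Hypothetical stand-in for EventLogWriter (sqlite3 replaces DuckDB)."""

    def __init__(self, conn):
        self._conn = conn
        self._lock = threading.Lock()   # serializes every DB-touching op
        self._next_seq = {}             # per-session cache of the next seq
        with self._lock:
            self._conn.execute(
                "CREATE TABLE IF NOT EXISTS portfolio_events ("
                "session_id TEXT, seq INTEGER, ts REAL, kind TEXT, payload TEXT, "
                "PRIMARY KEY (session_id, seq))"
            )

    def append(self, session_id, ts, kind, payload):
        with self._lock:
            seq = self._next_seq.get(session_id)
            if seq is None:
                # Cold start: recover from the table so a restarted
                # process continues where the previous writer stopped.
                row = self._conn.execute(
                    "SELECT COALESCE(MAX(seq) + 1, 0) FROM portfolio_events "
                    "WHERE session_id = ?",
                    (session_id,),
                ).fetchone()
                seq = row[0]
            self._conn.execute(
                "INSERT INTO portfolio_events VALUES (?, ?, ?, ?, ?)",
                (session_id, seq, ts, kind, json.dumps(payload)),
            )
            self._conn.commit()
            self._next_seq[session_id] = seq + 1
            return seq


def read_seqs(conn, session_id):
    """Return every seq stored for one session, in seq order."""
    rows = conn.execute(
        "SELECT seq FROM portfolio_events WHERE session_id = ? ORDER BY seq",
        (session_id,),
    ).fetchall()
    return [r[0] for r in rows]
```

The composite primary key acts as a backstop: if two writers ever raced past the lock, the duplicate `(session_id, seq)` insert would fail instead of silently corrupting the sequence.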

Slice 2 of 5 for D-6.4-replay. Builds on the event-log foundation (slice 1) by adding the fold + replay functions that turn an event stream into a `BookState` at any target_ts. flint/portfolio/replay.py: - `BookState` dataclass owns: cash, initial_capital, total_fees, total_tx_costs, total_funding, total_borrow_paid, positions dict (keyed by (venue, market)), realized_pnl, and four event counters (fill / liquidation / order_submit / order_cancel). - `BookState.equity_at(prices)` computes equity given a mark-price map. Positions without a price contribute 0 unrealized PnL, so callers feed prices for the markets they care about. - `fold(events, initial_capital) → BookState` is pure Python over any iterable. Used by tests and the future snapshot compactor without needing a store. - `replay(store, session_id, target_ts, initial_capital)` reads `EventLogReader.read_until(target_ts)` and folds. Slice 3 will accept an optional `from_snapshot=BookState` for compaction. Fill semantics match `BacktestContext._apply_fill`'s post-migration code one-for-one: - Open : new (venue, market) entry - DCA : same-side adds → average entry price - Partial : opposite-side smaller → realize size×Δprice, shrink - Full : opposite-side equal → realize, drop - Flip : opposite-side larger → realize, drop, open remainder on new side at fill price Funding / liquidation / borrow folders mirror the engine's cash-debit semantics. Unknown event kinds are ignored for forward compatibility with schema growth (existing logs replay against newer code without a migration). 
Tests: - tests/test_replay.py — 17 cases: * Empty fold * Fill: open / DCA / partial / full / flip / short-PnL-sign * Funding debit (positive + negative payments) * Liquidation drops position + books loss + penalty fee * Borrow cost debits cash + counter * Order submit/cancel counter increments * Unknown kind silently ignored (forward compat) * `equity_at(prices)` includes unrealized PnL; missing markets contribute 0 * Storage-backed `replay()`: filters by ts, isolates by session_id, deterministic across two calls * Full-lifecycle scenario: open → funding → borrow → close → verify final cash matches expected ledger - 32/32 between event_log + replay tests. ruff clean. WAVE_STATUS: D-6.4-replay slice 2 noted. Snapshot compaction (slice 3), engine writer hooks (slice 4), time-travel UI (slice 5) remain. Wave 5 progress: 2/5 slices. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
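The five fill cases listed in this commit (open, DCA, partial, full, flip) can be sketched in a few lines. This is an illustrative reduction, not the code in `flint/portfolio/replay.py`: `Position`, `BookSketch`, and `apply_fill` are hypothetical names, and crediting realized PnL straight to cash is an assumption about the ledger convention.

```python
from dataclasses import dataclass, field


@dataclass
class Position:
    side: str           # "long" or "short"
    size: float
    entry_price: float


@dataclass
class BookSketch:
    cash: float
    realized_pnl: float = 0.0
    positions: dict = field(default_factory=dict)  # keyed by (venue, market)


def apply_fill(book, venue, market, side, size, price, fee=0.0):
    """Fold one fill into the book (assumed semantics, see lead-in)."""
    book.cash -= fee
    key = (venue, market)
    pos = book.positions.get(key)
    if pos is None:
        # Open: first fill on this (venue, market).
        book.positions[key] = Position(side, size, price)
        return
    if pos.side == side:
        # DCA: same-side add, size-weighted average entry price.
        total = pos.size + size
        pos.entry_price = (pos.entry_price * pos.size + price * size) / total
        pos.size = total
        return
    # Opposite side: realize PnL on the portion that closes.
    direction = 1.0 if pos.side == "long" else -1.0
    closed = min(size, pos.size)
    pnl = direction * (price - pos.entry_price) * closed
    book.realized_pnl += pnl
    book.cash += pnl
    if size < pos.size:
        pos.size -= size                 # Partial: shrink the position
    else:
        del book.positions[key]          # Full: drop the position
        if size > pos.size:
            # Flip: open the remainder on the new side at the fill price.
            book.positions[key] = Position(side, size - pos.size, price)
```

A short position reuses the same formula with `direction = -1.0`, which is what keeps the short-PnL sign correct when the closing price is below entry.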

…ice 3) Slice 3 of 5 for D-6.4-replay. Long-running sessions accumulate millions of events; replaying from seq=0 every time would be O(N). This slice caches periodic full-state snapshots so replay can fast-forward past the early events. flint/portfolio/snapshots.py: - New `portfolio_snapshots(session_id, seq, ts, payload)` table. Composite PK so the compactor can `INSERT OR REPLACE` idempotently on re-runs. Index on (session_id, ts) for `latest_before`. - `_state_to_json` / `_state_from_json` serialize the full BookState including positions. The (venue, market) tuple keys flatten to "venue|market" strings for JSON compat. - `SnapshotStore { write(session_id, seq, ts, state), latest(session_id), latest_before(session_id, target_ts), count(session_id) }` — thread-safe via a lock shared with all DB-touching ops. - `should_compact(events_since_last, every_n_events=10_000)` predicate that engine writer hooks (slice 4) will call. flint/portfolio/event_log.py: - `EventLogReader.read_after_seq(session_id, after_seq, target_ts)` fetches only events with `seq > after_seq AND ts <= target_ts` — the tail past a snapshot. - Reader constructor now bootstraps the events table idempotently via `CREATE IF NOT EXISTS`, so replay over a fresh DB doesn't fail with `CatalogException` when the writer hasn't run yet. flint/portfolio/replay.py: - `fold(events, initial_capital, seed=None)` — when `seed` is given, fold continues from that state (fast-forward path) and the `initial_capital` arg is carried by the seed. - `replay(..., use_snapshot=True)` (default): query `SnapshotStore.latest_before(target_ts)`; on hit, fold only the tail starting from the snapshot. On miss, falls through to read-from-zero. `use_snapshot=False` forces full replay (used by the parity test). Tests: - tests/test_snapshots.py — 15 cases: * BookState ↔ JSON round-trip preserves every field, including multiple positions across venues. * Empty BookState round-trips (no positions). 
* Schema idempotent (CREATE IF NOT EXISTS). * write/latest/count semantics. * Per-session isolation. * Upsert overwrites same (session_id, seq). * `latest_before` returns the most recent snapshot at-or-before the target_ts; None when no qualifying snapshot exists. * **Load-bearing**: replay(use_snapshot=True) and replay(use_snapshot=False) produce byte-identical state on the same target_ts. * Replay falls back to read-from-zero when target_ts is before every available snapshot. * Replay over an unknown session returns the initial state (table-bootstrap path). * fold(seed=...) carries the seed forward; mutates in place. * `should_compact` threshold predicate (default 10k + custom). - 47/47 across event_log + replay + snapshots. ruff clean. WAVE_STATUS: D-6.4-replay slices 1+2+3 shipped. Engine writer hooks (slice 4) + time-travel UI (slice 5) remain. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
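The snapshot fast-forward path and its parity requirement can be illustrated with plain dicts. This is a sketch under stated assumptions: the event and snapshot shapes here are hypothetical, state is reduced to `cash` and a fill counter, and event timestamps are assumed non-decreasing in seq so that a snapshot never contains an event later than its own `ts`.

```python
import copy


def fold(events, initial_cash, seed=None):
    """Fold events into state; continues from `seed` (mutated in place) if given."""
    state = seed if seed is not None else {"cash": initial_cash, "fills": 0}
    for ev in events:
        state["cash"] += ev["cash_delta"]
        state["fills"] += 1
    return state


def latest_before(snapshots, target_ts):
    """Most recent snapshot taken at or before target_ts, else None."""
    eligible = [s for s in snapshots if s["ts"] <= target_ts]
    return max(eligible, key=lambda s: s["seq"]) if eligible else None


def replay(events, snapshots, target_ts, initial_cash, use_snapshot=True):
    snap = latest_before(snapshots, target_ts) if use_snapshot else None
    if snap is None:
        # Miss (or snapshots disabled): read from zero.
        return fold((e for e in events if e["ts"] <= target_ts), initial_cash)
    # Hit: fold only the tail past the snapshot's seq.
    tail = (
        e for e in events
        if e["seq"] > snap["seq"] and e["ts"] <= target_ts
    )
    # Copy the stored state so the in-place fold never mutates the snapshot.
    return fold(tail, initial_cash, seed=copy.deepcopy(snap["state"]))
```

The deep copy matters because `fold(seed=...)` mutates its seed; without it, a second replay from the same snapshot would start from already-advanced state.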

Slice 4 of 5 for D-6.4-replay. Wires the BacktestContext to emit events to the EventLogWriter so replay can reproduce final state byte-for-byte from a real backtest. Approach: opt-in. The two new constructor kwargs (`event_log_writer`, `event_session_id`) default to None; when either is unset, the `_emit(...)` helper short-circuits and the legacy path pays zero overhead. When both are set, every state-mutating operation appends one event to the log. Hooks emit on: - order.submit: market_order / limit_order / stop_order / take_profit_order (all four order entrypoints, after the orchestrator gauntlet passes) - order.cancel: cancel(oid) on success, cancel_all() when n > 0 - fill: every `_apply_fill` invocation, emitted *before* position mutation so replay's fold sees opens/closes in the same order - funding: every per-position `apply_funding` payment - liquidation: every margin-engine force-close (after the loss is booked) - borrow: every `_realize_jupiter_borrow_cost` debit Event payloads carry enough state for replay to reconstruct the book exactly (market, venue, side, size, price, fee, tx_cost on fills; payment + rate on funding; loss + penalty on liquidation; cum_entry + cum_exit on borrow). The `ts` field uses the *event* time, not the write time — fills carry `fill.ts`, funding carries `funding_rate.ts`, etc., so `replay(target_ts=...)` filters correctly across cross-bar events. 
Tests: - tests/test_event_log_engine_hooks.py — 12 cases: * Legacy ctx (no writer) emits zero events * Writer without session_id is also a no-op * Every order kind emits the right payload (type tag, price/trigger_price as appropriate) * cancel emits only when an order was actually cancelled * fill payload carries market/venue/side/size/price * **End-to-end parity (load-bearing)**: replay over the live-emitted log produces `replayed.cash == ctx.account.cash` for open-then-close, 5-fill DCA, and partial-replay-at-intermediate-ts scenarios * Funding payment emits with payment field equal to size × oracle_price × rate - 329/329 regression green across backtest, multi-venue, jupiter, funding, paper, portfolio, and the four replay-suite files. ruff clean. WAVE_STATUS: D-6.4-replay slices 1-4 shipped (4/5). Time-travel UI (slice 5) is the last piece — that's a UI-side concern, deferred to when the user wants to expose the time-travel surface. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
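The opt-in behaviour of `_emit(...)` can be sketched like this. `ListWriter` and `ContextSketch` are hypothetical stand-ins, and the position bookkeeping is reduced to a single signed size; only the short-circuit rule, the event-time `ts`, and the emit-before-mutation ordering come from the description above.

```python
class ListWriter:
    """Hypothetical in-memory stand-in for EventLogWriter."""

    def __init__(self):
        self.events = []

    def append(self, session_id, ts, kind, payload):
        self.events.append((session_id, ts, kind, payload))


class ContextSketch:
    """Hypothetical reduction of the BacktestContext hook wiring."""

    def __init__(self, event_log_writer=None, event_session_id=None):
        self._writer = event_log_writer
        self._session_id = event_session_id
        self.net_size = 0.0

    def _emit(self, kind, ts, payload):
        # Short-circuit unless BOTH kwargs were supplied, so the legacy
        # path does no extra work.
        if self._writer is None or self._session_id is None:
            return
        self._writer.append(self._session_id, ts, kind, payload)

    def apply_fill(self, fill):
        # Emit before mutating, stamped with the fill's own event time.
        self._emit("fill", fill["ts"], dict(fill))
        signed = fill["size"] if fill["side"] == "long" else -fill["size"]
        self.net_size += signed
```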

…-on-reconnect Wave 3 first slice for D-4.3-websocket. Server-side: per-session endpoints, monotonic seq stamping, replay buffer, heartbeat ping. Client-side: typed `useWebSocket<T>` hook with the same exponential backoff as `useBackoffPoll` + heartbeat-stale detection. flint/api/websocket.py: - `ConnectionManager` extended with: * Per-channel monotonic `seq` counter; every broadcast envelope carries `{channel, seq, ...payload}`. * 500-deep `deque` ring buffer per channel (`_REPLAY_BUFFER_SIZE`) of `(seq, ts, payload_str)` for replay on reconnect. * `connect(websocket, channel, since_seq=N)` replays every buffered event with `seq > N` before streaming live → at-least-once delivery on flap-and-reconnect. * `ping(channel)` broadcasts `{type: "ping", ts}` for heartbeat. * `disconnect`, `channels()`, `buffered_count(channel)` for introspection. flint/api/main.py: - New per-session WS routes: * `/ws/paper/{session_id}` → channel `paper:{id}` * `/ws/live/{session_id}` → channel `live:{id}` - Shared `_ws_loop` helper pulls `?since=<seq>` from query params for replay opt-in. Legacy `/ws/{channel}` kept for back-compat. ui/src/hooks/useWebSocket.ts: - Typed `useWebSocket<T>(path, opts)` hook returning `{data, status, lastSeq, errorCount, lastError}`. Status enum: `connecting | open | closed | error`. - Reconnect backoff schedule mirrors `useBackoffPoll`: 1s → 2s → 5s → 10s → 30s on consecutive failures; resets on first successful open. Tracks `lastSeqRef` across reconnects so the next attempt sends `?since=<lastSeq>` for replay. - Heartbeat: timer rearms on every incoming message; 30s of silence forces `ws.close(4000, 'stale')` which routes through the normal reconnect path — bounds the worst-case stale-data window when an intermediate proxy silently drops the socket. - `ping` envelopes are consumed internally (rearm heartbeat) and not bubbled up to the consumer's `data`. 
Tests: - tests/test_websocket_replay.py — 10 cases: * Monotonic seq stamping (per-channel independence) * `all` channel receives every broadcast * Dead socket pruned on first failed send * `since_seq=N` replays seqs > N (and replays nothing when None) * Ring buffer caps at `_REPLAY_BUFFER_SIZE` even past N+1 events * `ping(channel)` broadcasts a heartbeat with `type: "ping"` * Connection introspection: count, channels(), disconnect 127/127 vitest still green; vite build clean; 10/10 backend WS tests; ruff clean. WAVE_STATUS: D-4.3-websocket → 🟡 (foundation). Remaining work (deferred): engine-side broadcast hooks (paper/live engines emit equity/fill ticks to their channels) + migration of useLiveMonitor/usePaperPortfolio/useSessionStatus from polling to WS. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
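The seq stamping and replay-on-reconnect buffer can be sketched with a bounded `deque`. `ChannelBuffer` is a hypothetical name for one channel's slice of `ConnectionManager`, and starting seq at 1 is an assumption; the sketch omits the sockets entirely and only shows which envelopes a reconnecting client would receive.

```python
import json
from collections import deque


class ChannelBuffer:
    """Hypothetical per-channel seq counter plus replay ring buffer."""

    def __init__(self, maxlen=500):
        self._seq = 0
        # A deque with maxlen drops the oldest entry once full.
        self._buf = deque(maxlen=maxlen)

    def stamp(self, channel, payload, ts):
        """Assign the next seq, buffer the serialized envelope, return it."""
        self._seq += 1
        envelope = {"channel": channel, "seq": self._seq, **payload}
        self._buf.append((self._seq, ts, json.dumps(envelope)))
        return envelope

    def replay_since(self, since_seq):
        """Serialized envelopes with seq > since_seq; nothing when None."""
        if since_seq is None:
            return []
        return [body for seq, _, body in self._buf if seq > since_seq]
```

Because the buffer is bounded, a client that reconnects after more than `maxlen` missed events receives only the newest `maxlen`; delivery is at-least-once within the buffer window, not a full history.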

D-4.3-websocket slice 2. The foundation slice landed the endpoints, manager, and `useWebSocket<T>` hook. This slice wires the paper engine to actually emit ticks so subscribed clients see live equity movement instead of polling. flint/paper/engine.py: - `PaperTradingEngine` gains a `ws_manager` attribute (defaults to None — unit tests + `flint backtest` continue to construct the engine without a manager and pay zero overhead). - `_run_live_session` per-bar loop, after the equity snapshot is built, broadcasts `{type: "tick", ts, equity, cash, unrealized_pnl, is_replay, total_trades}` to channel `paper:{session_id}`. Wrapped in try/except at debug log level so a flaky ws never tanks the engine's tick. flint/api/main.py: - Lifespan startup wires `paper_engine.ws_manager = ws_manager` after both are constructed. Comment notes the dependency direction so the next refactor doesn't accidentally invert it. Tests: - tests/test_paper_engine_ws_broadcast.py — 5 cases: * Engine default `ws_manager` is None * Engine accepts a manager assignment post-init * Broadcast envelope shape matches what the engine emits (channel, seq, type, ts, equity, total_trades) * Subscribers only receive their own session's ticks * Broken sockets don't propagate exceptions out of the manager - 47/47 regression green across paper/paper-funding/paper-multi-venue + the new ws-broadcast file. ruff clean. WAVE_STATUS: D-4.3-websocket → still 🟡 (paper engine broadcast + foundation shipped; live engine broadcast + hook migration + PaperTrading.tsx panel binding remain). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Continues the D-4.3-websocket buildout. After slice 2 added per-bar equity ticks, this slice adds the trade + fill streams so subscribers see *every* state change as it happens, not just the next equity snapshot. flint/paper/engine.py: - After persisting `new_trades` from the broker, the engine now broadcasts each one to `paper:{session_id}` as `{type: "trade", ...trade_dict}`. Reuses the same try/except swallow as the tick broadcast — a flaky ws never tanks the loop. flint/execution/live_base.py: - `LiveExecutionContext.__init__` gains a public `ws_manager` attribute (defaults to None — every concrete subclass and every unit test continues to construct without one). - `_handle_fill(order_id, fill)` broadcasts to `live:{session_id}` with `{type: "fill", order_id, market, venue, side, price, size, fee, ts}` when both `ws_manager` and `_session_id` are set. Uses `asyncio.ensure_future` so the fill flow stays synchronous from the OrderTracker's perspective; broadcast failures are swallowed at debug level so persistence/risk-guard paths can't be blocked by network hiccups. Tests: - tests/test_live_context_ws_broadcast.py — 4 cases on a stub LiveExecutionContext subclass: * Default `ws_manager` is None * `_handle_fill` broadcasts when manager + session_id set; envelope shape verified (channel, type, market, side, price) * Broken broadcast doesn't propagate — fill flow continues * Empty session_id skips the broadcast even if manager is set - 33/33 pass across the WS-related test files; 27/27 pass on the separate live-context regression files (test_multi_venue_live, test_live_context_data). ruff clean. WAVE_STATUS: D-4.3-websocket → still 🟡 (foundation + paper tick + paper trade + live fill all live; hook migration to drop polling + UI panel binding remain). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Wires SessionDetail to the per-session WebSocket so the equity / unrealized PnL / trade count update on every tick instead of every 2s poll. Augments rather than replaces polling — the polled SessionStatus still drives the full state (margin, fees, status phase, equity history endpoint), and the WS just overlays the fast-moving fields. ui/src/pages/PaperTrading.tsx: - New `useWebSocket<PaperWsTick>(/ws/paper/${sessionId})` subscription. - `wsTick = ws.data when type === 'tick'`. The metrics grid now reads `liveEquity = wsTick?.equity ?? status.equity`, `unrealizedPnl = wsTick?.unrealized_pnl ?? status.unrealized_pnl`, and `total_trades = wsTick?.total_trades ?? status.total_trades`, so the cells refresh as fast as the engine ticks. - Header gets a `WS LIVE` / `WS CONNECTING` / `WS OFFLINE` indicator driven by `ws.status` — the user can see at a glance whether the socket is healthy without inspecting the network panel. ui/src/test/hooks/useWebSocket.test.ts (new): - 6 vitest cases on a `MockSocket` stand-in: * Connects on mount when enabled * Skips when `enabled=false` * Parses incoming JSON, exposes `data`, sets `status` to 'open', tracks `lastSeq` * Drops `{type: ping}` envelopes (no `data` mutation) but still updates lastSeq when a real tick follows * Appends `?since=<lastSeq>` to the URL on reconnect after an abnormal close * Closes the socket on unmount 133/133 vitest, vite build clean, ruff clean. WAVE_STATUS: D-4.3-websocket → still 🟡 (paper UI bound; hook migration to drop polling and Live page binding remain). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

LiveMonitor.tsx now opens a WebSocket to `/ws/live/{sessionId}` per selected session. WS-emitted fills merge into the polled fills list (deduped by order_id + ts) so the fills tape updates as soon as LiveExecutionContext._handle_fill broadcasts, instead of waiting on the next 2s poll. ui/src/pages/LiveMonitor.tsx: - New `useWebSocket<LiveWsFill>` subscription gated on `enabled: !!activeId` so the hook idles when no session is selected. - `wsFills` accumulates fills from incoming `{type: "fill"}` envelopes; reset to empty when the user switches sessions. - Display fills = polled-fills ∪ ws-fills, deduped by composite `order_id|ts` key, sorted by ts. Polled fills win on first sight, WS fills append; either source filling first is fine. - Header gets the same `WS LIVE` / `WS CONNECTING` / `WS OFFLINE` indicator that PaperTrading.tsx grew in slice 3, gated on `activeId` (no indicator when no session yet). Tests: - No new tests this commit — the existing useWebSocket vitest covers the hook surface, and LiveMonitor is a thin consumer. - 133/133 vitest still green; vite build + ruff clean. WAVE_STATUS: D-4.3-websocket → still 🟡 (both pages bound; only remaining work is dropping the polling hooks entirely once the WS streams are confirmed solid in production — that's a confidence move, not a code change). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
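The merge rule for the fills tape is small enough to sketch. The page itself is TypeScript; the version below restates the same logic in Python for illustration, with `merge_fills` and the dict shape as hypothetical names.

```python
def merge_fills(polled, ws):
    """Union of polled and WS fills, deduped by `order_id|ts`, sorted by ts."""
    seen = {}
    # Polled fills are visited first, so they win when both sources
    # carry the same composite key.
    for fill in list(polled) + list(ws):
        key = f"{fill['order_id']}|{fill['ts']}"
        seen.setdefault(key, fill)
    return sorted(seen.values(), key=lambda f: f["ts"])
```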

Refines D-6.1-unified attribution. Foundation slice split closed-trade PnL evenly across strategies that touched the same market — wrong when one strategy opened and another closed. This commit threads the closing fill's order_id (which the engine tags with the strategy name) into the closed-trade dict so PnL goes to the actual authoring strategy. flint/execution/backtest_context.py: - Both close paths in `_apply_fill` (full close + flip, partial close) now write `exit_order_id: fill.order_id` into the closed-trade record. Liquidations don't get an order_id (no triggering fill) so they fall through to the even-split path. flint/portfolio/shared_engine.py: - `SharedCapitalPortfolioEngine.run()` per-trade attribution checks `trade["exit_order_id"]` for a `strategy_name:` prefix; when found, the entire PnL goes to that strategy. Even-split fallback preserved for liquidations and untagged trades. Tests: - tests/test_shared_capital_portfolio.py — new `TestPnlAttribution::test_pnl_attributed_to_closing_strategy`: one strategy opens long, another closes short; verify the closer's PnL share exceeds 50% of total (foundation slice would have split equally, ~50% each, so this test exercises the new path). - 9/9 tests in the file pass; ruff clean. WAVE_STATUS: D-6.1-unified attribution refined. Remaining D-6.1 work (per-strategy capital caps via tagged sub-buckets, dollar- neutral rebalancing) still deferred to a follow-on PR. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
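The attribution rule can be sketched as a single function. `attribute_pnl` is a hypothetical name and the trade dict is reduced to the two fields that matter here; the `strategy_name:` prefix convention and the even-split fallback are taken from the description above.

```python
def attribute_pnl(trade, strategies_on_market):
    """Map strategy name to its share of one closed trade's PnL."""
    exit_id = trade.get("exit_order_id") or ""
    prefix, sep, _ = exit_id.partition(":")
    if sep and prefix in strategies_on_market:
        # Tagged close: the whole PnL goes to the closing strategy.
        return {prefix: trade["pnl"]}
    # Liquidations and untagged trades: even split across the
    # strategies that touched the market.
    share = trade["pnl"] / len(strategies_on_market)
    return {name: share for name in strategies_on_market}
```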

Two test fixes uncovered by the wide pytest sweep at the end of this session. tests/test_mcp_server.py: - TestRunBacktest.test_returns_valid_json was patching `flint.api.routes.backtest._build_strategy` and `BacktestEngine`, but the D-4.7-full MCP refactor routes the tool through `flint.services.backtest.run_backtest_sync` directly. Updated to mock at the service boundary with a fake tearsheet-shaped dict (metrics + winning_trades + losing_trades + total_fees + engine_used). Also patches `_get_store` so the auto-fetch path short-circuits on a "data already local" branch. - test_unknown_strategy switches to mocking the service to raise ValueError, which the MCP tool catches and surfaces as `error`. tests/test_portfolio.py: - TestPortfolioEngine.test_two_strategies asserted `combined_equity == 60`. The D-1.1.b force-close fix can append a terminal equity point past the last candle when a strategy had an open position at engine exit, so the combined curve length is in {N, N+1}. Relaxed the assertion to accept either. Sweep status: pre-existing failures from missing optional deps (`eth_account` for hyperliquid_live, `solders` for wallet) remain — those are env issues, not code regressions. ruff clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Closes the Wave 3 working session. Pytest sweep over the full suite (skipping the four files with missing optional deps: ccxt, eth_account, solders) reports:

    2038 passed · 7 skipped · 0 failed in 5m39s

Up from 1861 at the start of the session: net +177 tests landed across the new managers, replay, snapshots, websocket layer, shared-capital portfolio engine, and orchestrator.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Reflects 28 commits on the restructure branch. All six phases now have concrete shipped state recorded next to them, and the "Recently done" section is repaired to call out the actual landed work (Waves 1, 2, 3 portfolio/UX, Wave 5 replay) instead of the old phase-7 + Hyperliquid-funding bullets. Adds a pointer to WAVE_STATUS.md for wave-by-wave detail; ROADMAP keeps the bird's-eye view, WAVE_STATUS owns the per-item state. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

flint/api/routes/replay.py: three GETs over the portfolio event log + snapshot store + replay primitive:

  GET /api/v1/replay/{session_id}/events?since=<seq>&limit=<n>
    Page through events in seq order. limit defaults to 200, capped at 5000. has_more flag for client-side pagination.
  GET /api/v1/replay/{session_id}/state?target_ts=<t>&initial_capital=<c>&use_snapshot=<bool>
    Run `replay()` and return BookState as JSON. Positions flatten to "venue|market" keys.
  GET /api/v1/replay/{session_id}/summary
    Cheap polling target: event_count, latest_seq, snapshot_count.

Registered under `/api/v1/replay`. Read-only: writes happen in the engine via the slice-4 writer hooks.

Tests:
- tests/test_replay_api.py, 10 cases:
  * Empty/populated events list, seq ordering, since pagination, limit + has_more, per-session isolation
  * State replay round-trip (open+close), partial intermediate-ts replay, unknown-session-returns-initial
  * Summary on empty + populated session (with snapshot)
- Test fixture swaps a fresh `FlintStore` onto `app.state` per test to bypass the FastAPI module-level app singleton's leftover state from earlier tests.

ruff clean.

WAVE_STATUS: D-6.4-replay slice 5 backend shipped. Frontend time-travel page binding remains deferred to a later slice.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
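The `since` / `limit` / `has_more` contract of the events endpoint can be sketched independently of FastAPI. `page_events` is a hypothetical helper, and treating `since` as an exclusive lower bound on seq with a default of -1 is an assumption borrowed from the MCP tool's signature.

```python
def page_events(events, since=-1, limit=200, max_limit=5000):
    """Return one page of events with seq > since, plus a has_more flag.

    `events` is assumed to be sorted by seq already.
    """
    limit = max(1, min(limit, max_limit))      # clamp to [1, max_limit]
    tail = [e for e in events if e["seq"] > since]
    return {"events": tail[:limit], "has_more": len(tail) > limit}
```

A client pages by passing the last seq it received as the next `since`, and stops when `has_more` is false.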

D-6.4-replay slice 5 MCP layer. After the REST endpoints landed, this commit exposes the same surface to MCP-compatible clients (Claude / Cursor / etc.) so AI workflows can drive time-travel queries against any session. flint/mcp_server.py: - `replay_summary(session_id)` — quick metadata: event count, latest seq, snapshot count. - `replay_state(session_id, target_ts, initial_capital=10_000)` — fold the log up to target_ts and return cash, realized PnL, fees, funding, borrow, fill counts, positions dict. - `list_replay_events(session_id, since=-1, limit=50)` — page through events in seq order. Caps limit at 5000 to bound the response size. All three call into the same `EventLogReader` + `replay()` + `SnapshotStore` services the REST endpoints use; no duplication. Tests: - tests/test_mcp_replay_tools.py — 7 cases: * Each tool on an empty session returns the right empty shape * `replay_summary` reflects event count + latest seq after writes * `replay_state` round-trips an open+close to realized_pnl == 10 * `list_replay_events` honors since + limit (with has_more flag) - 47/47 across all three MCP test files (test_mcp_server, test_mcp_standalone, test_mcp_replay_tools). ruff clean. Fixture trick: monkeypatches `flint.mcp_server._store` directly with a fresh `FlintStore(tmp_path)` so the cached singleton lookup in `_get_store()` returns the per-test DB instead of trying to reconstruct via `load_config()`. WAVE_STATUS: D-6.4-replay slices 1-5 (backend + REST + MCP) all shipped. Only frontend time-travel page binding remains deferred. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Closes D-6.4-replay end-to-end. Backend + REST + MCP shipped in prior commits; this slice adds the UI surface. ui/src/hooks/useReplay.ts: - `useReplaySummary(sessionId)` polls `/api/v1/replay/{id}/summary` every 5s for event_count / latest_seq / snapshot_count. - `useReplayState(sessionId, targetTs, initialCapital)` fires once per (session, ts) change against `/api/v1/replay/{id}/state`, exposes `{data, error, loading}`. Replay isn't a stream — it's a deliberate query, so no auto-retry. ui/src/pages/Replay.tsx: - Session-id loader (text input + LOAD button) — backtest run_ids and paper session_ids both work. - Summary cards (event count / latest seq / snapshot count). - Target-ts scrubber with epoch-second input + initial-capital input. Defaults to "now" once a session lands so the page shows current state on first render. - State cards: cash (gain/loss accent vs initial), realized PnL, fill count, liquidation count, fees, funding. - Open-positions table (venue/market, side, size, entry). - Empty-state placeholder when no session loaded. - Inline error banner on either summary or state fetch failure. ui/src/App.tsx: - New `/replay` route + `REPLAY` nav item (key 0). 133/133 vitest still green; vite build clean; ruff clean. WAVE_STATUS: D-6.4-replay → 🟢 (all 5 slices shipped end-to-end). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ts refresh Reflects D-6.4-replay end-to-end shipping + D-4.3-websocket per-session streams. README.md: - Paper Trading section: live PnL bullet adds "+ per-session WebSocket stream (`/ws/paper/{id}`)" so readers know polling is no longer the only path. New bullet on time-travel replay (event log + REST API + Replay UI page). - MCP Server section: tool count 17 → 20; explicitly names the three new replay tools (`replay_summary`, `replay_state`, `list_replay_events`). - Auto-counts block refreshed via `scripts/update_readme_counts.py`: 23 strategies, 26 providers, 80 REST endpoints, 20 MCP tools, 2055 tests. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Brings the AI-aware context doc up to date with the work this session. Future agents (and humans skimming) get a one-page map of: - The seven managers BacktestContext composes (PositionManager, CashManager, FillRecorder, OrderQueue, FundingLedger, BorrowLedger, MarketDataFeed) and what each owns. - The PortfolioMarginEngine pre-trade gauntlet. - The event-sourcing modules (event_log, replay, snapshots) plus the REST + MCP + UI surfaces that wrap them. - The flint/services/* layer and the rule that strategy templates live in services/strategies.py only (single source of truth). Two new rules in the Rules section: - New BacktestContext mutations route through the seven managers. - New strategy templates land in services/strategies.py only. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The session landed enough new public surface (event log + replay + WebSocket streams + portfolio orchestrator + manager-decomposed BacktestContext + service layer + per-session WS routes) that a patch bump understates the changeset. Goes to 1.4.0. pyproject.toml: 1.3.1 → 1.4.0. The capabilities endpoint reads this directly so no other version pins need updating. docs/concepts/architecture.md: - New "BacktestContext composes seven managers" subsection mapping each manager (PositionManager, CashManager, FillRecorder, OrderQueue, FundingLedger, BorrowLedger, MarketDataFeed) to what it owns. Notes the PortfolioMarginEngine pre-trade gauntlet and the component-tagged rejection log line. - New "Event sourcing + replay" subsection covering the portfolio_events table, the fold/replay primitive, snapshot fast-forward, and the three surfaces (REST + MCP + UI page) that expose it. Calls out the load-bearing parity test that pins byte-for-byte cash equality. UI docs page regenerated via `scripts/build_docs.py` so the architecture section in the browser stays in sync. Verified `_get_version()` returns 1.4.0; ruff clean; UI vite build clean; replay-related test files (event log + replay + snapshots + engine hooks + REST + MCP) all green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Full pull-request narrative covering all 6 phases, the 7-manager BacktestContext decomposition, D-4.3-websocket end-to-end, D-6.4-replay all 5 slices, the Rust ports (TxCostModel + OrderbookFiller), and the service-layer extraction. Lists migration notes (no breaking API changes; new ctor kwargs are opt-in) and the follow-on items still blocked on testnet secrets. Stays out of git history once the PR merges — this is a human-readable summary for the merge review, not a runtime artifact. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

D-6.4-replay slice 6. Closes the snapshot loop. Before this commit, SnapshotStore was a write-only API — callers had to manually invoke `SnapshotStore.write()`. Without snapshots, large sessions force a full fold-from-seq=0 on every replay query (O(N events)). Adds optional `snapshot_store` + `snapshot_every` (default 10_000) ctor kwargs to BacktestContext. When `snapshot_store` is wired: * `_emit()` increments a `_events_since_snapshot` counter on every appended event. * When the counter crosses `snapshot_every`, `_compact_snapshot()` folds the entire log into a fresh BookState (using snapshot fast-forward when a prior snapshot exists, so each compaction is O(events_since_last_snapshot) rather than O(N events)). * Counter resets to 0 after each compaction. * Failures are swallowed at debug level — engine keeps running. Backward compatible: legacy callers without `snapshot_store` see zero new behavior. The slice-3 SnapshotStore manual-write API stays intact for callers that want explicit control. Tests: - tests/test_auto_compaction.py — 5 cases: * No snapshot when store unset (zero-overhead path) * Snapshot after exactly N events at the threshold * Multiple snapshots at each N-event boundary * Counter resets after compaction (sub-threshold burst doesn't double-fire) * **Load-bearing**: replay with snapshot fast-forward matches replay-from-zero on the same target_ts after compaction — compaction never produces a divergent snapshot 5/5 pass; ruff clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
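The counter-and-threshold loop can be sketched in isolation. `AutoCompactor` is a hypothetical wrapper around the behaviour described for `_emit()` and `_compact_snapshot()`; resetting the counter even when compaction fails is an assumption, since the commit only states that failures are swallowed.

```python
import logging

log = logging.getLogger("compaction_sketch")


class AutoCompactor:
    """Hypothetical reduction of the auto-compaction trigger."""

    def __init__(self, compact, snapshot_every=10_000):
        self._compact = compact              # callable that writes a snapshot
        self.snapshot_every = snapshot_every
        self.events_since_snapshot = 0

    def on_event_appended(self):
        self.events_since_snapshot += 1
        if self.events_since_snapshot < self.snapshot_every:
            return
        try:
            self._compact()
        except Exception:
            # Swallowed at debug level so the engine keeps running.
            log.debug("snapshot compaction failed", exc_info=True)
        # Assumption: reset regardless of outcome, so a persistent
        # failure retries once per window rather than on every event.
        self.events_since_snapshot = 0
```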

D-6.4-replay E2E. Stronger production-confidence signal than the synthetic-event tests in test_event_log + test_replay: drives a real `MACrossoverStrategy` against a sine-wave-with-drift price series for 300 bars, captures the event log + auto-compacted snapshots in flight, then asserts replay reproduces the live context's `cash` + `total_fees` byte-for-byte. tests/test_replay_e2e_backtest.py — three classes: - `TestEndToEndReplayMatchesLive::test_replay_reproduces_account_cash` Real strategy run → replay-from-zero → cash equality (1e-9). - `TestAutoCompactionInRealBacktest::test_compaction_during_backtest_preserves_correctness` Same scenario with `snapshot_every=20` so the auto-compactor fires repeatedly during the run. Replay-with-snapshot vs replay-without-snapshot must agree on cash, realized PnL, fill count, and every position's (side, size, entry_price). - `TestReplayAtIntermediateTimestamps::test_intermediate_ts_replay_traces_equity_curve` Replay at 25% / 50% / 75% / 100% of the candle range; assert monotonic fill_count and finite cash. Helper: `_drive_strategy(ctx, strategy, candles)` mimics the bar loop BacktestEngine runs internally. Direct-loop is necessary because BacktestEngine.run() routes through Rust when capabilities allow, which bypasses the supplied event-log-wired BacktestContext. 3/3 pass; ruff clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

D-6.4-replay UI slice 6. Promotes the bare ts-input to a real time-travel UX. ui/src/hooks/useReplay.ts: - New `useReplayEvents(sessionId, since, limit)` wraps `/api/v1/replay/{id}/events`. One-shot fetch (re-fires on hook arg change) so the page can keep a 1000-event window in memory for the slider's range + the tail panel's render list. - New `ReplayEvent` + `ReplayEventsPage` types. ui/src/pages/Replay.tsx: - Numeric ts-input replaced with an `<input type="range">` slider bounded to [first event ts, last event ts]. Background is the amber accent so the position-along-the-history is obvious at a glance. - Range labels show start ts, end ts, and the "N events in window" count. - Step controls: ← PREV EVENT / NEXT EVENT → walk one event ts at a time; ⏮ START / END ⏭ jump to the boundaries. Wired against the events array so steps land on actual event timestamps, not arbitrary epoch values. - New "EVENT.TAIL" panel under the positions table: shows the most recent 50 events with `ts <= target_ts`, color-coded by kind (gain for fills, loss for liquidations, amber for order.*, ghost for everything else). Header shows "showing N of M folded" so the user knows exactly how much state went into the displayed BookState. 133/133 vitest still green; vite build clean; ruff clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Closes the four-part Wave 5 wrap-up. After ports of TxCostModel and OrderbookFiller landed in earlier commits, this slice ports the two remaining hot-path ledgers (funding rates and Jupiter borrow rates) to Rust with PyO3 bindings.

rust/src/engine/funding_ledger.rs:

- `FundingLedger` owns per-market and per-venue history. `add`, `latest`, `recent(market, lookback)`, `by_venue(market, lookback)` mirror the Python class one-for-one. 4 cargo tests.

rust/src/engine/borrow_ledger.rs:

- `BorrowLedger` owns `_history` per market + per-trade `payments` ledger + `total_paid` counter. `record`, `record_payment`, `add_paid`, `latest`, `recent(market, lookback)`, `cumulative_at(market, ts)` (linear walk early-exits at first point past `ts`, matching Python). 3 cargo tests.

PyO3 bindings:

- `flint_core.FundingLedger` exposes `add(market, venue, ts, rate)` / `latest` / `recent(market, lookback=24)` / `by_venue(market, lookback=24)`.
- `flint_core.BorrowLedger` exposes `record` / `record_payment` / `add_paid` / `total_paid` (getter) / `latest` / `recent` / `cumulative_at` / `payments_count`.

Tests:

- tests/test_rust_ledger_parity.py: 7 cases pinning every method's output to 1e-9 against the canonical Python implementations:
  * Funding: latest, recent pairs, by_venue groupings, unknown-market empty fallback
  * Borrow: latest + recent, cumulative_at across the whole query-ts window (before-all, exact-match, mid-range, past-end), total_paid + payments-count round-trip

7/7 parity green; 7/7 cargo green; 91/91 across the whole replay suite (event log + replay + snapshots + engine hooks + REST + MCP + auto-compaction + E2E + ledger parity). ruff clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
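As an illustration of the `cumulative_at` walk, here is a minimal Python sketch. It is not the flint implementation: the class name `BorrowLedgerSketch`, the `(ts, rate)` point layout, and the choice to define the cumulative value as a plain sum of recorded rates are all assumptions, and the sketch further assumes points are recorded in ascending timestamp order.

```python
from dataclasses import dataclass, field


@dataclass
class BorrowLedgerSketch:
    """Hypothetical per-market borrow-rate history (not flint's class)."""
    _history: dict = field(default_factory=dict)  # market -> [(ts, rate), ...]

    def record(self, market, ts, rate):
        # Assumes callers append points in ascending ts order.
        self._history.setdefault(market, []).append((ts, rate))

    def latest(self, market):
        points = self._history.get(market)
        return points[-1][1] if points else None

    def recent(self, market, lookback=24):
        return self._history.get(market, [])[-lookback:]

    def cumulative_at(self, market, ts):
        """Sum rates of all points with point_ts <= ts.

        Linear walk that exits at the first point past `ts`; this is only
        correct because the history is sorted by timestamp.
        """
        total = 0.0
        for point_ts, rate in self._history.get(market, []):
            if point_ts > ts:
                break
            total += rate
        return total


ledger = BorrowLedgerSketch()
for ts, rate in [(100, 0.5), (200, 0.25), (300, 0.125)]:
    ledger.record("SOL", ts, rate)

# The four query positions named in the parity tests.
before_all = ledger.cumulative_at("SOL", 50)
exact_match = ledger.cumulative_at("SOL", 200)
mid_range = ledger.cumulative_at("SOL", 250)
past_end = ledger.cumulative_at("SOL", 999)
```

A binary search would also work on sorted history; the linear walk is shown because that is the behavior the commit says the Rust port mirrors.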

Wave 5 closeout records the four follow-on items after D-6.4-replay: auto-compaction, E2E parity over real strategy, replay UI polish, Rust ledger ports. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

sohan-shingade and others added 30 commits April 24, 2026 11:57

sohan-shingade and others added 26 commits April 25, 2026 00:38

docs: refresh README counts (2055 → 2070 tests after Rust ledger + auto-compaction + E2E + UI polish)

525cf4a

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

docs: PR_DESCRIPTION refresh — auto-compaction, E2E parity, Rust ledgers, UI polish

24ed728

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

sohan-shingade merged commit ad687c2 into main Apr 25, 2026

sohan-shingade deleted the restructure branch April 25, 2026 19:14

sohan-shingade merged 56 commits into main from restructure
