A low-latency systems engineering reference applied to cryptocurrency markets. Demonstrates multi-regime strategy routing, lock-free config hot-swap, and SIMD-accelerated state estimation under microsecond constraints.
Built in 30 days as an architectural exercise. Open-sourced for educational and reference purposes.
This repository is a systems engineering reference, not a deployable trading strategy. Strategy parameters (thresholds, regime classifier weights, risk limits) are intentionally omitted from this public release.
Retail HFT on public cloud infrastructure faces a fundamental physical barrier: institutional market makers operate in colocated facilities with sub-millisecond latency to exchange matching engines, while public cloud deployments incur ~5ms cross-datacenter latency regardless of software optimization. This project's value lies in its architecture, not its P&L.
The entire engine runs on four dedicated CPU cores with zero mutex overhead on the hot path.
| Core | Thread | Role |
|---|---|---|
| 0 | Config Watchdog | inotify file watch → atomic<Config*> swap (RCU pattern) |
| 1 | Market Thread | Binance WebSocket bookTicker + aggTrade ingestion |
| 2 | Trade Thread | Order fill / cancel WebSocket ingestion |
| 3 | Engine Loop | Tick → regime detection → strategy → order management |
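
The core-to-thread layout in the table can be sketched with `pthread_setaffinity_np` on Linux. This is a minimal illustration, not the repository's code: the helper names `pin_thread_to_core` and `pin_current_thread` are hypothetical, and it assumes glibc on Linux.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // exposes pthread_setaffinity_np and sched_getcpu in glibc
#endif
#include <pthread.h>
#include <sched.h>

// Hypothetical helper: restrict `thread` to a single logical CPU.
// Returns true when the kernel accepted the affinity mask.
inline bool pin_thread_to_core(pthread_t thread, int core_id) {
    if (core_id < 0 || core_id >= CPU_SETSIZE) return false;
    cpu_set_t mask;
    CPU_ZERO(&mask);           // start from an empty CPU set
    CPU_SET(core_id, &mask);   // allow exactly one CPU
    return pthread_setaffinity_np(thread, sizeof(mask), &mask) == 0;
}

// Hypothetical convenience wrapper for the calling thread.
inline bool pin_current_thread(int core_id) {
    return pin_thread_to_core(pthread_self(), core_id);
}
```

Each of the four threads would call `pin_current_thread(n)` with its own core index as its first action. Affinity alone does not keep other processes off those cores; that typically needs kernel-level isolation as well, which is outside this sketch.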
- Lock-Free Config Hot-Swap: `inotify` detects `config.json` changes; `g_active_config.exchange(new_config)` replaces the entire config in a single atomic operation. Full parameter tuning without a bot restart and with no mutex on the hot path (poor man's RCU).
- Zero-Allocation Hot Path: Order IDs generated directly into stack buffers via `std::to_chars`. Prefix matching uses 4-byte integer comparison (`0x5f746e65` == `"ent_"`) instead of `strcmp`. Order response parsing via the `simdjson` SIMD JSON parser.
- Cache-Line Padding: `alignas(64)` on shared atomics to prevent L1 false sharing across cores.
- Hardware Timestamping: `__rdtsc` intrinsic for sub-nanosecond latency measurement, cross-calibrated against `CLOCK_MONOTONIC_RAW`.
- Core Pinning: Each thread bound to a dedicated physical core via `pthread_setaffinity_np`, isolating the hot path from OS context switching.
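
Two of the techniques above, the atomic config swap and the zero-allocation order ID with integer prefix matching, can be sketched as follows. This is an illustration under stated assumptions rather than the repository's implementation: the `Config` fields, the `ent_<sequence>` ID layout, and the helper names `publish_config`, `current_config`, `make_entry_id`, and `has_entry_prefix` are all hypothetical. Only `g_active_config`, `std::to_chars`, `alignas(64)`, and the constant `0x5f746e65` come from the description above.

```cpp
#include <atomic>
#include <cassert>
#include <charconv>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <system_error>

// Hypothetical config payload; the real fields are omitted from the public release.
struct Config {
    double z_entry_threshold;
    int    max_open_orders;
};

// Placed on its own cache-line boundary so watchdog writes do not
// false-share with neighbouring hot data.
alignas(64) std::atomic<Config*> g_active_config{nullptr};

// Watchdog side: publish a fully built config, return the previous pointer.
// Assumption: the caller keeps the old object alive until no reader can
// still be using it (for example by retiring it one swap later).
inline Config* publish_config(Config* new_config) {
    assert(new_config != nullptr);
    return g_active_config.exchange(new_config, std::memory_order_acq_rel);
}

// Engine side: a single acquire load per tick, no lock taken.
inline const Config* current_config() {
    return g_active_config.load(std::memory_order_acquire);
}

// Writes "ent_<seq>" plus a terminating NUL into buf without heap allocation.
// Returns the ID length, or 0 if the buffer is too small.
inline std::size_t make_entry_id(char* buf, std::size_t cap, std::uint64_t seq) {
    if (cap < 6) return 0;  // "ent_" + at least one digit + NUL
    std::memcpy(buf, "ent_", 4);
    const auto res = std::to_chars(buf + 4, buf + cap - 1, seq);
    if (res.ec != std::errc{}) return 0;
    *res.ptr = '\0';
    return static_cast<std::size_t>(res.ptr - buf);
}

// Compares the first four bytes as one 32-bit integer.
// The constant assumes a little-endian host: bytes 'e','n','t','_'.
inline bool has_entry_prefix(const char* id, std::size_t len) {
    if (len < 4) return false;
    std::uint32_t head;
    std::memcpy(&head, id, sizeof(head));
    return head == 0x5f746e65u;
}
```

The `memcpy` into a `std::uint32_t` avoids unaligned-access and aliasing problems; compilers typically lower it to a single 32-bit load. Safe reclamation of the old `Config` is the part a full RCU scheme would add, and this sketch leaves it to the caller.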
Three structurally distinct engines demonstrate different mathematical approaches to market microstructure. Every 60 seconds the regime classifier re-evaluates volatility, taker delta, and the Ornstein-Uhlenbeck (OU) β coefficient, then routes ticks to the appropriate engine.
- RollingZEngine (CHOPPY regime): Triple-filter mean reversion: rolling Z-score threshold + EMA OU coefficient gate + Order Flow Imbalance direction confirmation. `RollingZScore` uses O(1) incremental `sum`/`sum_sq` updates with periodic full recalculation to reset floating-point drift.
- KinematicEngine (TRENDING regime): Price modeled as a physical system `[position, velocity, acceleration]`. `PhysicsState` updated via an AVX2 `_mm256_fmadd_pd` Kalman step executed in a single 256-bit register. Jerk condition (rate of change of acceleration) gates entry to accelerating trends only.
- HawkesEngine (TOXIC regime): Self-exciting Hawkes Process models event clustering. `hawkes_energy` increments by `alpha` per event and decays as `exp(-beta*dt)`. Energy threshold breach triggers OBI-directional entry to capture post-shock aftershocks.
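
The scalar bookkeeping behind the first and third engines can be sketched as below. This is a hedged illustration, not the repository's code: the class layout, the `recalc_every` default of 4096, the use of population variance, and the `HawkesEnergy` struct are assumptions. Only the `sum`/`sum_sq` update idea, the periodic full recalculation, and the `alpha` / `exp(-beta*dt)` rule come from the description above.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Rolling Z-score over a fixed window using O(1) running sums.
class RollingZScore {
public:
    explicit RollingZScore(std::size_t window, std::size_t recalc_every = 4096)
        : buf_(window, 0.0), recalc_every_(recalc_every) {
        assert(window > 0 && recalc_every > 0);
    }

    // Pushes x and returns its Z-score against the current window.
    double update(double x) {
        if (count_ == buf_.size()) {      // window full: evict the oldest sample
            const double old = buf_[head_];
            sum_ -= old;
            sum_sq_ -= old * old;
        } else {
            ++count_;
        }
        buf_[head_] = x;
        head_ = (head_ + 1) % buf_.size();
        sum_ += x;
        sum_sq_ += x * x;
        if (++since_recalc_ >= recalc_every_) recompute();  // reset accumulated drift

        const double n = static_cast<double>(count_);
        const double mean = sum_ / n;
        const double var = std::max(0.0, sum_sq_ / n - mean * mean);  // population variance
        return var > 0.0 ? (x - mean) / std::sqrt(var) : 0.0;
    }

private:
    // Full O(window) pass over the stored samples.
    void recompute() {
        sum_ = 0.0;
        sum_sq_ = 0.0;
        for (std::size_t i = 0; i < count_; ++i) {
            sum_ += buf_[i];
            sum_sq_ += buf_[i] * buf_[i];
        }
        since_recalc_ = 0;
    }

    std::vector<double> buf_;   // ring buffer, allocated once at construction
    std::size_t recalc_every_;
    std::size_t head_ = 0, count_ = 0, since_recalc_ = 0;
    double sum_ = 0.0, sum_sq_ = 0.0;
};

// Exponentially decaying event energy: +alpha per event, exp(-beta*dt) decay.
struct HawkesEnergy {
    double alpha;
    double beta;
    double energy = 0.0;
    double last_t = 0.0;  // timestamp of the last update, in seconds

    double on_event(double t) {
        energy = energy * std::exp(-beta * (t - last_t)) + alpha;
        last_t = t;
        return energy;
    }
};
```

For example, with `alpha = 1` and `beta = ln 2`, two events one second apart leave an energy of 1.5: the first contribution has halved by the time the second arrives. An entry rule would compare that value against a threshold, which is among the parameters omitted from this release.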
- 6-State FSM: `NONE → PENDING_ENTRY → LONG/SHORT → PENDING_EXIT → PENDING_EMERGENCY`. Maker chase, two-tier stop loss (maker chase → market order fallback), trailing stop.
- Ghost Fix: REST API resync triggered on WebSocket silence detection to recover from missed order events.
- Graceful Shutdown: First SIGINT blocks new entries and attempts maker exit. Market order forced after 10s timeout. Second SIGINT triggers immediate termination.
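
The six states listed above can be sketched as a pure transition function. The state names come from the description. The `OrderEvent` enum, the function name `next_state`, and every transition rule are assumptions made for illustration, since the real event set is not documented here.

```cpp
#include <cstdint>

enum class PosState : std::uint8_t {
    NONE, PENDING_ENTRY, LONG, SHORT, PENDING_EXIT, PENDING_EMERGENCY
};

// Hypothetical event set; the actual engine may use different triggers.
enum class OrderEvent : std::uint8_t {
    ENTRY_SUBMITTED, ENTRY_FILLED_BUY, ENTRY_FILLED_SELL, ENTRY_CANCELED,
    EXIT_SUBMITTED, EXIT_FILLED, STOP_ESCALATED
};

// Returns the next state; events that do not apply leave the state unchanged.
constexpr PosState next_state(PosState s, OrderEvent e) {
    switch (s) {
    case PosState::NONE:
        if (e == OrderEvent::ENTRY_SUBMITTED) return PosState::PENDING_ENTRY;
        break;
    case PosState::PENDING_ENTRY:
        if (e == OrderEvent::ENTRY_FILLED_BUY)  return PosState::LONG;
        if (e == OrderEvent::ENTRY_FILLED_SELL) return PosState::SHORT;
        if (e == OrderEvent::ENTRY_CANCELED)    return PosState::NONE;
        break;
    case PosState::LONG:
    case PosState::SHORT:
        if (e == OrderEvent::EXIT_SUBMITTED) return PosState::PENDING_EXIT;
        if (e == OrderEvent::STOP_ESCALATED) return PosState::PENDING_EMERGENCY;
        break;
    case PosState::PENDING_EXIT:
        if (e == OrderEvent::EXIT_FILLED)    return PosState::NONE;
        // Assumed: maker chase gave up, fall back to a market order.
        if (e == OrderEvent::STOP_ESCALATED) return PosState::PENDING_EMERGENCY;
        break;
    case PosState::PENDING_EMERGENCY:
        if (e == OrderEvent::EXIT_FILLED) return PosState::NONE;
        break;
    }
    return s;
}
```

Keeping the transition function `constexpr` and free of side effects lets the table be checked with `static_assert` at compile time, while order submission and the REST resync stay in the surrounding engine loop.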
- Hardware-level profiling and timing primitives (`__rdtsc`, core pinning, cache-line padding)
- Lock-free concurrency across multiple threads with atomic RCU patterns
- SIMD vectorization of non-trivial math (Kalman state update in a single AVX2 FMA instruction)
- Zero-allocation hot paths with compile-time optimized string handling
- Multi-strategy routing driven by online market regime classification
- State machine design for asynchronous order lifecycle under network failures
- C++17 or later compiler (GCC / Clang)
- simdjson — AVX2-accelerated JSON parser
- OpenSSL — WebSocket TLS + HMAC signing
`g++ -O3 -march=native -std=c++17 -pthread main.cpp simdjson.cpp -o delta_hft -lssl -lcrypto`

MIT License: see the LICENSE file.
- Delta Cast — Kernel-level virtual ASIO driver
- Delta Engine — Custom D3D11 game engine (USPTO Patent Pending #19/641,687)