This repository contains tools and notebooks for detecting spoofing behaviour in high-frequency trading markets. The pipeline trains an XGBoost classifier on simulated orderbook data and applies it to real Euronext data replayed through the Atom execution engine.
spoofing lib ──► simulated order flow ──► XGBoost training ──┐
                                                             ▼
Euronext data ──► data_prep ──► Atom replay ──► model inference
- Simulation: the `spoofing` library generates synthetic orderbook sessions with labelled spoofer agents (`is_spoofer` target)
- Training: `HFT Training Algo.ipynb` trains an XGBoost model on the simulated data using orderbook features (spread, price levels, volume imbalance, order rank, …)
- Preparation: `data_prep` formats raw Euronext market data into the Atom order file format
- Replay: Atom replays the formatted Euronext data through its orderbook engine
- Inference: the trained model is applied to the replayed Euronext data to detect spoofing
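As a rough illustration of two of the orderbook features named in the Training step, the sketch below computes a bid-ask spread and a volume imbalance from a single book snapshot. The `book_features` helper, the snapshot layout and the imbalance formula are assumptions made for this example; the notebook's actual feature definitions may differ.

```python
from typing import Dict, List, Tuple

# Hypothetical snapshot layout: each side is a list of (price, quantity) levels.
Level = Tuple[float, int]


def book_features(bids: List[Level], asks: List[Level]) -> Dict[str, float]:
    """Compute two illustrative features from one orderbook snapshot."""
    if not bids or not asks:
        raise ValueError("both sides of the book need at least one level")
    best_bid = max(price for price, _ in bids)
    best_ask = min(price for price, _ in asks)
    bid_volume = sum(qty for _, qty in bids)
    ask_volume = sum(qty for _, qty in asks)
    total = bid_volume + ask_volume
    return {
        "spread": best_ask - best_bid,
        # Assumed definition: (bid - ask) / (bid + ask), which lies in [-1, 1].
        "volume_imbalance": (bid_volume - ask_volume) / total if total else 0.0,
    }


# Made-up snapshot, for illustration only.
features = book_features(
    bids=[(99.5, 300), (99.0, 300)],
    asks=[(100.0, 100), (100.5, 300)],
)
print(features)
```

A positive imbalance means more resting volume on the bid side than on the ask side. Whether the model uses this exact normalisation is not stated in this README.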
- JupyterLab with R kernel and language server
- Python (pandas, numpy, scikit-learn, xgboost)
- Java 17 + Maven
- R (tidyverse, ggplot2, dplyr, rJava)
Repositories cloned automatically at startup:
- `relmaazouz/atom`: Atom orderbook execution engine
- `relmaazouz/data_prep`: Euronext → Atom format conversion
- `relmaazouz/spoofing`: Spoofing simulation library
- `relmaazouz/ml`: ML notebooks (XGBoost training and inference)
docker compose up --build

The first build takes several minutes. Subsequent starts are fast.
On startup, the atom and data_prep repositories are automatically cloned into ./notebooks/.
Open your browser at:
http://localhost:8888
Enter the token: changeme
To change the token, edit the `JUPYTER_TOKEN` variable in `docker-compose.yml`.
Open a terminal in JupyterLab (File > New > Terminal) and run:
cd ~/work/atom
mvn -f pom.xml package

To stop the environment, run on the host:

docker compose down

All commands below are run from a JupyterLab terminal (File > New > Terminal) inside ~/work/atom/dev/.

cd ~/work/atom/dev

Generate a simulation with ZIT agents and redirect the output to a file:

java -cp target/atom-1.15.jar fr.cristal.smac.atom.Generate <nbAgents> <nbOrderbooks> <nbTurns> <nbDays> > myfile

Example with 10 agents, 1 orderbook, 1000 ticks, 1 day:

java -cp target/atom-1.15.jar fr.cristal.smac.atom.Generate 10 1 1000 1 > myfile

Extract price and agent data for analysis:
# Price series
grep '^Price' myfile > prices.csv
# Specific agent
grep '^Agent' myfile | grep ZIT1 > agent.csv

Plot prices in R (JupyterLab R notebook):
prices <- read.csv(file='prices.csv', sep=";", header=TRUE)
plot(prices$price, type='l', col='red')

Replay an existing order file through the Atom engine:

java -cp target/atom-1.15.jar fr.cristal.smac.atom.Replay <dataFile>

Example with a bundled sample file:

java -cp target/atom-1.15.jar fr.cristal.smac.atom.Replay ../dist/data/orderFileExample1

To verify consistency between a generated file and its replay (virtuous loop):
java -cp target/atom-1.15.jar fr.cristal.smac.atom.Generate 10 1 1000 1 > myfile1
java -cp target/atom-1.15.jar fr.cristal.smac.atom.Replay myfile1 > myfile2
diff myfile1 myfile2

If `diff` prints nothing, the replay reproduced the generated file exactly.

Order lines follow the syntax (;-separated):
Order;<orderbook>;<agent>;<id>;<type>;<direction>;<price>;<qty>
| Type | Description |
|---|---|
| `L` | Limit order |
| `M` | Market order |
| `I` | Iceberg order |
| `C` | Cancel order |
| `U` | Update order (quantity) |

Direction: `B` (Bid) / `A` (Ask)
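The order syntax above can be read back in Python with a small parser. This is a minimal sketch, not part of Atom: the `Order` dataclass, the `parse_order` helper and the sample line are hypothetical, and it assumes every order line carries all eight fields with a numeric price and an integer quantity. Real Atom files may use a different field layout for some order types.

```python
from dataclasses import dataclass

# Type and direction codes as listed in this README.
ORDER_TYPES = {"L": "Limit", "M": "Market", "I": "Iceberg", "C": "Cancel", "U": "Update"}
DIRECTIONS = {"B": "Bid", "A": "Ask"}


@dataclass
class Order:
    orderbook: str
    agent: str
    order_id: str
    order_type: str
    direction: str
    price: float  # assumption: the price field is numeric
    qty: int      # assumption: the quantity field is an integer


def parse_order(line: str) -> Order:
    """Parse one 'Order;...' line written in the 8-field syntax shown above."""
    fields = line.strip().split(";")
    if len(fields) != 8 or fields[0] != "Order":
        raise ValueError(f"not an 8-field Order line: {line!r}")
    _, orderbook, agent, order_id, order_type, direction, price, qty = fields
    if order_type not in ORDER_TYPES:
        raise ValueError(f"unknown order type: {order_type!r}")
    if direction not in DIRECTIONS:
        raise ValueError(f"unknown direction: {direction!r}")
    return Order(orderbook, agent, order_id, order_type, direction, float(price), int(qty))


# Hypothetical sample line; the values are made up for illustration.
order = parse_order("Order;LVMH;ZIT1;1;L;B;14500;10")
print(ORDER_TYPES[order.order_type], DIRECTIONS[order.direction], order.price, order.qty)
```

Rejecting unknown type or direction codes early makes malformed lines visible before they reach a feature pipeline.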
Market commands (lines starting with !):
| Command | Effect |
|---|---|
| `!P` | Print orderbook state |
| `!C` | Switch to continuous fixing |
| `!F` | Switch to fix fixing |
| `!K` | Close market and reinit |
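When inspecting an order file by hand, it can help to separate market commands from order lines. The sketch below does that under stated assumptions: `classify_line` and `describe_command` are hypothetical helpers, the sample lines are made up, and a command is assumed to be identified by its first two characters.

```python
from collections import Counter

# Command descriptions copied from the table above.
COMMANDS = {
    "!P": "Print orderbook state",
    "!C": "Switch to continuous fixing",
    "!F": "Switch to fix fixing",
    "!K": "Close market and reinit",
}


def classify_line(line: str) -> str:
    """Return 'command', 'order' or 'other' for one line of an order file."""
    text = line.strip()
    if text.startswith("!"):
        return "command"
    if text.startswith("Order;"):
        return "order"
    return "other"


def describe_command(line: str) -> str:
    """Look up a command by its first two characters (an assumption)."""
    return COMMANDS.get(line.strip()[:2], "unknown command")


# Hypothetical file content, for illustration only.
sample = [
    "Order;LVMH;ZIT1;1;L;B;14500;10",
    "!P",
    "Order;LVMH;ZIT2;2;L;A;14600;5",
    "!K",
]

counts = Counter(classify_line(line) for line in sample)
print(dict(counts))
for line in sample:
    if classify_line(line) == "command":
        print(line, "->", describe_command(line))
```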