This repository demonstrates applied machine learning on time-series anomaly detection, using the Numenta Anomaly Benchmark (NAB) dataset. The pipeline features transformer-based classification and Monte Carlo Dropout uncertainty quantification.
- Provide a safe, reproducible environment for modelling defence/security-relevant event sequences without sensitive data.
- Demonstrate end-to-end ML pipeline skills: data loading, feature engineering, model training, evaluation, and uncertainty quantification.
- Implement modern ML techniques: transformers for time-series and Bayesian uncertainty estimation via MC Dropout.
- Benchmark performance on a recognised industry dataset: NAB.
- NAB Data Loading: Utilities to load and preprocess Numenta Anomaly Benchmark datasets
- Sliding Window Processing: Convert time-series to windowed sequences for classification
- Time-Series Feature Extraction: Statistical features (mean, std, slope, skewness, kurtosis, etc.)
- Transformer Architecture: Attention-based model for temporal pattern recognition
- MC Dropout Uncertainty: Monte Carlo Dropout for uncertainty quantification at inference
- Calibration Analysis: Metrics and visualisations for uncertainty quality assessment
- Python 3.8+
- Git
-
Clone the repository:
git clone https://github.com/your-username/Chrono-Sentinel.git cd Chrono-Sentinel -
Clone the NAB repository (for data):
git clone https://github.com/numenta/NAB.git NAB_temp
-
Install dependencies:
pip install -r requirements.txt pip install -e .
Train the model:
python scripts/train.pyEvaluate with MC Dropout uncertainty:
python scripts/evaluate.pyOr explore interactively via notebooks:
pip install jupyter
jupyter notebook notebooks/python scripts/train.py --epochs 50 --batch-size 32 --lr 1e-3Key arguments:
--epochs: Maximum training epochs (default: 50)--batch-size: Batch size (default: 32)--window-size: Sliding window size (default: 50)--d-model: Transformer embedding dimension (default: 64)--dropout: Dropout probability for MC Dropout (default: 0.2)--patience: Early stopping patience (default: 10)
python scripts/evaluate.py --model-path outputs/best_model.pt --mc-samples 30Key arguments:
--model-path: Path to trained model checkpoint--mc-samples: Number of MC Dropout forward passes (default: 30)--threshold: Classification threshold (default: 0.5)
Training and evaluation produce:
outputs/best_model.pt- Trained model checkpointoutputs/training_history.json- Training metricsoutputs/training_history.png- Loss curvesoutputs/evaluation_metrics.json- Test metricsoutputs/roc_curve.png- ROC curveoutputs/calibration_curve.png- Reliability diagramoutputs/uncertainty_histogram.png- Uncertainty analysisoutputs/predictions_timeline.png- Predictions with confidence bands
Chrono-Sentinel/
├── README.md
├── requirements.txt
├── setup.py
├── threatsim/
│ ├── __init__.py
│ ├── data.py # NAB data loading, windowing, DataLoaders
│ ├── features.py # Statistical feature extraction
│ ├── models.py # Transformer with MC Dropout
│ └── utils.py # Training utilities, visualisation helpers
├── scripts/
│ ├── train.py # Training pipeline
│ └── evaluate.py # MC Dropout evaluation
├── notebooks/
│ ├── 01_generate_data.ipynb # Data exploration
│ ├── 02_feature_engineering.ipynb # Feature analysis
│ └── 03_model_training.ipynb # Interactive training
├── NAB_temp/ # Cloned NAB repository (data source)
└── outputs/ # Training outputs (created at runtime)
The transformer architecture is designed for time-series anomaly classification:
Input Window (batch, seq_len)
↓
Linear Projection → (batch, seq_len, d_model=64)
↓
Sinusoidal Positional Encoding
↓
2× Transformer Encoder Layers (4 heads, dropout=0.2)
↓
Mean Pooling
↓
Classification Head (Linear → Dropout → Linear → Sigmoid)
↓
Output: Anomaly Probability
MC Dropout Uncertainty: Dropout remains active during inference. Multiple forward passes produce a distribution of predictions; the mean gives the estimate and standard deviation gives uncertainty.
MC Dropout treats dropout as approximate Bayesian inference:
- Keep dropout enabled at inference time
- Run N forward passes (default: 30)
- Mean of predictions → point estimate
- Std of predictions → epistemic uncertainty
This allows the model to express "I don't know" through higher uncertainty, particularly valuable for anomaly detection where novel patterns may not match training data.
We evaluate uncertainty quality via:
- Expected Calibration Error (ECE): Measures prediction confidence vs actual accuracy
- Uncertainty-Error Correlation: Higher uncertainty should correlate with errors
- Prediction Interval Coverage: Do confidence intervals contain true values?
This project is for educational and portfolio demonstration purposes.