yukimurata0421/arena-eval-engine

ARENA — ADS-B Receiver Evaluation Engine

ARENA is a statistical evaluation engine that determines whether ADS-B receiver hardware changes actually improved performance — or whether observed differences are just noise.

The core difficulty is that observed improvements are easily confounded: traffic varies by time of day and season, observation conditions are non-stationary, and metrics become unstable under sparse or bursty data. Simple before/after comparisons cannot separate real gains from these confounders. ARENA makes that separation explicit through multiple complementary statistical methods (Bayesian NB-GLM with NumPyro/NUTS, frequentist NB-GLM, Mann-Whitney U, change-point detection, OpenSky-normalized capture ratios, distance-band analysis) and draws conclusions from convergence or divergence across models — no single method is treated as authoritative.
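To make one of the listed methods concrete, here is a minimal, pure-Python sketch of the Mann-Whitney U statistic for a before/after comparison. This is illustrative only, not ARENA's implementation (which would handle p-values and larger samples via a statistics library):

```python
from itertools import chain

def mann_whitney_u(before, after):
    """Rank-based comparison of two samples; ties share the average rank.

    Returns the U statistic for `after`. Values far from
    len(before) * len(after) / 2 suggest a real shift rather than noise.
    """
    pooled = sorted(chain(before, after))
    # Assign each distinct value its average 1-based rank, handling ties.
    ranks = {}
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1] == pooled[i]:
            j += 1
        ranks[pooled[i]] = (i + j) / 2 + 1
        i = j + 1
    rank_sum_after = sum(ranks[x] for x in after)
    n_after = len(after)
    return rank_sum_after - n_after * (n_after + 1) / 2
```

Because the statistic depends only on ranks, it is robust to the bursty, heavy-tailed message counts that make mean-based comparisons unstable.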

System Overview

Raspberry Pi (edge)                WSL2 / Linux (analysis)
┌──────────────────┐               ┌────────────────────────────┐
│  readsb → PLAO   │  rsync/pull   │                            │
│       → adsb-eval│──────────────>│  pipeline (8 stages,       │
└──────────────────┘               │           wave-parallel)   │
                                   │      │                     │
                                   │      ├──> /output          │
                                   │      │    (PNGs, reports)  │
                                   │      │                     │
                                   │  artifact run              │
                                   │      │                     │
                                   │      └──> /output/payload  │
                                   │           (CSVs, bundles)  │
                                   └────────────┬───────────────┘
                                                │
                                   CSVs (~55 files) + prompt
                                                │
                                                ▼
                                   ┌────────────────────────────┐
                                   │  Multiple LLMs             │
                                   │  → structured JSON claims  │
                                   │  → raw/{ai_name}/          │
                                   │    YYYYMMDD.json           │
                                   └────────────┬───────────────┘
                                                │
                                                ▼
                                   ┌────────────────────────────┐
                                   │  synthesis                 │
                                   │  ingest → triage →         │
                                   │  proposition review        │
                                   │  (SQLite, two-layer DB)    │
                                   └────────────────────────────┘

Three subsystems:

  • Pipeline — 8-stage orchestration with wave-parallel scheduling, failure-resilient execution, and append-only audit logging. Outputs human-readable graphs and reports to /output.
  • Artifacts — Converts pipeline outputs into verifiable, LLM-ready evidence bundles. Content identity (SHA-256), schema validation, and provenance/lineage ensure that downstream analysis operates on auditable evidence, not implicit assumptions. Integrity verification carries through to synthesis ingestion.
  • Synthesis — Cross-model claim ingestion from multiple LLMs, enrichment, baseline clustering, proposition mapping, automated triage, and human review queue. Two-layer DB design (proposition + claim layers with convergence judgments). SQLite-backed, path-isolated.
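The wave-parallel scheduling mentioned above can be sketched as repeated extraction of ready stages from a dependency graph. The stage names and dependencies below are hypothetical — the real graph lives in the pipeline configuration:

```python
def waves(deps):
    """Group stages into waves: every stage in a wave depends only on stages
    in earlier waves, so stages within one wave can run in parallel."""
    remaining = dict(deps)
    done, order = set(), []
    while remaining:
        ready = [s for s, d in remaining.items() if set(d) <= done]
        if not ready:
            raise ValueError("cycle in stage dependencies")
        order.append(sorted(ready))
        done.update(ready)
        for s in ready:
            remaining.pop(s)
    return order

# Hypothetical 8-stage graph, for illustration only.
deps = {
    "ingest": [],
    "clean": ["ingest"],
    "normalize": ["ingest"],
    "glm_bayes": ["clean"],
    "glm_freq": ["clean"],
    "mannwhitney": ["clean"],
    "changepoint": ["clean"],
    "report": ["glm_bayes", "glm_freq", "mannwhitney", "changepoint", "normalize"],
}
```

With this graph, `waves(deps)` yields four waves, with the four statistical stages sharing one wave — which is where the parallelism pays off.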

Design Philosophy

ARENA treats LLMs as hypothesis generators, not truth sources — claims are validated through structured evidence and cross-model convergence. The full catalogue of 31 engineering decisions is in docs/principles/engineering-decisions.md.
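A minimal sketch of the cross-model convergence idea: group claims by proposition and accept a verdict only when multiple models agree without dissent. The field names (`proposition`, `stance`, `model`) and the quorum rule are illustrative assumptions, not the actual synthesis schema:

```python
from collections import defaultdict

def judge(claims, quorum=2):
    """Mark a proposition's verdict only when at least `quorum` distinct
    models agree and none dissents; everything else goes to human review.
    Field names here are illustrative, not ARENA's schema."""
    by_prop = defaultdict(list)
    for c in claims:
        by_prop[c["proposition"]].append(c)
    verdicts = {}
    for prop, cs in by_prop.items():
        stances = {c["stance"] for c in cs}
        models = {c["model"] for c in cs}
        if len(stances) == 1 and len(models) >= quorum:
            verdicts[prop] = stances.pop()      # convergent across models
        else:
            verdicts[prop] = "needs_review"     # divergent or under-supported
    return verdicts
```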

Quick Start

python -m venv .venv && source .venv/bin/activate
pip install -U pip && pip install -e .[dev]
python -m arena.cli validate
pytest

Key Commands

arena validate                                    # check settings/paths
arena run --only 1 --dry-run --no-gpu --skip-plao # pipeline dry run
arena artifacts verify <bundle>                   # verify artifact bundle
arena synthesis run --path sample_data/synthesis/raw --db ./tmp/s.db \
  --enriched-dir ./tmp/enriched --review-dir ./tmp/review \
  --raw-original-dir ./tmp/orig --raw-repaired-dir ./tmp/repaired \
  --repair-log-dir ./tmp/logs                     # synthesis smoke run
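Conceptually, `arena artifacts verify` recomputes content identity against a manifest. The sketch below assumes a `manifest.json` mapping relative paths to SHA-256 digests — the filename and format are assumptions, not ARENA's actual bundle layout:

```python
import hashlib
import json
from pathlib import Path

def verify_bundle(bundle_dir):
    """Recompute SHA-256 for every file listed in the bundle manifest and
    return the paths whose digests no longer match (empty list = verified).
    The manifest name/format is an assumption, not ARENA's actual layout."""
    bundle = Path(bundle_dir)
    manifest = json.loads((bundle / "manifest.json").read_text())
    mismatches = []
    for rel_path, expected in manifest["files"].items():
        digest = hashlib.sha256((bundle / rel_path).read_bytes()).hexdigest()
        if digest != expected:
            mismatches.append(rel_path)
    return mismatches
```

Hashing file content rather than trusting timestamps or sizes is what lets integrity checks carry through unchanged into synthesis ingestion.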

Docker

docker compose -f docker/docker-compose.yml run --rm arena-tests
docker compose -f docker/docker-compose.yml run --rm arena-validate
docker compose -f docker/docker-compose.yml run --rm arena-synthesis-smoke

Documentation

Detailed docs live in docs/. Start at docs/README.md.

Category    Key Documents
Operations  Architecture · Reproducibility · Synthesis
Design      Engineering Decisions · Artifact Design · AI-Assisted Analysis · AEME
Context     System Context · Statistical Assumptions

Tech Stack

Python 3.11+, NumPyro/JAX, PyMC, statsmodels, pandas, SQLite, Docker, GitHub Actions CI (lint + test matrix + coverage + smoke + Docker).

Notes

  • Secrets, credentials, and private data are excluded by design.
  • Use environment variables or CLI overrides for local paths.
  • Sample data under sample_data/ is synthetic and deterministic.

About

Statistical evaluation engine for ADS-B receiver performance. Measures real-world coverage using telemetry logs and coverage AUC instead of peak range.
