
XOREngine/xand-ecg


XAND-ECG — Lead-wise ECG Classification

Lead-wise 1D CNN · ~3M Params · 100 Hz · 4 Diseases · XAI Validated


12-lead ECG classification with quantitative explainability validation against clinical fiducial points.
Designed for research-grade experiments under distribution shift. Not intended for clinical use.

XAND-ECG trains per-disease binary classifiers on PTB-XL using a lead-wise 1D CNN, covering 4 diagnostic families with a strict pure-label policy. Attribution maps are validated against PTB-XL+ fiducial-derived clinical masks.


📊 Results & Evaluation Protocol

MI — Myocardial Infarction | STTC — ST/T Changes | CD — Conduction Disturbance | HYP — Hypertrophy

Reported metrics:

  • Val AUC — Best validation AUC during training (checkpoint selection criterion)
  • Test† — Test AUC at the epoch where validation AUC was best → the selected model

| Disease | Val AUC | Test† |
|---|---|---|
| MI — Myocardial Infarction | 0.9723 | 0.9700 |
| STTC — ST/T Changes | 0.9324 | 0.9332 |
| CD — Conduction Disturbance | 0.9292 | 0.9175 |
| HYP — Hypertrophy | 0.8710 | 0.8846 |

† = test metric at the epoch of best validation AUC (selected model)
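
The selection rule above can be sketched in a few lines. This is a minimal illustration, not the repository's training loop: the `history` structure and its keys (`epoch`, `val_auc`, `test_auc`) are hypothetical names chosen for the example, and the AUC values are made up.

```python
def select_checkpoint(history):
    """Return the epoch record with the highest validation AUC.

    `history` is a list of dicts with hypothetical keys
    'epoch', 'val_auc' and 'test_auc' (one record per epoch).
    """
    if not history:
        raise ValueError("history is empty")
    # max() keeps the first record on ties, i.e. the earliest epoch.
    return max(history, key=lambda rec: rec["val_auc"])


# Made-up numbers, only to show the mechanics.
history = [
    {"epoch": 1, "val_auc": 0.91, "test_auc": 0.90},
    {"epoch": 2, "val_auc": 0.95, "test_auc": 0.93},
    {"epoch": 3, "val_auc": 0.94, "test_auc": 0.96},
]
best = select_checkpoint(history)
print(best["epoch"], best["test_auc"])
```

Epoch 3 has the higher test AUC in this toy history, but it is never consulted: only validation AUC drives the choice, and the test value is read off afterwards.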

  • Pure labels only: conf = 100 → positive, label absent → negative, conf = 0–99 → excluded
  • All splits are patient-level (no patient appears in more than one split)
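
The two rules above can be written down as small checks. This is a sketch under assumptions: the function names, the `confidences` dictionary layout and the split dictionary are hypothetical, and the repository's own data loading may organize this differently.

```python
def pure_label(confidences, disease):
    """Apply the pure-label policy for one ECG and one disease.

    `confidences` maps a diagnostic label to its confidence (0-100).
    Returns 1 (positive), 0 (negative) or None (excluded).
    """
    if disease not in confidences:
        return 0          # label absent -> negative
    if confidences[disease] == 100:
        return 1          # conf = 100 -> positive
    return None           # conf = 0-99 -> excluded from training and evaluation


def patient_overlap(splits):
    """Return the patient IDs that appear in more than one split."""
    first_seen = {}
    leaked = set()
    for split_name, patient_ids in splits.items():
        for pid in set(patient_ids):
            if pid in first_seen and first_seen[pid] != split_name:
                leaked.add(pid)
            first_seen.setdefault(pid, split_name)
    return leaked


print(pure_label({"MI": 100}, "MI"), pure_label({"STTC": 100}, "MI"), pure_label({"MI": 50}, "MI"))
print(patient_overlap({"train": [1, 2], "val": [3], "test": [2, 4]}))
```

An empty set from `patient_overlap` is what a patient-level split should produce.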

External Validation

The selected PTB-XL checkpoints were evaluated zero-shot on two external ECG datasets with no fine-tuning:

  • Chapman-Shaoxing — 45K ECGs, Shaoxing + Ningbo hospitals, China
  • Georgia — 10K ECGs, Emory University, Atlanta, USA

| Disease | PTB-XL Test | Chapman | Georgia |
|---|---|---|---|
| MI | 0.9700 | 0.9527 | —* |
| STTC | 0.9332 | 0.8669 | 0.8336 |
| CD | 0.9175 | 0.8533 | 0.8609 |
| HYP | 0.8846 | 0.7660 | 0.6992 |

* Georgia MI excluded: insufficient positive samples (n=7). SNOMED CT mapping was conservative and defined a priori.
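
For readers who want to reproduce this kind of table, the sketch below computes ROC AUC from scratch and skips a dataset when positives are too scarce. It is an illustration only: `min_positives=10` is a hypothetical threshold (the source states only that n=7 was insufficient), and the pairwise AUC is quadratic in sample count, so a library routine would be preferable on full datasets.

```python
def roc_auc(labels, scores):
    """ROC AUC as the probability that a random positive outranks
    a random negative, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def external_auc(labels, scores, min_positives=10):
    """Return the AUC, or None when there are too few positives to report."""
    n_pos = sum(1 for y in labels if y == 1)
    if n_pos < min_positives:
        return None  # reported as excluded rather than as a noisy estimate
    return roc_auc(labels, scores)


print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```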

Morphology-based conditions (MI, CD) show the strongest transfer; voltage-dependent HYP shows the largest drop — consistent with known limitations of voltage-based criteria under device and population shift. See FINDINGS.md for full degradation analysis.


🔍 Explainability Maps

Visualizations follow the standard clinical 12-lead layout (3×4 grid + rhythm strip), with attribution and clinical reference integrated directly into each lead:

| Element | What it shows |
|---|---|
| CLIN strip | Blue band marking the fiducial-derived clinical region from PTB-XL+. Approximate academic ground truth for visual comparison |
| ATTR strip | Attribution heatmap below the trace, on the same color scale. Shows where the model concentrated attention along the full 10-second window |
| ECG trace | Raw signal in mV. Trace color reflects attribution intensity: white → yellow → orange → red |

Each lead is displayed with its own vertical scale — standard practice in digital ECG viewers when amplitudes differ substantially across leads.

Same ECG, Different Questions

The same ECG produces different attribution maps depending on which pathology head is queried. Each map answers a different clinical question.

⚠️ Interpreting attribution maps

  • Each heatmap answers one question: where does the model look to decide about this specific condition?
  • The map is not a comprehensive delineation of all abnormal regions.
  • A binary classifier may attend to the minimum sufficient evidence for the decision, not the full pathological extent.

[Figure: attribution map, Myocardial Infarction head]

[Figure: attribution map, Conduction Disturbance head]

[Figure: attribution map, Hypertrophy head]

[Figure: attribution map, ST/T Changes head]

Note: The visual examples above are research visualizations derived from publicly available ECG records. The original datasets are not redistributed and remain subject to their respective licenses and citation requirements.


📐 Quantitative XAI Validation

Attribution maps were evaluated against PTB-XL+ fiducial-derived clinical masks on truly positive ECGs using Integrated Gradients (primary method).

| Disease | n | Pointing Game | CAS |
|---|---|---|---|
| MI | 245 | 0.9796 | 0.7519 |
| STTC | 367 | —** | 0.4034 |
| CD | 339 | 0.9676 | 0.5396 |
| HYP | 115 | 0.9652 | 0.7161 |

** STTC Pointing Game (0.11) is not informative for this condition: the model anchors attribution at the R-peak of V6 (~65ms before ST/T onset), reading the R→ST transition within its receptive field — a clinically coherent mechanism consistent with the LVH strain pattern. This interpretation was considered clinically plausible during clinical review.
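
As a reading aid, here is one common form of the Pointing Game: an ECG counts as a hit when the single strongest attribution sample lies inside the clinical mask. This is a sketch under assumptions, not the repository's scoring code: it takes the maximum of the absolute attribution over all leads and applies no tolerance window, either of which may differ from the definition documented in FINDINGS.md.

```python
def pointing_game_hit(attribution, mask):
    """True if the strongest attribution sample falls inside the mask.

    `attribution` is a list of leads, each a list of floats;
    `mask` has the same shape and holds booleans.
    """
    best = None  # (magnitude, lead index, time index)
    for lead_idx, lead in enumerate(attribution):
        for t, value in enumerate(lead):
            if best is None or abs(value) > best[0]:
                best = (abs(value), lead_idx, t)
    if best is None:
        raise ValueError("attribution map is empty")
    _, lead_idx, t = best
    return bool(mask[lead_idx][t])


def pointing_game(samples):
    """Fraction of (attribution, mask) pairs that score a hit."""
    hits = [pointing_game_hit(attr, mask) for attr, mask in samples]
    return sum(hits) / len(hits)


attr = [[0.1, -0.9, 0.2], [0.0, 0.3, 0.1]]
mask_hit = [[False, True, False], [False, False, False]]
mask_miss = [[True, False, True], [True, True, True]]
print(pointing_game([(attr, mask_hit), (attr, mask_miss)]))
```

Under this definition, a model that anchors just outside the mask scores a miss no matter how close it is, which is the behaviour described for STTC in the footnote above.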

IG and GradSHAP produce near-identical attribution maps (cosine similarity ~0.90–0.93). Cross-method analysis, surgical perturbation, and full validation details are in FINDINGS.md; the clinical interpretation is in the Clinical Review.
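
The cosine similarity quoted above can be computed as follows. The sketch assumes each attribution map is flattened across all leads before comparison; whether the repository compares whole maps or averages per-lead similarities is not stated here.

```python
import math


def cosine_similarity(map_a, map_b):
    """Cosine similarity between two attribution maps (lists of leads)."""
    a = [v for lead in map_a for v in lead]
    b = [v for lead in map_b for v in lead]
    if len(a) != len(b):
        raise ValueError("attribution maps must have the same shape")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention for an all-zero map
    return dot / (norm_a * norm_b)


ig_map = [[0.0, 1.0, 0.5], [0.2, 0.0, 0.0]]
shap_map = [[0.1, 0.9, 0.6], [0.1, 0.0, 0.0]]
print(round(cosine_similarity(ig_map, shap_map), 3))
```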


🧠 Methodology & Model Design

  • Per-disease binary classifiers — independent model per pathology, no multilabel compromises. Each optimized for its own class balance and convergence dynamics
  • Lead-wise encoder with shared weights — each of the 12 leads processed independently through a shared 1D CNN encoder (4 ResBlocks, kernel=7), then concatenated (12 × 256 = 3072-dim). Per-lead attribution comes for free — no post-hoc decomposition needed. Explicit cross-lead mechanisms (attention, multiscale kernels) were tested and rejected: all degraded HYP and CD while adding parameters. The FC layers after concatenation capture inter-lead relationships implicitly
  • 100 Hz downsampling — PTB-XL provides 500 Hz, but 100 Hz retains sufficient diagnostic information for the target conditions. In this architecture, higher resolution increased parameters ~4× without improving any pathology — the bottleneck is sample count (12K ECGs), not temporal resolution
  • Per-sample global z-score normalization — a single μ/σ computed over all 12×1000 samples per ECG. This was the key change that restored global voltage structure relevant for HYP diagnosis and drove HYP AUC from ~0.78 to ~0.88. A subsequent per-lead normalization is kept as a deliberate trade-off: it improves CD (+0.018 AUC, morphology-based) at a minor cost to HYP (−0.004 AUC).
  • Pure label training — only explicit positives (conf = 100) and negatives (label absent); uncertain samples (conf = 0–99) excluded entirely. Uncertain labels would weaken XAI validation: attribution maps must point to real pathology, not annotator disagreement. Focal Loss (γ=2.0) handles class imbalance across 8–24% positive rates
  • Multitask regularization — auxiliary heads for Device (0.03) and Sex (0.01). Empirically, Device acted as a regularizer: it encouraged device-invariant representations, stabilizing HYP training (worst validation oscillation drops from −1003bp to −272bp). Sex provides orthogonal physiological signal (QRS duration, voltage amplitudes differ by sex), benefiting MI and STTC
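
To make the normalization point concrete, the sketch below contrasts a single global z-score per ECG with a z-score computed per lead. It is a toy illustration in plain Python with two short leads instead of 12 × 1000 samples; the function names and `eps` are assumptions, and it does not show how the repository combines the two steps.

```python
import statistics


def global_zscore(ecg, eps=1e-8):
    """Normalize with one mean and one std computed over every sample of every lead."""
    flat = [v for lead in ecg for v in lead]
    mu = statistics.fmean(flat)
    sigma = statistics.pstdev(flat)
    return [[(v - mu) / (sigma + eps) for v in lead] for lead in ecg]


def per_lead_zscore(ecg, eps=1e-8):
    """Normalize each lead with its own mean and std."""
    out = []
    for lead in ecg:
        mu = statistics.fmean(lead)
        sigma = statistics.pstdev(lead)
        out.append([(v - mu) / (sigma + eps) for v in lead])
    return out


# Toy ECG: the second lead has three times the amplitude of the first.
ecg = [[1.0, -1.0, 1.0, -1.0], [3.0, -3.0, 3.0, -3.0]]

g = global_zscore(ecg)
p = per_lead_zscore(ecg)
ratio_global = statistics.pstdev(g[1]) / statistics.pstdev(g[0])
ratio_per_lead = statistics.pstdev(p[1]) / statistics.pstdev(p[0])
print(round(ratio_global, 3), round(ratio_per_lead, 3))
```

The global version keeps the amplitude ratio between leads, which is the voltage structure the HYP bullet refers to, while the per-lead version makes every lead unit-variance and discards it.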

Architecture ablations, normalization experiments, and auxiliary task analysis in FINDINGS.md.
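
For reference, the Focal Loss mentioned in the pure-label bullet has a simple per-sample form for binary targets. The sketch below uses γ = 2.0 as stated in this README; it omits the optional class-weighting term α, and whether the repository uses α is not stated here.

```python
import math


def binary_focal_loss(prob, target, gamma=2.0, eps=1e-7):
    """Focal loss for one sample: -(1 - p_t)^gamma * log(p_t).

    `prob` is the predicted probability of the positive class,
    `target` is 0 or 1. With gamma = 0 this reduces to cross-entropy.
    """
    p = min(max(prob, eps), 1.0 - eps)      # clamp to avoid log(0)
    p_t = p if target == 1 else 1.0 - p     # probability given to the true class
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


easy = binary_focal_loss(0.9, 1)   # confident and correct
hard = binary_focal_loss(0.1, 1)   # confident and wrong
print(easy < hard)
```

The modulating factor `(1 - p_t) ** gamma` shrinks the loss of well-classified samples, so the abundant easy negatives contribute less to the gradient than the rarer hard cases.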


⚙️ Training, Hardware

Core Training Configuration

| Component | Choice |
|---|---|
| Architecture | Lead-wise 1D CNN (shared encoder) |
| Parameters | ~3.0M |
| Input | 12 leads × 1000 samples (100 Hz × 10s) |
| Loss | Focal Loss (γ = 2.0) |
| Optimizer | AdamW + Cosine decay |
| Aux tasks | Device (0.03), Sex (0.01) |
| Label policy | Pure labels only (conf=100 / absent) |
| Dataset | PTB-XL + PTB-XL+ |
| Tracking | MLflow |
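
The cosine decay listed in the table follows a standard curve, sketched below. The base learning rate, step count and absence of warmup are hypothetical; this README does not state the actual values, so treat the numbers as placeholders.

```python
import math


def cosine_decay_lr(step, total_steps, base_lr, min_lr=0.0):
    """Learning rate at `step` under a cosine decay from base_lr to min_lr."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    progress = min(max(step, 0), total_steps) / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Hypothetical schedule: 100 steps starting from 1e-3.
for step in (0, 50, 100):
    print(step, cosine_decay_lr(step, total_steps=100, base_lr=1e-3))
```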

Hardware Used

| GPU | Role |
|---|---|
| GTX 1650 4GB | Primary training & reported results (seed=42) |
| RTX 4060 Ti 16GB | Development & experimentation |

Peak VRAM usage ~1.9 GB per disease.


🚀 Quick Start

git clone https://github.com/XOREngine/xand-ecg.git
cd xand-ecg
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

See QUICKSTART.md for data setup, training, XAI evaluation, and visualization commands.


📂 Documentation

| Document | Description |
|---|---|
| QUICKSTART.md | Data setup, training, evaluation, and visualization commands |
| docs/FINDINGS.md | Full experimental record: architecture ablations, normalization, XAI validation, lead robustness, surgical perturbation, external generalization |
| docs/XAND-ECG_v0.1___ClinicalReview.pdf | Clinical review document: per-pathology assessment of learned mechanisms by cardiology |


👥 Contributors

Development:

  • José Artusa (@WallyByte) — Project design and implementation.

Clinical Review:

  • Belén Biscotti (LinkedIn) — Conducted the Clinical Review and reviewed the clinical coherence of the attribution maps and findings.

📬 Contact

For questions, please reach out at info@xorengine.com



XAND-ECG — Lead-wise ECG Classification.
💪 If you extend it, audit it, or break it and improve it — go for it.

© 2026 XOREngine · Open Source Commitment
