hepattn

Unified end-to-end ML reconstruction for particle physics

We present a general end-to-end ML approach for particle physics reconstruction by adapting cutting-edge object detection techniques. Our work demonstrates that a single encoder-decoder transformer can solve many different reconstruction problems that traditionally required specialised, task-specific approaches.

Our method has been successfully applied to various reconstruction tasks and detector setups:

  • Pixel cluster splitting - ATLAS [PUB]
  • Hit filtering - TrackML [arXiv], ITk [WIP]
  • Tracking - TrackML [arXiv], ATLAS [PUB]
  • Primary vertexing - Interested in working on this? Get in touch!
  • Secondary vertexing - Delphes [EPJC]
  • Particle flow - CLIC [arXiv]
  • End-to-end reconstruction - CLD [ML4Jets]
  • Muon tracking - ATLAS [ConnectingTheDots]

✨ Key Features

  • 🏗️ Modular architecture: Encoder, decoder, and task modules for flexible experimentation
  • ⚡ Efficient attention: Seamlessly switch between torch SDPA, FlashAttention, and FlexAttention
  • 🔬 Cutting-edge transformers: HybridNorm, LayerScale, value residuals, register tokens, local attention
  • 🚀 Performance optimised: Full torch.compile and nested tensor support
  • 🧪 Thoroughly tested: Comprehensive tests across multiple reconstruction tasks
  • 📦 Easy deployment: Packaged with Pixi for reproducible environments
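The attention backends listed above all implement the same scaled dot-product attention contract. As a hedged illustration (using plain torch SDPA, not hepattn's actual module interface), switching backends amounts to swapping the kernel behind a call like this:

```python
import torch
import torch.nn.functional as F

# Hypothetical shapes for illustration: (batch, heads, seq, head_dim).
q = torch.randn(2, 4, 8, 16)
k = torch.randn(2, 4, 8, 16)
v = torch.randn(2, 4, 8, 16)

# torch's SDPA dispatches to an efficient fused kernel (e.g. FlashAttention)
# when one is available for the current device and dtype.
out = F.scaled_dot_product_attention(q, k, v)
```

Because the call signature is the same across backends, model code can stay backend-agnostic.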

🛠️ Setup

First clone the repository:

git clone git@github.com:samvanstroud/hepattn.git
cd hepattn

We recommend using a container to set up and run the code. This is necessary if your system's libc version is <2.28 due to requirements of recent torch versions. We use pixi's CUDA image, which you can access with:

apptainer pull pixi.sif docker://ghcr.io/prefix-dev/pixi:0.54.1-jammy-cuda-12.8.1
apptainer shell --nv pixi.sif

📝 Note: If you are not using the pixi container, make sure pixi is installed by following the instructions at https://pixi.sh/latest/installation/.

You can then install the project with locked dependencies:

pixi install --locked

📝 Note: The default environment targets GPU machines and installs FlashAttention 2 (FA2). See pyproject.toml or setup/isambard.md for more information.

🌟 Activating the Environment

To run the installed environment, use:

pixi shell

Multiple environments are configured in pyproject.toml for different hardware setups and experiments (default, cpu, isambard, clic, tide, ci). Use -e <env> to select one.

You can leave the environment with exit. See the pixi shell docs for more information.

🧪 Running Tests

Once inside the environment, if a GPU and relevant external data are available, just run:

pytest

To test parts of the code that don't require a GPU, run:

pytest -m 'not gpu'

To test parts of the code that don't require external input data, run:

pytest -m 'not requiresdata'

The current CI only tests the parts of the code that don't require a GPU or external input data:

pytest -m 'not gpu and not requiresdata'

📝 Note: If you encounter import errors for missing modules such as numba when running tests in the default environment, switch to the appropriate experiment environment, or use the ci environment, which includes all dependencies required by the tests (e.g. pixi run -e ci pytest -m 'not gpu and not requiresdata').
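The gpu and requiresdata selectors above are standard pytest markers. As a hedged sketch of how such tests might be tagged (the actual hepattn test code may differ), a marked test looks like this:

```python
import pytest

# Hypothetical tests illustrating the markers used in the commands above.
@pytest.mark.gpu
def test_reconstruction_on_gpu():
    # A test body that would require a GPU goes here.
    ...

@pytest.mark.requiresdata
def test_with_external_data():
    # A test body that would require external input data goes here.
    ...
```

Running pytest -m 'not gpu and not requiresdata' then deselects both tests.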

🏃 Run Experiments

See experiment directories for instructions on how to run experiments.

📖 Terminology

To ensure clarity and consistency throughout this project, we use the following definitions:

  • constituent - input entities that go into the encoder/decoder, e.g. inner detector hits
  • object - reconstructed outputs from the decoder, e.g. reconstructed charged particle tracks
  • input - (also input_object) generic term for any input to a module (could be constituents, objects, etc)
  • output - generic term for any output from a module (could be objects, predictions, or intermediates)
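The constituent/object distinction above can be pictured as a minimal encoder-decoder forward pass. This is a hedged sketch using plain torch modules and made-up sizes, not hepattn's actual architecture or module names:

```python
import torch
import torch.nn as nn

d_model, num_queries = 32, 6
encoder = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
decoder = nn.TransformerDecoderLayer(d_model, nhead=4, batch_first=True)

# Constituents: embedded input entities, e.g. 100 inner detector hits.
constituents = torch.randn(1, 100, d_model)
# Object queries: one slot per candidate reconstructed object.
queries = torch.randn(1, num_queries, d_model)

memory = encoder(constituents)      # encode the constituents
objects = decoder(queries, memory)  # decode candidate objects
```

The decoder output carries one embedding per candidate object, which downstream task modules would turn into predictions.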

🤝 Contributing

If you would like to contribute, please lint and format the code with:

ruff check --fix .
ruff format .

You can also set up pre-commit hooks to automatically run these checks before committing:

pre-commit install

📄 Citing

If you use this software in your research, please cite it using the citation information available in the GitHub repository sidebar (generated from CITATION.cff). Please also cite our papers if they are relevant to your work.
