- HyperGraph Magic: Unraveling Jets with HMPNNs
- Hypergraph Message Passing Neural Networks
- Unveiling the Subnuclear World: Exploring Particle Physics Beyond the Standard Model
- 3.1 Key Points
- Research Objectives and Research Questions
- "Houston, we have jets!"
- Quantum Chronicles of QCD
- Stars in the Jet Constellation
- Pioneering Particle Performances
- 8.1 Strengths
- 8.2 Limitations
- 8.3 Applications
- Beam Me Up, Scotty!
- Beyond the Stars: Cosmic Fellowship
- Acknowledgments
- Cosmic Code
- Warp-Speed License
The field of high-energy particle collision analysis has witnessed a surge in the utilization of machine learning techniques, particularly neural networks. These algorithms offer the potential to glean valuable insights from the intricate data resulting from these collisions. However, traditional neural network architectures like CNNs and RNNs fall short when dealing with the intricate and structured nature of such data.
In recent times, the spotlight has turned to Graph Neural Networks (GNNs) for analyzing graph-structured data. Their triumphs span across computer vision, natural language processing, and other domains. Yet, the customary GNN framework operates under the assumption of binary, unordered edges, a limitation when applied to jet analysis.
Enter Hypergraphs: a dynamic, expressive alternative to represent jet data. Unlike graphs, hypergraphs embrace hyperedges capable of linking multiple nodes and sporting multiple labels. Hypergraph Neural Networks (HGNNs) have emerged as a solution to extend GNNs to hypergraphs. However, their potential in jet analysis remains relatively untapped.
In this context, Hypergraph Message Passing Neural Networks (HMPNNs) come into play. A novel spin on HGNNs, HMPNNs leverage message passing algorithms to enhance node and hyperedge features within a hypergraph. This approach, successful in domains like image segmentation and social network analysis, is ripe for exploration in jet analysis.
Through this innovative approach, HMPNNs hold the promise to unlock hidden insights within jet analysis data, ultimately contributing to a deeper understanding of the complex world of particle collisions.
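To make the message passing idea concrete, here is a minimal sketch of one round of node-to-hyperedge-to-node aggregation. It is an illustration only, not the project's model: it uses plain mean aggregation with no learned weights, and the names `hypergraph_message_pass`, `nodes`, and `edges` are hypothetical.

```python
from typing import Dict, List

Vector = List[float]


def mean(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def hypergraph_message_pass(
    node_feats: List[Vector], hyperedges: List[List[int]]
) -> List[Vector]:
    """One round of node -> hyperedge -> node mean aggregation.

    hyperedges[e] lists the node indices joined by hyperedge e.
    Nodes that belong to no hyperedge keep their input features.
    """
    # Stage 1: each hyperedge summarises the nodes it contains.
    edge_feats = [mean([node_feats[i] for i in members]) for members in hyperedges]

    # Stage 2: each node averages the summaries of its incident hyperedges.
    incident: Dict[int, List[Vector]] = {}
    for e, members in enumerate(hyperedges):
        for i in members:
            incident.setdefault(i, []).append(edge_feats[e])

    return [
        mean(incident[i]) if i in incident else list(node_feats[i])
        for i in range(len(node_feats))
    ]


# Toy jet: four particles with 2-d features and two overlapping hyperedges.
nodes = [[1.0, 0.0], [3.0, 0.0], [5.0, 2.0], [7.0, 4.0]]
edges = [[0, 1, 2], [2, 3]]
updated = hypergraph_message_pass(nodes, edges)
```

Node 2 sits in both hyperedges, so its updated features blend information from all four particles, which an ordinary pairwise edge could not do in a single step. A trainable layer would typically apply learned transformations before and after each aggregation stage.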
This project delves into Beyond Standard Model (BSM) Particle Physics. We analyze particle jets generated in high-energy LHC collisions. Jets, collimated particle sprays, emerge from quarks and gluons combining to form hadrons, shedding light on fundamental particle properties and interaction forces.
- LHC investigates matter's subnuclear structure through proton collisions.
- While the Standard Model explains fundamental forces, it's incomplete.
- BSM Particle Physics explores jet data using novel techniques.
- Particle jets provide insights into fundamental particle properties.
- HMPNNs promise to unveil new insights by deciphering jet data relationships.
- Develop a hypergraph message passing neural network architecture for analyzing jets in high-energy particle collisions.
- Assess the performance of the proposed hypergraph model.
- Interpret the results, highlighting their significance in jet analysis, and explore the model's strengths and limitations.
- Suggest future enhancements and research paths for hypergraph message passing neural networks in jet analysis and related domains.
- How can hypergraph message passing neural networks effectively analyze jets in high-energy particle collisions?
- How do the accuracy and computational efficiency of the proposed hypergraph model compare to other leading approaches?
- What insights can be derived from the model's outcomes, and how do they impact the realm of high-energy physics jet analysis?
- What are the discernible strengths and limitations of the proposed model, and what avenues for improvement are worth investigating in future studies?
Feel free to refer to the complete research proposal for comprehensive details.
Welcome to the journey of HyperGraph Message Passing Neural Networks (HMPNNs) exploring the mesmerizing realm of high-energy particle collisions! Strap in, because we're about to decode the symphony of particles through a fusion of physics and machine learning.
Quantum Chromodynamics (QCD), the tale of strong quark-gluon interactions, sets the stage for our journey:
"Particle Aria" - Jets emerge as sparkling sprays of particles in high-energy collisions, their songs resonating through the cosmos.
"QCD's Cosmic Dance" - QCD jets, birthed from quarks and gluons, speak a different cosmic language compared to non-QCD jets. Their energy, multiplicity, and dance steps set them apart.
"Quantum Properties" - Jet classification unveils their essence: the jet mass, substructure, energy, and a plethora of features shape their cosmic choreography.
"Our Celestial Ensemble" -
The dataset twinkles with Monte Carlo (MC) simulated events, unraveling top quark tagging mysteries. 1.2M training events, 400k validation events, and 400k test events make up our cosmic ensemble.
The data has been produced using Monte Carlo simulations. The first 21 features (columns 2-22) are kinematic properties measured by the particle detectors in the accelerator. The last seven features are functions of the first 21 features; these are high-level features derived by physicists to help discriminate between the two classes. There is an interest in using deep learning methods to obviate the need for physicists to manually develop such features. Benchmark results using Bayesian Decision Trees from a standard physics package and 5-layer neural networks are presented in the original paper. The last 500,000 examples are used as a test set.
This dataset contains two sets of jet data generated using Pythia 8, representing quark and gluon jets. There are two versions of the dataset: one that includes all kinematically realizable quark jets and another that excludes charm and bottom quark jets at the level of the hard process. The generation parameters for these datasets are as follows:
- Pythia Version: 8.226 (without bc jets), 8.235 (with bc jets)
- Center-of-Mass Energy: √s = 14 TeV
- Quark Source: WeakBosonAndParton:qg2gmZq
- Gluon Source: WeakBosonAndParton:qqbar2gmZg (with Z boson decaying to neutrinos)
- Jet Algorithm: FastJet 3.3.0, anti-kt algorithm with R=0.4
- Transverse Momentum Range: pT^jet ∈ [500, 550] GeV
- Rapidity Range: |y^jet| < 1.7
Each dataset consists of 20 files, stored in compressed NumPy format. Files that include charm and bottom jets have 'withbc' in their filename. Each file contains two arrays:
- X (Features): A 3-dimensional array of shape (100000, M, 4), where M is the maximum multiplicity of jets in the file. The array represents a mix of 50,000 quark jets and 50,000 gluon jets, randomly sorted. Each particle in a jet is described by four features: transverse momentum (pt), rapidity, azimuthal angle, and pdgid (particle ID).
- y (Labels): An array of shape (100000,), providing labels for the jets. A label of 0 corresponds to gluon jets, and a label of 1 corresponds to quark jets.
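Because every jet in a file is stored with M rows, jets with fewer particles carry padding rows. The sketch below shows one way to strip that padding before building a hypergraph. It assumes padded rows are all zeros (so they can be detected by pt == 0), uses nested lists in place of the NumPy array, and the toy values and the helper names `real_particles` and `jet_summary` are hypothetical.

```python
from typing import List, Sequence

Particle = Sequence[float]  # (pt, rapidity, phi, pdgid)


def real_particles(jet: Sequence[Particle]) -> List[Particle]:
    """Drop padding rows, identified here by pt == 0 (an assumption)."""
    return [p for p in jet if p[0] > 0.0]


def jet_summary(jet: Sequence[Particle]) -> dict:
    """Multiplicity and scalar pt sum of one padded jet."""
    particles = real_particles(jet)
    return {
        "multiplicity": len(particles),
        "scalar_pt_sum": sum(p[0] for p in particles),
    }


# Two toy jets padded to M = 3 rows; the values are illustrative, not real data.
X = [
    [[250.0, 0.10, 1.20, 211.0], [200.0, 0.12, 1.25, 22.0], [0.0, 0.0, 0.0, 0.0]],
    [[300.0, -0.40, 2.00, -211.0], [150.0, -0.38, 2.05, 130.0], [60.0, -0.45, 1.95, 22.0]],
]
y = [1, 0]  # 1 = quark jet, 0 = gluon jet, matching the label convention above

summaries = [jet_summary(jet) for jet in X]
```

With the real files the same masking would normally be done on the NumPy array directly, for example by comparing the pt column to zero.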
If you use this dataset, kindly cite the following sources:
- Zenodo Record: Link
- Corresponding Paper: P. T. Komiske, E. M. Metodiev, J. Thaler, "Energy Flow Networks: Deep Sets for Particle Jets," JHEP 01 (2019) 121, arXiv:1810.05165.
For the corresponding Herwig jet dataset, you can find it on this Zenodo Record.
To work with these datasets in Python, you can use the EnergyFlow Python package for automatic download and reading.
This dataset serves as a reference for the evaluation of top quark tagging architectures and includes MC simulated training/testing events. The dataset has been prepared by Kasieczka, Gregor; Plehn, Tilman; Thompson, Jennifer; Russel, Michael.
- Total Training Events: 1.2 million
- Total Validation Events: 400,000
- Total Test Events: 400,000
Use the following labels to distinguish different purposes:
- train: Training events
- val: Validation events during training
- test: Final testing and reporting results
- Energy: 14 TeV
- Signal: Hadronic tops
- Background: QCD dijets
- Detector Simulation: Delphes ATLAS detector card with Pythia 8
- No MPI/pile-up included
- Jet Clustering: Particle-flow entries (produced by Delphes E-flow) clustered into anti-kT 0.8 jets
- Jet Transverse Momentum Range: [550, 650] GeV
- Jet Eta Range: |eta| < 2
- Jet Matching: All top jets matched to a parton-level top within ΔR = 0.8 and to all top decay partons within 0.8
- Jet Constituents: Leading 200 jet constituent four-momenta stored with zero-padding for jets with fewer than 200 constituents
- Constituent Sorting: Constituents sorted by pT, highest pT first
- Truth Top Four-Momentum: Stored as truth_px, truth_py, truth_pz, truth_e
- Jet Classification: A flag `is_signal_new` provided for each jet (1 for top, 0 for QCD)
- Dataset Classification: Variable `ttv` (= test/train/validation) indicates the dataset a jet belongs to
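Since the constituents are stored as four-momenta, a common first step is to convert them to (pT, η, φ) and to compute the jet mass from their sum. The following is a rough sketch of those standard kinematic formulas. The (E, px, py, pz) component order, the toy values, and the function names are assumptions made for this example, not the dataset's column layout.

```python
import math
from typing import List, Tuple

FourMomentum = Tuple[float, float, float, float]  # (E, px, py, pz), assumed order


def to_pt_eta_phi(p: FourMomentum) -> Tuple[float, float, float]:
    """Convert one real (non-padding) constituent to (pT, eta, phi)."""
    _, px, py, pz = p
    pt = math.hypot(px, py)
    eta = math.asinh(pz / pt)  # equivalent to -ln(tan(theta / 2))
    phi = math.atan2(py, px)
    return pt, eta, phi


def jet_mass(constituents: List[FourMomentum]) -> float:
    """Invariant mass of the summed four-momenta, skipping zero padding."""
    real = [p for p in constituents if p[0] > 0.0]
    e, px, py, pz = (sum(component) for component in zip(*real))
    m2 = e * e - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))  # clamp tiny negative values from rounding


# Two massless toy constituents plus one zero-padded row.
jet = [(5.0, 3.0, 4.0, 0.0), (5.0, 3.0, -4.0, 0.0), (0.0, 0.0, 0.0, 0.0)]
pt, eta, phi = to_pt_eta_phi(jet[0])
mass = jet_mass(jet)
```

In this toy jet each constituent is massless, yet the opening angle between them gives the jet a mass of 8 in the same units, which is the kind of substructure information top tagging relies on.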
If you use this dataset for your research, please cite the creators:
- Kasieczka, Gregor; Plehn, Tilman; Thompson, Jennifer; Russel, Michael
**"Particle Puzzle Pieces"** - The dataset embodies hadronic tops for the signal, a QCD dijet background, the Delphes ATLAS detector card with Pythia 8, and the Pythia 8-generated quark and gluon jet datasets. Each piece holds a cosmic puzzle.
"Hypergraph Voyage" - We steer the cosmic ship of Hypergraph Message Passing, crafting graphs in the (η, φ)-plane and passing cosmic messages to weave cosmic insights.
"Magic Four-Vectors" - Directions sculpt our message weights, as four-vectors dance with information exchange, painting the cosmic symphony of jet features.
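One plausible way to build hyperedges in the (η, φ)-plane is to group every particle with all neighbours that lie within a radius R0 of it. The sketch below illustrates that idea only; the README does not spell out the construction rule, so the grouping criterion, the treatment of duplicates and singletons, and the names `delta_r` and `build_hyperedges` are all assumptions.

```python
import math
from typing import List, Tuple

Direction = Tuple[float, float]  # (eta, phi)


def delta_phi(phi1: float, phi2: float) -> float:
    """Signed azimuthal difference wrapped into (-pi, pi]."""
    d = (phi1 - phi2) % (2.0 * math.pi)  # now in [0, 2*pi)
    return d - 2.0 * math.pi if d > math.pi else d


def delta_r(a: Direction, b: Direction) -> float:
    """Angular distance sqrt(d_eta^2 + d_phi^2) between two particles."""
    return math.hypot(a[0] - b[0], delta_phi(a[1], b[1]))


def build_hyperedges(particles: List[Direction], r0: float) -> List[List[int]]:
    """One hyperedge per particle: every particle within r0 of it.

    Duplicate hyperedges and single-particle hyperedges are dropped.
    """
    seen = set()
    edges = []
    for centre in particles:
        members = tuple(
            i for i, p in enumerate(particles) if delta_r(centre, p) <= r0
        )
        if len(members) > 1 and members not in seen:
            seen.add(members)
            edges.append(list(members))
    return edges


# Particles 2 and 3 are close only once phi is wrapped around +-pi.
particles = [(0.0, 0.0), (0.05, 0.05), (0.0, 3.1), (0.0, -3.1), (1.0, 1.0)]
hyperedges = build_hyperedges(particles, r0=0.2)
```

The wrap-around in `delta_phi` matters: without it, particles 2 and 3 would appear to be 6.2 apart in φ instead of roughly 0.08.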
Quark-Gluon Dataset Features
The Pythia8 Quark and Gluon Jets dataset contains the following features:
| Feature Name | Data Type | Description |
|---|---|---|
| pt | Float | Transverse momentum |
| eta | Float | Pseudorapidity |
| phi | Float | Azimuthal angle |
| mass | Float | Invariant mass |
| b-tag | Bool | b-tagging information |
| particle ID | Int | ID of the particle |
| charge | Int | Charge of the particle |
| isQuark | Bool | True if quark, False if gluon |
| label | Int | 0 for gluon, 1 for quark |
Top Quark Tagging Dataset Features
The Top Quark Tagging Dataset contains the following features:
| Feature | Data Type | Description |
|---|---|---|
| Event ID | Categorical | Unique identifier for the event |
| Jet ID | Categorical | Unique identifier for the jet |
| number of tracks | Numeric | Number of charged particle tracks in the jet |
| number of SVs | Numeric | Number of secondary vertexes associated with the jet |
| jet energy/mass/width/sd_mass | Numeric | Various properties of the jet |
| track 1-3 d0/d0Err/z0/z0Err | Numeric | Impact parameters and associated errors of tracks |
| track 1-3 pt/eta/phi/e/charge | Numeric | Kinematic and charge properties of tracks |
| SV 1-3 flight distance/flight distance error/mass/energy ratio | Numeric | Properties of secondary vertexes |
| is_signal_new | Binary | Binary indicator of whether the jet is a top quark or not |
"Neural Cosmic Oracle" - A cosmic climax ensues as our cosmic representation reaches the cosmic Neural Network Oracle. The oracle's verdict unveils the cosmic binary classification score, decoding QCD and non-QCD jets' cosmic essence.
In the grand theater of high-energy collisions, the particle jets dance with complexity and mystery. These performances are captivating, yes, but often a conundrum. Fear not, for our cast of Neural Networks is here to decipher the enigmatic jets with precision!
"Classifier Extraordinaire!" - Our algorithm shines in distinguishing jet types, unraveling secrets essential for a myriad of physics analyses.
"Magic of IRC Safety!" - With a sprinkle of physics, our HMPNNs honor IRC safety, ensuring predictions stay steadfast even in the face of soft emissions.
"Navigating the Jet Stream!" - Taming the QCD radiation dragon, our neural wizards stay cool while venturing into the vast jet landscape.
"Scalable Sorcery!" - Be it tiny jets or colossal ones, our HMPNNs flaunt the magic of scalability, adapting to various jet sizes and types.
Stellar Performance - Our model shines bright with stellar accuracy and AUC in distinguishing top quarks from QCD jets. The stars align for precision!
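The IRC-safety claim above can be checked numerically on a toy observable. The sketch below uses a momentum-fraction-weighted sum over particle directions, in the spirit of the Energy Flow Networks paper cited earlier, and verifies that a collinear split and a very soft emission leave it essentially unchanged. Whether the HMPNN layers use exactly this weighting is an assumption; `angular_embedding` is a hypothetical stand-in for a learned per-particle map.

```python
from typing import List, Tuple

Particle = Tuple[float, float, float]  # (pt, eta, phi)


def angular_embedding(eta: float, phi: float) -> List[float]:
    """Hypothetical per-particle map that depends on direction only."""
    return [eta, phi, eta * eta + phi * phi]


def irc_safe_observable(particles: List[Particle]) -> List[float]:
    """Sum over particles of z_i * Phi(eta_i, phi_i), with z_i = pt_i / sum(pt)."""
    total_pt = sum(p[0] for p in particles)
    out = [0.0, 0.0, 0.0]
    for pt, eta, phi in particles:
        z = pt / total_pt
        for k, value in enumerate(angular_embedding(eta, phi)):
            out[k] += z * value
    return out


jet = [(100.0, 0.1, 0.2), (50.0, -0.3, 0.4)]

# Collinear split: the first particle becomes two along the same direction.
collinear = [(60.0, 0.1, 0.2), (40.0, 0.1, 0.2), (50.0, -0.3, 0.4)]

# Soft emission: an extra particle with vanishing pt in a new direction.
soft = jet + [(1e-9, 1.0, -1.0)]

base = irc_safe_observable(jet)
after_split = irc_safe_observable(collinear)
after_soft = irc_safe_observable(soft)
```

The linear weighting in pT is what makes this work: the two collinear fragments contribute the same total as their parent, and a soft particle enters with a weight that goes to zero.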
Algorithm

    Input: QCD and non-QCD jet data

    # Preprocess the data
    Split data into training and testing sets

    # Define the Hypergraph Message Passing Permutation Invariant Neural Network architecture
    Define Function: ConstructHypergraph(batch_data)
        # Constructs a hypergraph for the given batch of data
        ...
    Define Function: MessagePassing(hypergraph)
        # Performs hypergraph message passing and returns the updated features
        ...
    Define Function: PermutationInvariant(features)
        # Computes the permutation invariant representation
        ...

    # Define the Neural Network architecture
    Define Function: ClassificationNN(input_dim, hidden_dim, output_dim)
        # Defines the classification neural network architecture
        ...

    # Training
    For each epoch in range(num_epochs):
        For each batch_data in training_data:
            hypergraph = ConstructHypergraph(batch_data)
            features = MessagePassing(hypergraph)
            representation = PermutationInvariant(features)
            classification_output = ClassificationNN(representation)
            loss = CalculateLoss(classification_output, labels)
            UpdateParameters(loss)

    # Classification
    For each batch_data in testing_data:
        hypergraph = ConstructHypergraph(batch_data)
        features = MessagePassing(hypergraph)
        representation = PermutationInvariant(features)
        classification_output = ClassificationNN(representation)
        predicted_labels = ApplySoftmax(classification_output)
        final_labels = Classify(predicted_labels)

    Output: Predicted class labels for testing_data

Simple Model Code:

    import torch
    import torch.nn as nn
    import torch.optim as optim
    from torch_geometric.data import Data
    from torch_geometric.loader import DataLoader  # DataLoader lives here in PyG 2.x
    from torch_geometric.nn import MessagePassing


    # Define the Hypergraph Message Passing Permutation Invariant Neural Network.
    # HypergraphConstructionLayer, MessagePassingLayer and PermutationInvariantLayer
    # are not defined in this snippet and must be provided by the project code.
    class HypergraphMessagePassingPINN(nn.Module):
        def __init__(self, input_dim, hidden_dim, output_dim):
            super().__init__()
            # Hypergraph construction, message passing, and permutation invariant layers
            self.hypergraph_layer = HypergraphConstructionLayer(input_dim, hidden_dim)
            self.message_passing_layer = MessagePassingLayer(hidden_dim)
            self.permutation_invariant_layer = PermutationInvariantLayer(hidden_dim, output_dim)

        def forward(self, data):
            # Construct hypergraph
            hypergraph = self.hypergraph_layer(data)
            # Perform hypergraph message passing
            x = self.message_passing_layer(hypergraph)
            # Compute permutation invariant representation
            representation = self.permutation_invariant_layer(x)
            return representation

- AUC Values for Gluons vs Quark Tagging Dataset
| Sr. No. | R0 | AUC |
|---|---|---|
| 1 | 0.1 | 0.8824 ± 0.0005 |
| 2 | 0.1 | 0.8888 ± 0.0013 |
| 3 | 0.2 | 0.8909 ± 0.0009 |
| 4 | 0.3 | 0.8916 ± 0.0008 |
| 5 | 0.4 | 0.8919 ± 0.0006 |
- AUC Values for Top Tagging Dataset
| Sr. No. | R0 | AUC |
|---|---|---|
| 1 | 0.1 | 0.9734 ± 0.0009 |
| 2 | 0.2 | 0.9764 ± 0.0004 |
| 3 | 0.3 | 0.9779 ± 0.0005 |
| 4 | 0.4 | 0.9782 ± 0.0002 |
| 5 | 0.5 | 0.9781 ± 0.0002 |
- AUC Values for W Tagging Dataset
| Sr. No. | R0 | AUC |
|---|---|---|
| 1 | 0.1 | 0.9865 ± 0.0004 |
| 2 | 0.2 | 0.9864 ± 0.0004 |
| 3 | 0.3 | 0.9863 ± 0.0004 |
| 4 | 0.4 | 0.9868 ± 0.0004 |
| 5 | 0.5 | 0.9868 ± 0.0005 |
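For readers who want to reproduce numbers of this kind, the sketch below shows how an AUC can be computed from classifier scores and how a "mean ± spread" entry can be formed from repeated trainings. It assumes the quoted uncertainties are standard deviations over independent runs, which the tables do not state; the labels, scores, and per-run values are made up for illustration.

```python
from statistics import mean, stdev
from typing import List


def roc_auc(labels: List[int], scores: List[float]) -> float:
    """AUC as the chance that a random signal jet outscores a random background jet.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg),
    fine for a sketch but slow for millions of jets.
    """
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy scores: 1 = signal, 0 = background (illustrative values only).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)

# Hypothetical AUCs from three independent trainings at one R0 value.
run_aucs = [0.90, 0.91, 0.92]
summary = f"{mean(run_aucs):.2f} ± {stdev(run_aucs):.2f}"
```

Here eight of the nine signal-background pairs are ordered correctly, so the toy AUC is 8/9. For full test sets, a sorting-based implementation such as the one in scikit-learn is the usual choice.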
Hypergraph Odyssey - In the land of hypergraphs, our HMPNNs are fearless explorers, traversing multiple nodes, hyperedges, and labels.
Cosmic Radius Reckoning - Tune in for the cosmic dance as we test different values of R, controlling jet radius. Bigger isn't always better, and smaller isn't always wiser.
From deep within the heart of high-energy physics to the cosmos of machine learning, our journey opens realms of possibility:
- Jet Pioneering: Elevate QCD and non-QCD jet classification for enhanced high-energy physics experiments. Results that are out of this world!
- Collider Enchantment: Enrich collider event simulations with precise jet classification. It's like a magical touch to the particle orchestra.
- Anomaly Alchemy: Detect anomalies in collider data and unlock the secrets of new physics beyond the Standard Model. Spells of discovery are cast!
- Calibration Chronicles: Jet calibration gets a boost with the wizardry of HMPNNs, ensuring particle property determinations are on point.
- Innovation Spells: Our HMPNN saga inspires new machine learning techniques, reverberating beyond particle physics into diverse realms.
- Astounding Discrimination: The model soars with impressive accuracy and AUC in distinguishing top quarks from QCD jets. Its prowess lays a solid foundation for confident analysis.
- Physical Motivation: The model thrives on a physics-driven approach, upholding IRC safety. This ensures outcomes remain unwavering despite the twists and turns of collinear or soft emissions.
- Radiation Resilience: Deftly taming QCD radiation, the model maintains numerical stability. It doesn't flinch in the face of intricate complexities.
- Scalable Brilliance: Flexing its muscle, the model adapts effortlessly to jets of varying sizes. Its versatility extends to other jet species, promising adaptability in the evolving landscape.
- Guiding Insights: The model isn't just a black box; it's a window into the core features that underscore the art of discriminating top quarks from their QCD counterparts.
- Narrowed Horizons: Grounded in simulated data, the model's brilliance may dim when faced with the wild terrain of real-world scenarios. Caution is advised in generalization.
- Pattern Presumption: While based on divergent radiation patterns of top quarks and QCD jets, reality can sometimes paint a different picture, potentially curbing the model's versatility.
- Complexity Conundrum: The model might occasionally falter in the face of intricate jets, where the complexities weave a web that's tough to unravel.
- Computation Capers: As jet sizes swell and networks deepen, computational costs might rise, stretching the model's resource limits.
- Cryptic Predictions: Peering into the model's predictions might resemble deciphering an enigma. Interpretability can be elusive, demanding extra effort to decipher its inner workings.
Our analysis presents a multitude of potential use cases that extend beyond the realm of QCD and non-QCD jet classification. These applications underscore the significance of our findings and pave the way for broader advancements in particle physics research:
- Empowering High-Energy Physics Experiments: The precision of QCD and non-QCD jet classification holds immense value in high-energy physics experiments. Harnessing the capabilities of HMPNNs within jet analysis can elevate the accuracy and efficacy of machine learning-driven QCD investigations, yielding results that are not just insightful but also steadfastly dependable.
- Elevating Collider Event Simulations: The categorization of jets stands as a pivotal aspect of collider event simulations. Through our HMPNN-based methodology, we augment the fidelity of jet classification, ushering in a new era of simulations that encapsulate particle collisions with unprecedented accuracy.
- Unearthing Anomalies in Collider Data: By leveraging the prowess of HMPNNs, our approach reaches beyond mere jet classification to anomaly detection within collider data. Detecting these anomalous events offers the potential for breakthrough discoveries that transcend the bounds of the Standard Model.
- Enhancing Jet Calibration: Jet calibration hinges on precise jet classification, a cornerstone for the meticulous determination of particle properties. HMPNNs contribute to refined jet calibration, where classifications transcend accuracy, leading to more profound insights.
- Pioneering Novel Machine Learning Techniques: The introduction of HMPNNs into jet analysis fosters the emergence of innovative machine learning techniques. These techniques have the potential to reshape diverse arenas of particle physics research, expanding their horizon and impact.
Our findings resonate across various domains in particle physics research, including:
- The monumental Large Hadron Collider (LHC) experiments, exemplified by ATLAS and CMS, which grapple with vast datasets that demand meticulous analysis.
- Particle physics research initiatives exploring the enigmatic properties of the Higgs boson, dark matter, and the intriguing realm of supersymmetry. Here, accurate jet classification stands as a cornerstone.
- The frontiers of particle detector development, where the accuracy of particle collision simulations plays a pivotal role in scrutinizing the efficacy of newly devised detectors.
- Clone the repository: `git clone https://github.com/rajveer43/hep_ml.git`
- Navigate to the project realm: `cd hep_ml`
- Set up your mystical environment: `pip install -r requirements.txt`
- Explore the `images/` galaxy for captivating data explorations and spellbinding model training examples.
- Join the Cosmic Circle: A cosmic contribution is a beacon in our galactic journey. Engage through a cosmic pull request and intertwine your cosmic magic!
- Astro-Potion (Issue): For grand cosmic spells, step into the cosmic realm of issues to conjure discussions on cosmic ideas.
This dataset was created and hosted by the following organizations:
We extend our sincere gratitude to these organizations for their invaluable contributions to the field of particle physics and their support in making this dataset available to the research community.
This cosmic journey abides under the Cosmic License, granting cosmic sovereignty in wielding its magic!
This project is enchanted under the MIT License, allowing you to wield its powers with freedom!








