SpecPool: How does combining Laplacian and raw node embeddings influence GINs' performance in graph classification?
Candidate number: 1088143
This project explores the use of a novel readout function, SpecPool, in Graph Isomorphism Networks (GINs), as introduced in the submitted paper for the HT2025 Geometric Deep Learning Exam.
This repository contains the model implementations, the original experiments conducted in Jupyter notebooks, and a CLI for interacting with the implemented models.
The CLI allows users to customise various parameters for model training and testing, and to visualise the graph representations generated by the models.
To run this experiment, ensure you have the following dependencies installed:
- Python 3.7 or higher
- Required Python packages, which can be installed with pip:
```bash
pip install -r requirements.txt
```

The experiment can be run from the command line. Below is the basic usage pattern:

```bash
python run.py --config <path_to_config_file> [other optional arguments]
```

where `<path_to_config_file>` is the path to a YAML configuration file containing default parameters. You can specify any additional command-line arguments to override specific configurations. We provide a sample configuration at `./default_config.yaml`.
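For reference, a configuration file of roughly this shape would supply defaults for the flags documented below. The key names here simply mirror the CLI flags; treat this as an illustrative sketch rather than the exact contents of the shipped `./default_config.yaml`:

```yaml
# Illustrative only — keys mirror the CLI flags; the shipped
# default_config.yaml may name or structure them differently.
dataset: MUTAG        # MUTAG | NCI1 | PROTEINS
pooling_type: spec    # sum | mean | max | spec
pool1: mean           # first pooling function inside SpecPool
pool2: sum            # second pooling function inside SpecPool
lr: 0.001
weight_decay: 0.0005
epochs: 200
patience: 20
batch: 32
layers: 4
jk: sum               # sum | last
dropout: 0.2
```

The full set of flags is listed below.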
- `--config <path_to_yaml_file>`: Path to the YAML configuration file. This is a required argument unless all parameters are provided via the command line.
- `--pooling_type <pooling_type>`: Pooling type for the readout function. One of:
  - `sum`: sum pooling
  - `mean`: mean pooling
  - `max`: max pooling
  - `spec`: SpecPool (see the sketch after this list)
- `--pool1 <pooling_type>`: First pooling function in SpecPool. One of `sum`, `mean`, `max`.
- `--pool2 <pooling_type>`: Second pooling function in SpecPool. One of `sum`, `mean`, `max`.
- `--dataset <dataset_name>`: The dataset to use for training. One of `MUTAG`, `NCI1`, `PROTEINS`.
- `--lr <learning_rate>`: Learning rate for the optimizer (e.g., 0.001).
- `--weight_decay <weight_decay>`: Weight decay for regularization (e.g., 0.0005).
- `--epochs <num_epochs>`: Number of epochs to run the training (e.g., 200).
- `--patience <patience>`: Maximum number of consecutive epochs for which validation accuracy may decrease before training stops (e.g., 20).
- `--batch <batch_size>`: Batch size for mini-batching during training (e.g., 32).
- `--layers <layers>`: Model depth, specifying the number of layers in the model (e.g., 4).
- `--jk <jumping_knowledge_type>`: Type of jumping knowledge used in the model. One of:
  - `sum`: node representations from all GNN layers are summed.
  - `last`: only the node representations from the last GNN layer are used.
- `--dropout <dropout_ratio>`: Dropout ratio to apply in the model, a value in the range [0, 1] (e.g., 0.2).
- `--test_all`: If specified, all models are tested at once rather than just the configured model.
- `--visualize`: If specified, the graph representations generated by the model are visualised.
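To make the `spec`, `--pool1`/`--pool2`, and `--jk` options concrete, here is a minimal sketch of a SpecPool-style readout, assuming (as the project title suggests) that SpecPool pools Laplacian eigenvector features of the nodes with one function and the raw node embeddings with the other, then concatenates the results. The helper names below are hypothetical; the actual implementation lives in the `models` module and may differ:

```python
# Hedged sketch only: helper names and the exact combination rule are
# assumptions, not taken from the repository's `models` module.
import torch
from torch_geometric.nn import global_add_pool, global_max_pool, global_mean_pool

POOLS = {"sum": global_add_pool, "mean": global_mean_pool, "max": global_max_pool}

def jumping_knowledge(layer_outputs, jk="sum"):
    # layer_outputs: list of [num_nodes, d] tensors, one per GIN layer.
    # jk="sum" sums node representations across all layers; jk="last"
    # keeps only the final layer's representations.
    return torch.stack(layer_outputs).sum(dim=0) if jk == "sum" else layer_outputs[-1]

def specpool_readout(h, lap_pe, batch, pool1="mean", pool2="sum"):
    # h: [num_nodes, d] node embeddings after jumping knowledge.
    # lap_pe: [num_nodes, k] Laplacian eigenvector (spectral) node features,
    #         e.g. from torch_geometric.transforms.AddLaplacianEigenvectorPE.
    # batch: [num_nodes] graph-assignment vector for mini-batched graphs.
    spectral = POOLS[pool1](lap_pe, batch)     # [num_graphs, k]
    raw = POOLS[pool2](h, batch)               # [num_graphs, d]
    return torch.cat([spectral, raw], dim=-1)  # [num_graphs, k + d]
```

Note that `--pool1` and `--pool2` only apply when `--pooling_type spec` is selected; the other pooling types use a single standard readout over the node embeddings.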
Here are some examples of how to use the SpecPool experiment CLI:
- Training with the default configuration file:

  ```bash
  python run.py --config "./default_config.yaml"
  ```

- Training with custom parameters overriding the configuration file:

  ```bash
  python run.py --config "./default_config.yaml" --dataset PROTEINS --lr 0.001 --epochs 100 --batch 64 --pooling_type sum --test_all
  ```

- Training with SpecPool-specific pooling types:

  ```bash
  python run.py --config "./default_config.yaml" --pool1 mean --pool2 sum
  ```

- Visualising the graph representations after training and testing:

  ```bash
  python run.py --config "./default_config.yaml" --visualize
  ```
For the CLI:

- `models`: Python module for model implementations, training, and testing
- `utils`: Python module for miscellaneous utility functions (e.g., visualisations)
- `run.py`: Entry point for the program
- `default_config.yaml`: Sample config to be passed to the CLI
- `requirements.txt`: A list of dependencies required for the CLI
Our own experiments:
- `data`: datasets for training the models
- `visuals`: visualisations of our experimental results
- `jupyter-notebook`: Jupyter notebooks where our original experiments were conducted
- `imgs`: resources for figures used in the paper
SpecPool is released under the MIT License.