SpecPool: How does combining Laplacian and raw node embeddings influence GNNs' performance in graph classification?

7angel4/SpecPool

SpecPool: How does combining Laplacian and raw node embeddings influence GINs' performance in graph classification?

Candidate number: 1088143

This project explores SpecPool, a novel readout function for graph isomorphism networks (GINs), introduced in the paper submitted for the HT2025 Geometric Deep Learning Exam.

This repository contains the model implementations, the Jupyter notebooks in which the original experiments were conducted, and a CLI for interacting with the implemented models. The CLI lets users customise parameters for model training and testing, and visualise the graph representations generated by the models.

Installation

To run this experiment, ensure you have the following dependencies installed:

  1. Python: This project requires Python 3.7 or later.
  2. Required Python Packages: These packages can be installed using pip.
pip install -r requirements.txt

Usage

Command-Line Interface (CLI)

The experiment can be run using the command line. Below is the basic usage pattern:

python run.py --config <path_to_config_file> [other optional arguments]

Here, <path_to_config_file> is the path to a YAML configuration file containing default parameters. Any additional command-line arguments override the corresponding values from that file. A sample configuration is provided at: ./default_config.yaml
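
The override behaviour described above can be sketched as follows. This is a minimal illustration, not the code in run.py: the FILE_DEFAULTS dictionary stands in for values loaded from the YAML file, its keys are assumed to mirror the CLI flag names, and the helper names build_parser and merge_config are hypothetical.

```python
import argparse

# Stand-in for values loaded from the YAML file. The keys are ASSUMED to
# mirror the CLI flag names; they are not taken from default_config.yaml.
FILE_DEFAULTS = {"dataset": "MUTAG", "lr": 0.001, "epochs": 200, "batch": 32}


def build_parser():
    """Declare flags with default=None so that unset flags can be detected."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--dataset", default=None)
    parser.add_argument("--lr", type=float, default=None)
    parser.add_argument("--epochs", type=int, default=None)
    parser.add_argument("--batch", type=int, default=None)
    return parser


def merge_config(file_defaults, argv):
    """Return the file defaults, overridden by any flags present in argv."""
    args = vars(build_parser().parse_args(argv))
    overrides = {key: value for key, value in args.items() if value is not None}
    return {**file_defaults, **overrides}


config = merge_config(FILE_DEFAULTS, ["--dataset", "PROTEINS", "--epochs", "100"])
print(config)
```

Giving every flag a default of None is what lets the merge step tell "not passed" apart from "passed with a value", so only the flags actually given on the command line replace the file's values.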

CLI Arguments

Arguments for the model configuration:

  • --config <path_to_yaml_file>: Path to the YAML configuration file. This is a required argument unless all parameters are provided via the command line.

  • --pooling_type <pooling_type>: Pooling type for the readout function. It can be one of the following:

    • sum: Sum pooling
    • mean: Mean pooling
    • max: Max pooling
    • spec: SpecPool
  • --pool1 <pooling_type>: First pooling function in SpecPool. It can be one of: sum, mean, max

  • --pool2 <pooling_type>: Second pooling function in SpecPool. It can be one of: sum, mean, max
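
As a rough illustration of what the sum, mean and max readouts compute, the sketch below pools per-node embedding vectors into a single graph-level vector using plain Python lists. The function name readout and the example numbers are made up for illustration, and the spec option is not covered because its definition is given in the paper rather than in this README.

```python
def readout(node_embeddings, pooling_type):
    """Pool a list of per-node embedding vectors into one graph-level vector."""
    columns = list(zip(*node_embeddings))  # group values by feature dimension
    if pooling_type == "sum":
        return [sum(col) for col in columns]
    if pooling_type == "mean":
        return [sum(col) / len(col) for col in columns]
    if pooling_type == "max":
        return [max(col) for col in columns]
    raise ValueError(f"unknown pooling type: {pooling_type}")


# Three nodes with two-dimensional embeddings (made-up numbers).
embeddings = [[1.0, 4.0], [2.0, 0.0], [3.0, 2.0]]
for name in ("sum", "mean", "max"):
    print(name, readout(embeddings, name))
```

All three readouts are invariant to the order of the nodes, which is why they are common choices for graph-level pooling.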

Arguments for the configuration file:

  • --dataset <dataset_name>: The dataset name to use for training. Choose from: MUTAG, NCI1, PROTEINS

  • --lr <learning_rate>: Learning rate for the optimizer (e.g., 0.001).

  • --weight_decay <weight_decay>: Weight decay for regularization (e.g., 0.0005).

  • --epochs <num_epochs>: Number of epochs to run the training (e.g., 200).

  • --patience <patience>: Maximum number of consecutive epochs for which validation accuracy may decrease before training stops (e.g., 20).

  • --batch <batch_size>: Batch size for mini-batching during training (e.g., 32).

  • --layers <layers>: Model depth, specifying the number of layers in the model (e.g., 4).

  • --jk <jumping_knowledge_type>: Type of jumping knowledge used in the model. Choose from:

    • sum: Node representations from all GNN layers are summed up.
    • last: Only the node representation from the last GNN layer is used.
  • --dropout <dropout_ratio>: Dropout ratio to apply in the model, a value in the range [0, 1] (e.g., 0.2).
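
The two --jk options can be illustrated with a small sketch over plain Python lists. It assumes that every layer produces node embeddings of the same dimension; the function name jumping_knowledge and the example values are hypothetical and are not taken from the repository's models module.

```python
def jumping_knowledge(layer_outputs, jk):
    """Combine per-layer node representations.

    layer_outputs[l][v] is the embedding (a list of floats) of node v
    after layer l. All layers are ASSUMED to share one embedding size.
    """
    if jk == "last":
        # Keep only the node representations from the final layer.
        return layer_outputs[-1]
    if jk == "sum":
        # Add up each node's representations across all layers, element-wise.
        num_nodes = len(layer_outputs[0])
        return [
            [sum(values) for values in zip(*(layer[v] for layer in layer_outputs))]
            for v in range(num_nodes)
        ]
    raise ValueError(f"unknown jumping knowledge type: {jk}")


# Two layers, two nodes, two-dimensional embeddings (made-up numbers).
layer_outputs = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[2.0, 2.0], [3.0, 5.0]],
]
print(jumping_knowledge(layer_outputs, "sum"))
print(jumping_knowledge(layer_outputs, "last"))
```

With sum, information from shallow layers reaches the readout directly, whereas last relies on the final layer alone to carry it.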

Additional options:

  • --test_all: If specified, all models are tested at once rather than just the configured model.

  • --visualize: If specified, the graph representations generated by the model will be visualized.

Example Usage

Here are some examples of how to use the SpecPool experiment CLI:

  1. Training with the default configuration file:

    python run.py --config "./default_config.yaml"
  2. Training with custom parameters overriding the configuration file:

    python run.py --config "./default_config.yaml" --dataset PROTEINS --lr 0.001 --epochs 100 --batch 64 --pooling_type sum --test_all
  3. Training with SpecPool-specific pooling types:

    python run.py --config "./default_config.yaml" --pool1 mean --pool2 sum
  4. Visualising the graph representations after training and testing:

    python run.py --config "./default_config.yaml" --visualize

Repository structure

For CLI:

  • models: Python module containing the model implementations and the code for training and testing them
  • utils: Python module for miscellaneous utility functions (e.g., visualisations)
  • run.py: Entry point for the program
  • default_config.yaml: Sample config to be passed to the CLI
  • requirements.txt: A list of dependencies required for the CLI

Our own experiments:

  • data: datasets for training the models
  • visuals: visualisations for our experimental results
  • jupyter-notebook: Jupyter notebooks in which our original experiments were conducted
  • imgs: resources for figures used in the paper

License

SpecPool is released under the MIT License.
