This repository contains the code for the paper "ACES-GNN: Can Graph Neural Network Learn to Explain Activity Cliffs".
The code was developed and tested on Python 3.10.10 using CUDA 11.4 with the following Python packages installed:
- pytorch 1.12.1
- torch-geometric 2.3.0
- scikit-learn 1.2.2
- captum 0.7.0
- hyperopt 0.2.7
- rdkit 2023.9.4
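To sanity-check the environment before running anything, a quick import-and-version probe like the following can help (a minimal sketch; the package names are the ones listed above):

```python
# check_env.py -- verify the required packages import and report their versions
import torch
import torch_geometric
import sklearn
import captum
import hyperopt
import rdkit

for name, module in [
    ("pytorch", torch),
    ("torch-geometric", torch_geometric),
    ("scikit-learn", sklearn),
    ("captum", captum),
    ("hyperopt", hyperopt),
    ("rdkit", rdkit),
]:
    print(f"{name}: {getattr(module, '__version__', 'unknown')}")
print(f"CUDA available: {torch.cuda.is_available()}")
```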
Place your datasets in the Data/ directory. The expected structure is:
Data/
├── CHEMBL214_Ki/
├── [other_datasets]/
└── ...
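To confirm that main.py will find your data, you can list the dataset folders detected under Data/ (a minimal sketch; CHEMBL214_Ki is the dataset used in the example below, and any other folders are whatever datasets you have added):

```python
# list_datasets.py -- print the dataset folders found under Data/
from pathlib import Path

data_dir = Path("Data")
datasets = sorted(p.name for p in data_dir.iterdir() if p.is_dir())
print(f"Found {len(datasets)} dataset folder(s) under {data_dir}/:")
for name in datasets:
    print(f"  - {name}")  # e.g. CHEMBL214_Ki
```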
Configuration files should be placed in the configs/ directory, in the per-backbone subdirectory matching the chosen model (e.g., configs/nn_configs/ for the MPNN backbone). These files contain hyperparameters and model settings for different experiments.
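If you are unsure which configurations ship with the repository, a quick listing of the configs/ tree gives an overview (a minimal sketch; the *_configs folder names come from the project layout shown further below, and the file format inside them depends on how the configs are stored):

```python
# list_configs.py -- show the configuration files available per GNN backbone
from pathlib import Path

for backbone_dir in sorted(Path("configs").glob("*_configs")):
    files = sorted(f.name for f in backbone_dir.iterdir() if f.is_file())
    print(f"{backbone_dir.name}: {files}")
```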
To run experiments, use the main script with appropriate parameters (see parsing.py under XACs/utils/ for details):
python main.py --dataset [dataset_name] \
--config_dir [config_directory] \
--data_dir [data_directory] \
--model_dir [model_directory] \
--loss [loss_type] \
--sim_threshold [similarity_threshold] \
--dist_threshold [potency_distance_threshold] \
--conv_name [backbone]
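For orientation, the sketch below mirrors this flag surface with argparse. The types, help strings, and the 'gat'/'gine' backbone names are illustrative assumptions; the authoritative parser lives in XACs/utils/parsing.py.

```python
# cli_sketch.py -- illustrative mirror of main.py's command-line flags
import argparse

parser = argparse.ArgumentParser(description="ACES-GNN training/evaluation (sketch)")
parser.add_argument("--dataset", type=str, help="dataset folder under Data/, e.g. CHEMBL214_Ki")
parser.add_argument("--config_dir", type=str, help="directory with per-backbone config files")
parser.add_argument("--data_dir", type=str, help="root directory containing the datasets")
parser.add_argument("--model_dir", type=str, help="where model checkpoints are written")
parser.add_argument("--loss", type=str, help="'MSE' or 'MSE+direction'")
parser.add_argument("--sim_threshold", type=float, help="similarity threshold for cliff pairs")
parser.add_argument("--dist_threshold", type=float, help="potency-distance threshold for cliff pairs")
parser.add_argument("--conv_name", type=str, help="GNN backbone, e.g. 'nn' (MPNN), 'gat', 'gine'")
args = parser.parse_args()
print(vars(args))
```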
Train and test an MPNN with the supervision loss ('MSE+direction') on the CHEMBL214_Ki dataset:
python main.py --dataset 'CHEMBL214_Ki' \
--config_dir './configs/nn_configs' \
--data_dir 'Data' \
--model_dir './checkpoints/' \
--loss 'MSE+direction' \
--sim_threshold 0.9 \
--dist_threshold 1.0 \
--conv_name 'nn'
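For context on the two thresholds: sim_threshold and dist_threshold control which compound pairs are treated as activity cliffs (high structural similarity combined with a large potency gap). The snippet below is a minimal illustration only, assuming Tanimoto similarity over Morgan fingerprints and pKi units for potency; the pairing logic actually used by the code lives in the XACs package.

```python
# cliff_pair_sketch.py -- illustrate what sim_threshold / dist_threshold mean
# (assumes Tanimoto similarity on Morgan fingerprints; the XACs code is authoritative)
from rdkit import Chem
from rdkit.Chem import AllChem, DataStructs

def is_cliff_pair(smiles_a, smiles_b, pki_a, pki_b,
                  sim_threshold=0.9, dist_threshold=1.0):
    fp_a = AllChem.GetMorganFingerprintAsBitVect(Chem.MolFromSmiles(smiles_a), 2, 2048)
    fp_b = AllChem.GetMorganFingerprintAsBitVect(Chem.MolFromSmiles(smiles_b), 2, 2048)
    similarity = DataStructs.TanimotoSimilarity(fp_a, fp_b)
    potency_gap = abs(pki_a - pki_b)
    return similarity >= sim_threshold and potency_gap >= dist_threshold
```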
For reproducing the full experiments from the paper, use the run_task.py script, which automates running experiments across multiple datasets and model configurations:
python run_task.py
Important: before running, update the conda environment name in run_task.py:
sub_sh.write('conda activate YOUR_ENVIRONMENT_NAME\n')  # Change to your environment name
This script will:
- Run experiments with different GNN backbones (GAT, GINE, MPNN)
- Test both loss functions ('MSE' and 'MSE+direction')
- Process all datasets defined in XACs.utils.const.DATASETS
- Submit jobs to the SLURM scheduler (modify the SLURM parameters as needed for your cluster)
- Organize results in the ./results/results_cv/ and ./results/results_cv_x/ directories
If you don't have SLURM, you can modify run_task.py to run experiments sequentially by replacing the subprocess.call(['sbatch', ...]) lines with direct execution, as sketched below.
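A minimal sketch of such a sequential driver, assuming the dataset list is importable from XACs.utils.const and that the flag values mirror the examples above (the backbone names, thresholds, and directory paths are taken from this README and may need adjusting to match your run_task.py):

```python
# run_sequential.py -- run the experiments one after another instead of via SLURM
import subprocess

from XACs.utils.const import DATASETS  # datasets used in the paper's experiments

BACKBONES = ["nn", "gat", "gine"]   # assumed conv_name values (MPNN / GAT / GINE)
LOSSES = ["MSE", "MSE+direction"]

for dataset in DATASETS:
    for backbone in BACKBONES:
        for loss in LOSSES:
            cmd = [
                "python", "main.py",
                "--dataset", dataset,
                "--config_dir", f"./configs/{backbone}_configs",
                "--data_dir", "Data",
                "--model_dir", "./checkpoints/",
                "--loss", loss,
                "--sim_threshold", "0.9",
                "--dist_threshold", "1.0",
                "--conv_name", backbone,
            ]
            print("Running:", " ".join(cmd))
            subprocess.run(cmd, check=True)
```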
The notebooks/ folder contains Jupyter notebooks for analyzing experimental results:
- Data visualization and statistical analysis
- Performance comparison across different models
- Activity cliff explanation analysis
- Reproduction of paper figures and tables
To use the notebooks:
jupyter notebook notebooks/
The repository is organized as follows:
XACs/
├── main.py # Main entrypoint (cross-validation train/eval)
├── run_task.py # Batch runner (SLURM submission helper)
├── hypertune.py # Hyperparameter tuning (hyperopt + grid search)
├── requirement.txt # Python dependencies (see Dependencies section)
├── Data/ # Dataset directory (place datasets here)
├── configs/ # Per-backbone configuration files
│ ├── nn_configs/
│ ├── gat_configs/
│ └── gine_configs/
├── results/ # Outputs/checkpoints (e.g., results_cv/, results_cv_x/)
├── notebooks/ # Jupyter notebooks for analysis/plots
├── XACs/ # Core library code (models, training, evaluation, utilities)
│ ├── attribution/ # Attribution methods (e.g., GradCAM)
│ ├── models/ # GNN backbones/architectures
│ └── utils/ # CLI parsing, constants, helpers
└── README.md
- Model checkpoints are saved in the --model_dir directory
- Training logs and results are typically saved alongside the checkpoints
- Batch experiment results are organized in ./results/ with subdirectories for different loss functions and datasets
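To get a quick overview of what a finished run produced, a short walk over the results tree can help (a minimal sketch; it only lists files and makes no assumptions about their format):

```python
# inspect_results.py -- list everything written under ./results/ after a run
from pathlib import Path

results_dir = Path("./results")
for path in sorted(results_dir.rglob("*")):
    if path.is_file():
        size_kb = path.stat().st_size / 1024
        print(f"{path.relative_to(results_dir)}  ({size_kb:.1f} KB)")
```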