A spatial multi-omics integration framework that combines dual-graph encoding, cross-modal interaction-guided fusion, feature-graph refinement, and self-supervised structural regularization for clustering-oriented representation learning.
Spatial multi-omics technologies enable the joint measurement of complementary molecular modalities within intact tissues, providing new opportunities to characterize cellular heterogeneity and spatial organization in situ. However, integrating heterogeneous modalities for unsupervised clustering and spatial domain identification remains challenging because of modality-specific noise, heterogeneous feature structures, and the need to preserve spatial context. Existing methods often rely on coarse fusion strategies or insufficient structural constraints, limiting their ability to capture cross-modal dependencies and maintain coherent latent organization.
SPAMO addresses these challenges through:
- Dual-Graph Encoding: Per-modality 2-layer GCN encoders with adaptive spatial-feature graph blending (AdaptiveAdjFusion)
- Cross-Modal Interaction: Lightweight cross-modal attention enabling inter-modality information exchange, followed by gated MLP fusion
- Feature-Graph Refinement: Learnable parameterized adjacency matrices updated via EMA, allowing iterative refinement of feature graphs during training
- Self-Supervised Structural Regularization: Deep Graph Infomax (DGI) loss and spatial smoothness regularization to preserve graph topology without external labels
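To make the dual-graph idea concrete, here is a minimal NumPy sketch of blending a spatial graph with a feature graph and running one GCN propagation step on the result. It is an illustration under stated assumptions, not the code in spamo/model.py: the scalar sigmoid gate, the function names (`blend_graphs`, `gcn_layer`, `normalize_adj`), and the toy graphs are all hypothetical, and the actual AdaptiveAdjFusion module may blend the graphs differently.

```python
import numpy as np


def normalize_adj(adj):
    """Add self-loops, then apply symmetric normalization D^-1/2 (A + I) D^-1/2."""
    adj = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))  # degrees are >= 1 after self-loops
    return adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def blend_graphs(adj_spatial, adj_feature, logit):
    """Convex blend of two graphs. `logit` stands in for a learnable scalar (assumption)."""
    alpha = 1.0 / (1.0 + np.exp(-logit))
    return alpha * adj_spatial + (1.0 - alpha) * adj_feature, alpha


def gcn_layer(adj_norm, x, weight):
    """One GCN propagation step with ReLU: relu(A_norm @ X @ W)."""
    return np.maximum(adj_norm @ x @ weight, 0.0)


rng = np.random.default_rng(0)
n_spots, n_features, n_hidden = 6, 4, 3
x = rng.normal(size=(n_spots, n_features))

# Toy spatial graph: each spot is linked to its two ring neighbours.
adj_spatial = np.zeros((n_spots, n_spots))
for i in range(n_spots):
    j = (i + 1) % n_spots
    adj_spatial[i, j] = adj_spatial[j, i] = 1.0

# Toy feature graph: a random symmetric graph without self-loops.
adj_feature = (rng.random((n_spots, n_spots)) > 0.6).astype(float)
adj_feature = np.maximum(adj_feature, adj_feature.T)
np.fill_diagonal(adj_feature, 0.0)

adj_blend, alpha = blend_graphs(adj_spatial, adj_feature, logit=0.0)
weight = rng.normal(size=(n_features, n_hidden))
h = gcn_layer(normalize_adj(adj_blend), x, weight)
print(alpha, h.shape)
```

With `logit=0.0` the two graphs contribute equally; in a trained model the gate would move toward whichever graph better supports reconstruction and clustering.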
We evaluate SPAMO on benchmark datasets including Human Lymph Node, Mouse Brain, and simulated multi-modality settings. Across these datasets, SPAMO shows improved performance over strong baselines on the main clustering metrics and achieves competitive results across diverse settings.
Input (RNA + ADT/ATAC, or RNA + ADT + ATAC for 3-modal)
│
├── RobustEncoder (per modality)
│ ├── 2-layer GCN with dropout
│ ├── AdaptiveAdjFusion (learned spatial–feature graph blend)
│ └── LayerNorm
│
├── RobustFusionModule
│ ├── LightCrossModalAttention (bidirectional)
│ ├── Gated MLP fusion
│ └── Residual connection with average pooling
│
├── RobustDecoder (per modality)
│ └── Single-layer GCN reconstruction
│
└── Loss Functions
  ├── Weighted MSE reconstruction loss
  ├── Feature-graph Frobenius norm regularization (EMA target)
  ├── DGI self-supervised loss (node–global MI maximization)
  └── Spatial smoothness regularization
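The four loss terms in the diagram can be sketched in NumPy as follows. This is a hedged illustration of the standard form of each term, not the implementation in spamo/model.py or spamo/trainer.py: the function names, the EMA decay value, the bilinear DGI discriminator, and the way the terms are summed are assumptions. Only the default weights 0.1 and 0.01 come from the parameter table.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def weighted_mse(x, x_hat, weight):
    """Reconstruction loss for one modality, scaled by its weight."""
    return weight * np.mean((x - x_hat) ** 2)


def ema_update(target, current, decay=0.99):
    """Exponential moving average of the feature graph (decay is an assumed value)."""
    return decay * target + (1.0 - decay) * current


def graph_frobenius(adj_learned, adj_target):
    """Squared Frobenius distance between the learned graph and its EMA target."""
    return np.sum((adj_learned - adj_target) ** 2)


def dgi_loss(h_pos, h_neg, w):
    """DGI-style binary cross-entropy with a bilinear discriminator h^T W s."""
    summary = sigmoid(h_pos.mean(axis=0))  # global summary vector
    pos_logits = h_pos @ w @ summary       # real nodes should score high
    neg_logits = h_neg @ w @ summary       # corrupted nodes should score low
    # -log(sigmoid(p)) = logaddexp(0, -p); -log(1 - sigmoid(n)) = logaddexp(0, n)
    return 0.5 * (np.mean(np.logaddexp(0.0, -pos_logits))
                  + np.mean(np.logaddexp(0.0, neg_logits)))


def spatial_smoothness(z, adj_spatial):
    """tr(Z^T L Z) / sum(A): half the mean squared distance between neighbouring spots."""
    lap = np.diag(adj_spatial.sum(axis=1)) - adj_spatial
    return np.trace(z.T @ lap @ z) / max(adj_spatial.sum(), 1.0)


def total_loss(recon, graph_reg, dgi, smooth, dgi_weight=0.1, spatial_weight=0.01):
    """Assumed combination: a plain weighted sum of the four terms."""
    return recon + graph_reg + dgi_weight * dgi + spatial_weight * smooth


# Toy check: two neighbouring spots whose embeddings are one unit apart.
z = np.array([[0.0, 0.0], [1.0, 0.0]])
adj = np.array([[0.0, 1.0], [1.0, 0.0]])
smooth = spatial_smoothness(z, adj)
print(smooth)
```

In the toy check the squared distance between the two embeddings is 1, so the smoothness term is 0.5; identical neighbouring embeddings would give 0.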
Part of the preprocessing code is derived from SpatialGlue (Long et al. 2024).
- python >= 3.8
- torch >= 2.0
- anndata == 0.8.0
- numpy == 1.22.3
- pandas == 1.4.2
- rpy2 == 3.4.1
- scanpy == 1.9.1
- scikit-learn == 1.1.1
- scikit-misc == 0.2.0
- scipy == 1.8.1
- scvi == 0.6.8
The packages above are the main dependencies used for the experiments. Most PyTorch 2.0+ environments can run the code directly.
Please download the Human Lymph Node dataset (Long et al. 2024) and the spatial epigenome–transcriptome mouse brain dataset (Zhang et al. 2023) from https://zenodo.org/records/14591305, and unzip them into ./Data/.
The expected data directory layout:
Data/
├── HLN/
│ ├── adata_RNA.h5ad
│ └── adata_ADT.h5ad
├── Mouse_Brain/
│ ├── adata_RNA.h5ad
│ └── adata_peaks_normalized.h5ad
└── Simulation/
  ├── adata_RNA.h5ad
  ├── adata_ADT.h5ad
  └── adata_ATAC.h5ad
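Before running the benchmarks, it can help to confirm that the files landed where the layout above expects them. The helper below is a small standard-library sketch; `missing_files` and `EXPECTED_FILES` are hypothetical names that are not part of the SPAMO repository, and the file list simply mirrors the tree shown above.

```python
from pathlib import Path

# Expected files per dataset folder, copied from the directory layout above.
EXPECTED_FILES = {
    "HLN": ["adata_RNA.h5ad", "adata_ADT.h5ad"],
    "Mouse_Brain": ["adata_RNA.h5ad", "adata_peaks_normalized.h5ad"],
    "Simulation": ["adata_RNA.h5ad", "adata_ADT.h5ad", "adata_ATAC.h5ad"],
}


def missing_files(data_root="./Data"):
    """Return the expected files (relative to data_root) that are not present."""
    root = Path(data_root)
    return [
        (Path(dataset) / name).as_posix()
        for dataset, names in EXPECTED_FILES.items()
        for name in names
        if not (root / dataset / name).is_file()
    ]


missing = missing_files("./Data")
if missing:
    print("Missing files under ./Data:")
    for path in missing:
        print("  " + path)
else:
    print("All expected files are present.")
```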
Create and activate a Python virtual environment with Anaconda:
conda create -n spamo python=3.8
conda activate spamo
Install packages:
pip install torch scanpy scikit-learn numpy anndata rpy2 scikit-misc scipy scvi-tools
To reproduce all benchmark results (HLN, Mouse Brain, Simulation):
sh run.sh
The quantification results and visualizations will be saved in the ./results/ directory.
Human Lymph Node (RNA + ADT):
python main.py \
--file_fold ./Data/HLN --data_type 10x \
--n_clusters 10 --init_k 10 --KNN_k 20 \
--RNA_weight 5 --ADT_weight 5 \
--dgi_weight 0.1 --spatial_weight 0.01 \
--epochs_override 200 --optimizer_type sgd \
--random_seed 2025 \
--vis_out_path results/HLN.png \
--txt_out_path results/HLN.txt
Mouse Brain (RNA + ATAC):
python main.py \
--file_fold ./Data/Mouse_Brain --data_type Spatial-epigenome-transcriptome \
--n_clusters 14 --init_k 14 --KNN_k 20 \
--RNA_weight 1 --ADT_weight 10 \
--dgi_weight 0.1 --spatial_weight 0.01 \
--epochs_override 300 --optimizer_type adamw --lr_scheduler_type cosine \
--random_seed 2025 \
--vis_out_path results/MB.png \
--txt_out_path results/MB.txt
Simulation (RNA + ADT + ATAC, triple modality):
python main.py \
--file_fold ./Data/Simulation --data_type Simulation \
--n_clusters 5 --init_k 5 --KNN_k 20 \
--RNA_weight 5 --ADT_weight 5 \
--random_seed 2025 \
--vis_out_path results/Sim.png \
--txt_out_path results/Sim.txt
To apply SPAMO to your own dataset, ensure that the count matrices from the different omics layers are stored in anndata.AnnData format (.h5ad) and that they share the same number of spots/cells and the same spatial coordinates. Then run:
python main.py \
--file_fold <Path to AnnData directory> \
--data_type <10x | Spatial-epigenome-transcriptome | SPOTS | Stereo-CITE-seq | Simulation> \
--n_clusters <Number of clusters> \
--init_k <Estimated number of clusters> \
--KNN_k 20 \
--RNA_weight <Reconstruction weight for modality 1> \
--ADT_weight <Reconstruction weight for modality 2> \
--dgi_weight <DGI self-supervised loss weight, default 0.1> \
--spatial_weight <Spatial smoothness weight, default 0.01> \
--vis_out_path <Output visualization path, e.g., results/XXX.png> \
--txt_out_path <Output cluster labels path, e.g., results/XXX.txt>

| Parameter | Default | Description |
|---|---|---|
| --n_clusters | - | Number of spatial domains for clustering |
| --KNN_k | 20 | Number of neighbors for feature graph construction |
| --RNA_weight | 5 | Reconstruction weight for RNA modality |
| --ADT_weight | 5 | Reconstruction weight for ADT/ATAC modality |
| --dgi_weight | 0.1 | Weight of DGI self-supervised loss |
| --spatial_weight | 0.01 | Weight of spatial smoothness regularization |
| --dim_output | 64 | Latent embedding dimension |
| --dropout | 0.1 | Dropout rate |
| --epochs_override | 0 | Override training epochs (0 = use dataset default) |
| --optimizer_type | adamw | Optimizer: sgd, adam, or adamw |
| --lr_scheduler_type | none | LR scheduler: none, cosine, or plateau |
| --use_cross_attn | True | Enable cross-modal attention in fusion |
| --random_seed | 2025 | Random seed for reproducibility |
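For readers who want to drive SPAMO from their own scripts, the parameters above could be declared with argparse roughly as follows. This is a sketch that mirrors the documented names and defaults, not a copy of the parser in main.py: the argument types, the `required=True` on `--n_clusters`, and the `str2bool` helper are assumptions.

```python
import argparse


def str2bool(value):
    """Parse 'True'/'False'-style strings (hypothetical helper for --use_cross_attn)."""
    lowered = value.lower()
    if lowered in {"true", "1", "yes"}:
        return True
    if lowered in {"false", "0", "no"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    parser = argparse.ArgumentParser(description="SPAMO options (illustrative sketch)")
    # The table lists no default for --n_clusters; treating it as required is an assumption.
    parser.add_argument("--n_clusters", type=int, required=True)
    parser.add_argument("--KNN_k", type=int, default=20)
    parser.add_argument("--RNA_weight", type=float, default=5)
    parser.add_argument("--ADT_weight", type=float, default=5)
    parser.add_argument("--dgi_weight", type=float, default=0.1)
    parser.add_argument("--spatial_weight", type=float, default=0.01)
    parser.add_argument("--dim_output", type=int, default=64)
    parser.add_argument("--dropout", type=float, default=0.1)
    parser.add_argument("--epochs_override", type=int, default=0)
    parser.add_argument("--optimizer_type", choices=["sgd", "adam", "adamw"], default="adamw")
    parser.add_argument("--lr_scheduler_type", choices=["none", "cosine", "plateau"], default="none")
    parser.add_argument("--use_cross_attn", type=str2bool, default=True)
    parser.add_argument("--random_seed", type=int, default=2025)
    return parser


# Example: the Human Lymph Node settings override only a few defaults.
args = build_parser().parse_args(["--n_clusters", "10", "--optimizer_type", "sgd"])
print(args.n_clusters, args.optimizer_type, args.KNN_k)
```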
SPAMO/
├── main.py # Main entry: data loading, training, clustering, visualization
├── run.sh # Shell script to reproduce all benchmark results
├── bio_analysis.py # Downstream biological analyses (DEG, PAGA, GO enrichment, etc.)
├── cal_matrics.py # Metric evaluation script
├── clustering_utils.py # Split-and-merge clustering utilities
├── metric.py # Evaluation metrics (ARI, NMI, ASW, MAP, etc.)
├── spamo/ # Core package
│ ├── __init__.py
│ ├── model.py # SpaMO model (2-modality): Encoder, Decoder, Fusion, DGI, Spatial Reg.
│ ├── model_3m.py # SpaMO-3M model (3-modality extension)
│ ├── trainer.py # Training loop for 2-modality
│ ├── trainer_3m.py # Training loop for 3-modality
│ ├── preprocess.py # Data preprocessing & graph construction (2-modality)
│ ├── preprocess_3m.py # Data preprocessing & graph construction (3-modality)
│ ├── optimal_clustering.py # Optimal clustering utilities
│ └── utils.py # Clustering (mclust/leiden/louvain) & spatial smoothing
├── results/ # Output directory for results
└── Data/ # Dataset directory
Houcheng Su, Juning Feng, Weicai Long, Yusen Hou, Yanlin Zhang. SPAMO: Spatial Multi-Omics Integration via Dual-Graph Encoding and Cross-Modal Interaction. Information Hub, Hong Kong University of Science and Technology (Guangzhou).
[1] Long, Y.; Ang, K. S.; Sethi, R.; Liao, S.; Heng, Y.; van Olst, L.; Ye, S.; Zhong, C.; Xu, H.; Zhang, D.; et al. 2024. Deciphering spatial domains from spatial multi-omics with SpatialGlue. Nature Methods, 1–10.
[2] Zhang, D.; Deng, Y.; Kukanja, P.; Agirre, E.; Bartosovic, M.; Dong, M.; Ma, C.; Ma, S.; Su, G.; Bao, S.; et al. 2023. Spatial epigenome–transcriptome co-profiling of mammalian tissues. Nature, 616(7955): 113–122.