DRIFT is a scalable diffusion framework that denoises expression profiles and integrates the spatial topology of spatial transcriptomics (ST) data into existing pretrained scRNA-seq and ST foundation models without additional retraining. Foundation models that do not explicitly model spatial information benefit from both the denoising and the spatial integration, while models that already model it benefit from DRIFT's denoised output. DRIFT constructs a spatial adjacency graph among tissue spots and applies a heat-kernel diffusion process that propagates gene-expression signals across local neighborhoods while preserving tissue boundaries. The resulting spatially coherent, biologically meaningful representations can be fed directly into pretrained foundation models. Because no model is retrained, the approach stays computationally scalable and accessible.
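To illustrate the graph-plus-diffusion idea described above, the following minimal sketch builds a k-nearest-neighbor graph over spot coordinates and applies the heat kernel exp(-tL) to an expression matrix. It is not DRIFT's implementation: the function names, the symmetrized kNN graph, the unnormalized graph Laplacian, and the parameters `k` and `t` are all assumptions made for this example, and it does not attempt to reproduce how DRIFT preserves tissue boundaries. See run_drift.ipynb for the actual workflow.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import laplacian
from scipy.sparse.linalg import expm_multiply
from scipy.spatial import cKDTree


def knn_adjacency(coords, k=6):
    """Symmetric, unweighted kNN adjacency over spot coordinates (hypothetical helper)."""
    coords = np.asarray(coords, dtype=float)
    n = coords.shape[0]
    # Query k + 1 neighbors because the nearest neighbor of each spot is itself.
    _, idx = cKDTree(coords).query(coords, k=k + 1)
    rows = np.repeat(np.arange(n), k)
    cols = idx[:, 1:].ravel()
    adj = csr_matrix((np.ones(n * k), (rows, cols)), shape=(n, n))
    # Keep an edge if either endpoint lists the other as a neighbor.
    return adj.maximum(adj.T)


def heat_diffuse(expr, adj, t=1.0):
    """Apply exp(-t * L) to a dense (spots x genes) matrix, L being the graph Laplacian."""
    expr = np.asarray(expr, dtype=float)
    lap = laplacian(adj, normed=False).tocsr()
    # expm_multiply computes exp(A) @ B without forming the dense matrix exponential.
    return expm_multiply(-t * lap, expr)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    coords = rng.uniform(0, 10, size=(200, 2))            # synthetic spot positions
    expr = rng.poisson(2.0, size=(200, 30)).astype(float)  # synthetic counts
    smoothed = heat_diffuse(expr, knn_adjacency(coords, k=6), t=0.5)
    print(smoothed.shape)
```

With a symmetric graph and the unnormalized Laplacian, this diffusion conserves each gene's total signal across spots while smoothing local variation; larger values of `t` smooth more aggressively.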
Running the DRIFT step requires the following libraries:
scanpy >= 1.9.1
numpy < 2.0.0
scipy
networkx
POT (Python Optimal Transport) >= 0.9.1
We suggest creating a dedicated environment (for example with conda) to run the code. You can create the required conda environment by running the following commands in order in the shell.
conda create --name <env_name> python=3.11
conda activate <env_name>
pip install "scanpy>=1.9.1"
pip install "POT>=0.9.1"
pip install "numpy<2"
pip install scipy
pip install networkx
pip install pycpd
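After installation, a short script can confirm that the environment meets the version constraints listed above. The sketch below uses only the Python standard library; the helper names are hypothetical, and the version parser is deliberately simple (it compares leading numeric components only and ignores pre-release suffixes).

```python
import re
from importlib.metadata import PackageNotFoundError, version

# (distribution name, comparison, bound) from the requirements list;
# None means the list states no version constraint.
REQUIREMENTS = [
    ("scanpy", ">=", "1.9.1"),
    ("numpy", "<", "2.0.0"),
    ("scipy", None, None),
    ("networkx", None, None),
    ("POT", ">=", "0.9.1"),
]


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def satisfies(installed, op, bound):
    """Check an installed version against a '>=' or '<' bound (or no bound)."""
    if op is None:
        return True
    a, b = parse_version(installed), parse_version(bound)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))  # pad so that 1.10 compares like 1.10.0
    b += (0,) * (width - len(b))
    return a >= b if op == ">=" else a < b


def check_requirements(requirements=REQUIREMENTS):
    """Map each distribution name to a short human-readable status string."""
    report = {}
    for name, op, bound in requirements:
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        if satisfies(installed, op, bound):
            report[name] = f"{installed} (ok)"
        else:
            report[name] = f"{installed} (does not satisfy {op} {bound})"
    return report


if __name__ == "__main__":
    for name, status in check_requirements().items():
        print(f"{name}: {status}")
```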
You will need additional libraries depending on the foundation model you plan to use. Please refer to that model's code for its requirements.
To see how to run the code, please check our notebook tutorials.
run_drift.ipynb for running DRIFT to obtain diffused inputs.
run_annotation.ipynb for running annotation code.
run_alignment.ipynb for running alignment code.
For clustering, you need the embeddings from your foundation model. The embeddings can then be passed to any clustering algorithm. In our work, we used the mclust library in R.
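If you prefer to stay in Python, the sketch below shows one way to cluster an embedding matrix. It uses k-means from SciPy as a stand-in, not the mclust model-based clustering used in our work, so results will generally differ. The function name, the standardization step, and the `n_clusters` and `seed` parameters are assumptions for this example; `embeddings` is expected to be a (spots x dimensions) array exported from your foundation model.

```python
import numpy as np
from scipy.cluster.vq import kmeans2


def cluster_embeddings(embeddings, n_clusters, seed=0):
    """Assign each row of an embedding matrix to one of n_clusters k-means clusters."""
    emb = np.asarray(embeddings, dtype=float)
    # Standardize each dimension so that no single axis dominates the distances.
    emb = (emb - emb.mean(axis=0)) / (emb.std(axis=0) + 1e-8)
    # minit="++" uses k-means++ seeding; a fixed seed makes the labels reproducible.
    _, labels = kmeans2(emb, n_clusters, minit="++", seed=seed)
    return labels


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fake_embeddings = rng.normal(size=(300, 16))  # placeholder for real model output
    labels = cluster_embeddings(fake_embeddings, n_clusters=5)
    print(np.bincount(labels))
```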
