About

In this study, we present GeneMamba, a foundational model designed to advance single-cell analysis. For details, please see our preprint paper.

Installation

Install the requirements:

cd path/to/GeneMamba
conda create -n "genemamba" python=3.9.19
conda activate genemamba
pip install -r requirements.txt

If you encounter package conflicts, adjust the version of the conflicting library, or exclude it, to proceed.

We suggest using SLURM to enable parallel computing.

Make sure that nvcc is installed; the recommended CUDA version is 12.4.0.
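A quick way to confirm that the toolkit is visible is to parse the release number from the output of nvcc --version. The sketch below assumes the usual "release X.Y" line in that output; the helper names parse_nvcc_release and check_nvcc are illustrative and not part of GeneMamba.

```shell
# Extract the "X.Y" release number from `nvcc --version` output read on stdin.
parse_nvcc_release() {
    sed -n 's/.*release \([0-9][0-9.]*\).*/\1/p'
}

# Print the installed CUDA release, or return non-zero if nvcc is not on PATH.
check_nvcc() {
    if ! command -v nvcc >/dev/null 2>&1; then
        echo "nvcc not found: install the CUDA toolkit or load its module" >&2
        return 1
    fi
    nvcc --version | parse_nvcc_release
}
```

With the recommended toolkit active, check_nvcc should print 12.4, since nvcc reports only the major and minor release.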

If you are using SLURM, you may need to load the following modules:

module load cuda/12.4.0
module load git/2.33.1

Usage

For pretraining the GeneMamba model, we recommend using at least 200GB of memory and 4 GPUs to optimize the training process.

For downstream tasks, a machine with 10GB of memory and a single GPU should be sufficient.
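Before launching a job, it can help to check what the current machine offers. The following is a minimal sketch, assuming a Linux host with nvidia-smi available and assuming that "memory" above refers to system RAM; count_gpus and report_resources are hypothetical helper names.

```shell
# Count GPUs from `nvidia-smi -L` output read on stdin
# (nvidia-smi -L prints one "GPU N: ..." line per device).
count_gpus() {
    grep -c '^GPU [0-9]'
}

# Report visible GPUs and total system memory (Linux only: reads /proc/meminfo).
report_resources() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        echo "GPUs: $(nvidia-smi -L | count_gpus)"
    else
        echo "GPUs: nvidia-smi not found"
    fi
    # MemTotal is reported in kB; convert to GB for comparison with the figures above.
    awk '/^MemTotal:/ { printf "Memory: %d GB\n", $2 / 1024 / 1024 }' /proc/meminfo
}
```

On a SLURM cluster, run report_resources inside an allocated job rather than on the login node, since the login node usually does not reflect the resources the job will receive.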

Pretraining

Because the pretraining dataset is large, you can either:

Manually download the datasets using the cellxgene API;

or

Download the sample dataset from https://drive.google.com/drive/folders/1R_L3-ivnrsupHeDSkFugjCr1AwLEzdyL?usp=sharing and put it into the datasets/pretrain/processed folder to quickly run the experiment.

Then set model_path in pretrain/training.sh to your local path, and run the script:

cd pretrain
./training.sh
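On a SLURM cluster, the same script could be submitted as a batch job instead of being run interactively. The sketch below writes such a job script using the recommended 4 GPUs and 200GB of memory. The job name, wall-time limit, log file name, and the helper name write_pretrain_job are assumptions for illustration, and the GPU request syntax may differ on your cluster.

```shell
# Write a SLURM batch script that runs pretrain/training.sh to the path given as $1.
write_pretrain_job() {
    cat > "$1" <<'EOF'
#!/bin/bash
#SBATCH --job-name=genemamba-pretrain
#SBATCH --nodes=1
# 4 GPUs and 200GB of memory, as recommended for pretraining
#SBATCH --gres=gpu:4
#SBATCH --mem=200G
# Placeholder wall-time limit; adjust for your cluster
#SBATCH --time=48:00:00
#SBATCH --output=pretrain_%j.log

module load cuda/12.4.0

# Make `conda activate` available in a non-interactive shell
source "$(conda info --base)/etc/profile.d/conda.sh"
conda activate genemamba

cd pretrain
./training.sh
EOF
}
```

From the repository root, call write_pretrain_job pretrain_job.sh and then submit it with sbatch pretrain_job.sh. The job starts in the directory it was submitted from, which is why the script can use the relative path pretrain.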

Downstream tasks

The examples folder contains scripts to run the downstream tasks.

First download the data from the link https://drive.google.com/drive/folders/1R_L3-ivnrsupHeDSkFugjCr1AwLEzdyL?usp=sharing, and put all the datasets under the datasets/downstream folder.

Then, for each task, change the path arguments to your local paths and run the run.sh script. This writes all results to the results folder under each task directory.
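To run every task in one pass, a small loop can be used. This is a sketch that assumes each task lives in its own subdirectory of the examples folder with an executable run.sh inside it; the helper name run_all_tasks is hypothetical.

```shell
# Run run.sh in every task directory under the given root (default: examples).
run_all_tasks() {
    root="${1:-examples}"
    for script in "$root"/*/run.sh; do
        # Skip the unexpanded glob when no task directory contains run.sh.
        [ -f "$script" ] || continue
        task_dir=$(dirname "$script")
        echo "Running $task_dir"
        # The subshell keeps the cd local; a failed task is reported, not fatal.
        (cd "$task_dir" && ./run.sh) || echo "Task failed: $task_dir" >&2
    done
}
```

Call run_all_tasks from the repository root after the path arguments in each run.sh have been updated. Directories without a run.sh are skipped.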
