Snakemake pipeline for merging, normalisation and integration of multisample/multimodal single cell datasets
- Modularised workflow can be modified and/or extended for different experiment designs
- Add as a submodule in a bioinformatics project GitHub repository
git submodule add https://github.com/redwanfarooq/single_cell_multi single_cell_multi
- Update submodule to the latest version
git submodule update --remote single_cell_multi
- Global environment
- Specific modules
- R >=v4.3
- docopt v0.7.1
- logger v0.3.0
- qs v0.26.3
- hdf5r v1.3.11
- tidyverse v2.0.0
- furrr v0.3.1
- MatrixExtra v0.1.15
- Seurat v5.1.0
- Signac v1.14.0
- harmony v1.2.1
- Bioconductor v3.18
- DropletUtils
- batchelor
- ensembldb
- AnnotationDbi
- EnsDb.Hsapiens.v86
- org.Hs.eg.db
- GenomicRanges
- GenomeInfoDb
- glmGamPoi
- MACS2 >=v2.2.9
- scvi-tools >=v1.3.2
- Install software for global environment (requires Anaconda or Miniconda - see installation instructions)
- Download environment YAML
- Create new conda environment from YAML
conda env create -f snakemake.yaml
- Install software for specific module(s)
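The environment YAML referenced above might look something like the following sketch (channels, versions and pinned dependencies are illustrative assumptions; use the `snakemake.yaml` shipped with the repository):

```yaml
# Hypothetical sketch of snakemake.yaml - the repository's actual file may differ
name: snakemake
channels:
  - conda-forge
  - bioconda
dependencies:
  - python>=3.10
  - snakemake
```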
- Manually install required software from source and check that executables are available in PATH (using `which`) and/or
- Create new conda environments with required software from YAML (as above - download environment YAMLs) and/or
- Check that required software is available to load as environment modules (using `module avail`)
- Set up pipeline configuration file config/config.yaml (see comments in file for detailed instructions)
- Set up profile configuration file profile/config.yaml (see comments in file for detailed instructions)
- Activate global environment
conda activate snakemake
- Execute run.py in root directory
Pipeline requires the following input files/folders:
REQUIRED:
- Post-QC multimodal single cell count matrices in 10x-formatted HDF5 file for each sample
- Cell barcode metadata table in delimited file format (e.g. TSV, CSV) for each sample
- 10x-formatted indexed fragments file for each sample with ATAC data (if applicable)
- Input metadata table in delimited file format (e.g. TSV, CSV) with the following required fields (with headers):
- sample_id: sample ID
- hdf5: path to 10x-formatted HDF5 file
- metadata: path to cell barcode metadata table
- fragments: path to 10x-formatted indexed fragments file (if applicable)
- summits: path to MACS2 peak summits BED file (if applicable)
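An input metadata table in TSV format might look like the following (sample IDs and file paths are hypothetical, for illustration only):

```tsv
sample_id	hdf5	metadata	fragments	summits
sample1	data/sample1/filtered.h5	data/sample1/barcodes.tsv	data/sample1/fragments.tsv.gz	data/sample1/summits.bed
sample2	data/sample2/filtered.h5	data/sample2/barcodes.tsv	data/sample2/fragments.tsv.gz	data/sample2/summits.bed
```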
Output directory will be created at the specified location with subfolders containing the output of each step in the specified module(s)
Available module(s):
- default
- Add entry to module rule specifications file config/modules.yaml with module name and list of rule names
- Add additional rule definition files in modules/rules folder (if needed)
- Rule definition file must also assign a list of pipeline target files generated by the rule to a variable with the same name as the rule
- Rule definition file must have the same file name as the rule with the file extension .smk
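A rule definition file for a hypothetical rule named `example` could be sketched as follows (saved as modules/rules/example.smk; the rule logic, paths and the `SAMPLES` variable are illustrative assumptions, not taken from the repository):

```snakemake
# modules/rules/example.smk - hypothetical sketch
rule example:
    input:
        "results/{sample}/input.qs"
    output:
        "results/{sample}/example.qs"
    shell:
        "some_command {input} > {output}"

# List of pipeline target files generated by the rule, assigned to a
# variable with the same name as the rule (required by the pipeline);
# SAMPLES is assumed to be defined elsewhere in the workflow
example = [f"results/{sample}/example.qs" for sample in SAMPLES]
```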
- Execute run.py in root directory with the --update flag (needs to be repeated if there are any further changes to the module rule specifications in config/modules.yaml)
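For instance, a new module entry in config/modules.yaml might look like the following sketch (module and rule names here are hypothetical; see the comments in the file itself for the authoritative format):

```yaml
# config/modules.yaml - hypothetical entries mapping module names to rule lists
default:
  - merge
  - normalise
  - integrate
my_module:
  - merge
  - example
```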