scCNASim is a Python package for simulating allele-specific somatic copy number alterations (CNAs) in single-cell and spatial transcriptomics data. It takes an existing alignment file, phased SNPs, and a clonal CNA profile as its main inputs, and outputs new alignments carrying the designated CNA signals and clonal structure.
The core idea involves processing haplotype-specific reads separately, including fitting and simulating haplotype-specific gene expression counts, followed by UMI (read) sampling.
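To make the per-haplotype flow concrete, here is a minimal sketch of the final UMI sampling step. It is not taken from scCNASim: the function `sample_umis`, the duplication rule, and the example haplotype labels and UMI strings are all hypothetical, and the target counts stand in for the output of the fitting and simulation step.

```python
import random


def sample_umis(umis, target_count, seed=0):
    """Return `target_count` UMI labels drawn from `umis`.

    Hypothetical rule (an assumption, not scCNASim's documented behaviour):
    downsample without replacement when fewer UMIs are needed; otherwise
    keep all originals and add suffixed copies of randomly chosen ones.
    """
    rng = random.Random(seed)
    if target_count <= len(umis):
        return rng.sample(umis, target_count)
    if not umis:
        return []  # nothing to copy from
    extra = rng.choices(umis, k=target_count - len(umis))
    copies = [f"{u}_dup{i}" for i, u in enumerate(extra)]
    return list(umis) + copies


# Each haplotype is handled on its own, with its own UMIs and target count.
hap_umis = {"A": ["AAAC", "AAAG", "AACT"], "B": ["CCGT", "CGTA"]}
hap_targets = {"A": 2, "B": 4}  # stand-ins for simulated counts
simulated = {h: sample_umis(hap_umis[h], hap_targets[h]) for h in hap_umis}
```

In this toy case haplotype A loses one UMI and haplotype B gains two, which is the kind of allele-specific imbalance a CNA is expected to produce.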
Release notes are at docs/release.rst.
Currently, only Python 3.11 (compatible) and 3.7 (not compatible) have been tested. We therefore strongly recommend installing the package with Python >= 3.11.
- Python >= 3.11
pip install -U git+https://github.com/hxj5/scCNASim
If you encounter an error
"configure: error: liblzma development files not found"
when installing scCNASim, the error actually comes from installing pysam.
If you are installing scCNASim in a conda env, you can fix the error by installing pysam via conda, i.e., run
conda config --add channels bioconda
conda config --add channels conda-forge
conda install pysam
and then re-install scCNASim. See Issue 3 for details.
The full manual is at docs/manual.rst.
For troubleshooting, please have a look at docs/FAQ.rst. We welcome issue reports for bugs, questions, and new feature requests.
The simulator has a precursor named scCNASimulator, which has been used in XClone to demonstrate its robustness to detect allele-specific CNAs.
scCNASimulator implements a naive strategy for CNA simulation: it multiplies the UMI/read counts directly by the copy ratio to generate the new counts of CNA features. In contrast, this new simulator models the counts with a specific probability distribution and encodes the copy ratio in the updated distribution parameters before generating new simulated counts.
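The difference between the two strategies can be illustrated with a small sketch. It is not the code of either tool: the function names are hypothetical, and a Poisson distribution is assumed purely for illustration, since the text above does not name the distribution that is fitted.

```python
import math
import random


def naive_scale(counts, copy_ratio):
    """Naive strategy: multiply each count by the copy ratio and round."""
    return [round(c * copy_ratio) for c in counts]


def sample_poisson(lam, rng):
    """Draw one Poisson variate with Knuth's algorithm (suits small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def model_based_scale(counts, copy_ratio, seed=0):
    """Model-based strategy, assuming Poisson counts for illustration.

    Fit the mean, encode the copy ratio in the updated parameter,
    then draw new counts from the updated distribution.
    """
    rng = random.Random(seed)
    fitted_mean = sum(counts) / len(counts)  # Poisson maximum-likelihood fit
    new_mean = fitted_mean * copy_ratio      # copy ratio enters the parameter
    return [sample_poisson(new_mean, rng) for _ in counts]


counts = [2, 4, 6]
scaled = naive_scale(counts, 1.5)           # deterministic rescaling
resampled = model_based_scale(counts, 1.5)  # random draws around the new mean
```

With the naive strategy every cell's count is a fixed multiple of its original value, whereas the model-based draws vary from cell to cell around the scaled mean, so the simulated counts keep count-like noise.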