
Companion code for "Massively parallel genomic perturbations with multi-target CRISPR reveal new insights on Cas9 activity and DNA damage responses at endogenous sites."


rogerzou/multitargetCRISPR


Analysis software for multi-target CRISPR

Companion code for:

Zou, R.S., Marin-Gonzalez, A., Liu, Y., Liu, H.B., Shen, L., Dveirin, R., Luo, J.X., Kalhor, R. and Ha, T. Massively parallel genomic perturbations with multi-target CRISPR reveal new insights on Cas9 activity and DNA damage responses at endogenous sites. bioRxiv (2022). https://www.biorxiv.org/content/10.1101/2022.01.18.476836v1

Software requirements

  • Anaconda Python 3.7 (the Anaconda Python distribution includes the required NumPy and SciPy libraries)
  • pysam
  • bowtie2
  • samtools
  • Ensure that both samtools and bowtie2 are added to path and can be called directly from bash
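
Before running any of the pipeline, it can help to confirm that the two command-line tools really are reachable. The following is a minimal sketch, not part of the repository: the helper name `missing_tools` is hypothetical, and it only checks that the executables can be found on PATH, not their versions.

```python
import shutil

# Command-line tools the README asks to be callable directly from bash.
REQUIRED_TOOLS = ("samtools", "bowtie2")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names of tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("samtools and bowtie2 are both callable.")
```

If either tool is reported as missing, add its install directory to PATH and open a new shell before retrying.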

Data requirements

Installation

  1. Download sequencing reads in FASTQ format from SRA
  2. Download the prebuilt bowtie2 indices for human hg19 and hg38 genome assemblies
    • Human hg38
    • Human hg19
    • Extract from archive, move to the corresponding folders named hg38_bowtie2/ and hg19_bowtie2/
  3. Download the human hg19 and hg38 genome assemblies in FASTA format
    • hg38.fa
    • hg19.fa
    • Extract from archive, move to the corresponding folders named hg38_bowtie2/ and hg19_bowtie2/
  4. Generate FASTA file indices
    • samtools faidx hg38_bowtie2/hg38.fa
    • samtools faidx hg19_bowtie2/hg19.fa
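
Step 4 can also be driven from Python. The sketch below assumes the folder layout described in the steps above (hg38_bowtie2/hg38.fa and hg19_bowtie2/hg19.fa under one root directory). The function names are hypothetical and not taken from the repository; the only external command used is the `samtools faidx` call already shown in step 4.

```python
import subprocess
from pathlib import Path

# Genome assemblies named in the installation steps.
GENOMES = ("hg38", "hg19")


def fasta_path(genome, root="."):
    """Expected FASTA location, e.g. hg38_bowtie2/hg38.fa (layout assumed from the steps above)."""
    return Path(root) / f"{genome}_bowtie2" / f"{genome}.fa"


def faidx_command(genome, root="."):
    """Build the same 'samtools faidx <fasta>' command as in step 4."""
    return ["samtools", "faidx", str(fasta_path(genome, root))]


def index_genomes(root="."):
    """Index every FASTA file that is present; return the genomes that were indexed."""
    indexed = []
    for genome in GENOMES:
        fasta = fasta_path(genome, root)
        if not fasta.is_file():
            print(f"Skipping {genome}: {fasta} not found")
            continue
        # check=True raises CalledProcessError if samtools exits with an error.
        subprocess.run(faidx_command(genome, root), check=True)
        indexed.append(genome)
    return indexed
```

Calling `index_genomes()` from the directory that contains both folders should leave a `.fai` file next to each FASTA file, which is what `samtools faidx` writes.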

Usage

Bash scripts are used to automate the processing of sequencing data.

Python scripts perform the analyses featured in the manuscript. They are named script_*_*.py, for example script_1_putative.py.
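
To see which analysis scripts are present in a checkout, the naming pattern can be matched directly. This is a small sketch under one assumption: that the middle field of each name is a number, as in script_1_putative.py, the only example given here. The helper `list_analysis_scripts` is hypothetical and not part of the repository.

```python
import re
from pathlib import Path

# Assumed naming scheme: script_<number>_<label>.py, based on script_1_putative.py.
SCRIPT_PATTERN = re.compile(r"^script_(\d+)_(.+)\.py$")


def list_analysis_scripts(repo_dir="."):
    """Return (number, label, filename) tuples for matching scripts, in numeric order."""
    found = []
    for path in Path(repo_dir).glob("script_*_*.py"):
        match = SCRIPT_PATTERN.match(path.name)
        if match:
            found.append((int(match.group(1)), match.group(2), path.name))
    return sorted(found)


if __name__ == "__main__":
    for number, label, filename in list_analysis_scripts():
        print(f"{number:>3}  {label:<30} {filename}")
```

Sorting on the parsed integer keeps script_10 after script_2, which a plain alphabetical listing would not.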

List of mgRNA sequences

The target sites for the 'GG', 'CT', and 'TA' mgRNA sequences, along with their hg38 genomic context, are listed in mgRNA_target_sites.xlsx.
