This repository relates to the publication:
The cell type specific molecular landscape of schizophrenia
doi: TBA (will be updated upon publication)
Here is a breakdown of where the analyses for the individual figure panels are located.
Several panels (Figs. 1a, 3d, 3e, 4c, 6a, and S7a) are schematic illustrations created manually using graphics software or a genome viewer and are therefore not generated by code.
| Figure | Code | Output (outputs/) | Description |
|---|---|---|---|
| Fig. 1 | main.R, DEG.R, DAC.R | Fig_1_*.pdf | Study design and characterization of generated RNA-seq and ATAC-seq data. |
| Fig. 2 | main.R, DAC.R | Fig_2_*.pdf | SCZ-associated changes in chromatin accessibility. |
| Fig. 3 | main.R | Fig_3_*.pdf | Linking enhancer-promoter interactions and epigenetic dysregulation to SCZ GWAS signal. |
| Fig. 4 | main.R, DEG.R | Fig_4_*.pdf | SCZ-associated changes in gene expression. |
| Fig. 5 | main.R, remacor.R | Fig_5_*.pdf | SCZ-associated changes in transcript expression. |
| Fig. 6 | main.R, qtl_analysis/ | Fig_6_*.pdf | Cell-type-specific eQTL analysis and GWAS-eQTL colocalization. |
| Fig. S1 | main.R | Fig_S1_*.pdf | Demographic and clinical characteristics of SCZ cases and controls. |
| Fig. S2 | main.R | Fig_S2_*.pdf | Quality control for RNA-seq and ATAC-seq data. |
| Fig. S3 | main.R, DEG.R, DAC.R | Fig_S3_*.pdf | Analysis of variance in the RNA-seq and ATAC-seq data. |
| Fig. S4 | main.R | Fig_S4_*.pdf | Comparison of our RNA-seq and ATAC-seq data with datasets previously generated using nuclei isolated by FANS. |
| Fig. S5 | main.R | Fig_S5_*.pdf | Comparison of signal for marker genes in RNA-seq (a) and promoter OCRs of marker genes in ATAC-seq (b) across cell types. |
| Fig. S6 | main.R | Fig_S6_*.pdf | Estimated cell type composition of SCZ cases and controls using the dTangle algorithm in RNA-seq (a) and ATAC-seq (b). |
| Fig. S7 | main.R, gene_modules.R | Fig_S7_*.pdf | Integrated co-expression and chromatin regulatory coupling across cell types. |
| Fig. S8 | main.R, gene_modules.R | Fig_S8_*.pdf | WGCNA module eigengene networks within each cell type. |
| Fig. S9 | main.R | Fig_S9_*.pdf | Concordance between differential gene expression results from this study and those from PsychAD. |
| Fig. S10 | qPCR_validation.R | Fig_S10_*.pdf | Validation of transcript-level RNA-seq findings by RT-qPCR in oligodendrocytes. |
Most analysis scripts in this repository require manual configuration of the ROOT directory, which must point to the local path of the downloaded GitHub repository. This is necessary for correct resolution of input data, intermediate files, and output directories.
In each script, ROOT is defined at the very beginning in the CONFIG section and is clearly marked with comments such as:
```r
# !!! FIXME: SET TO YOUR CUSTOM DIRECTORY !!!
ROOT <- "/path/to/MolecularProfiling"
```
Before running any script, users must update ROOT to their local repository path. Failure to do so will result in missing file or path errors. All scripts assume a consistent directory structure relative to ROOT, including inputs/, outputs/, and auxiliary resource folders.
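A quick sanity check before running the scripts can catch a wrong ROOT early. The sketch below is not part of the repository: `check_root` is a hypothetical helper, and it only assumes the top-level `inputs/` and `outputs/` folders named above.

```r
# Hypothetical helper (not part of the repository): report which of the
# expected top-level folders are missing under ROOT.
check_root <- function(root, expected = c("inputs", "outputs")) {
  if (!dir.exists(root)) {
    stop("ROOT does not exist: ", root)
  }
  missing <- expected[!dir.exists(file.path(root, expected))]
  if (length(missing) > 0) {
    warning("Missing under ROOT: ", paste(missing, collapse = ", "))
  }
  invisible(missing)
}

# Demonstration on a throwaway directory that contains only inputs/
demo_root <- file.path(tempdir(), "MolecularProfiling_demo")
dir.create(file.path(demo_root, "inputs"), recursive = TRUE, showWarnings = FALSE)
missing_dirs <- suppressWarnings(check_root(demo_root))
print(missing_dirs)
```

In practice one would call `check_root(ROOT)` right after setting ROOT in a script's CONFIG section.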
This repository does not ship with the raw or processed input data used for the analyses. Before running any script, the inputs/ directory must be populated manually by downloading the study data from Synapse. Please download and untar the full contents of the following Synapse folder directly into the inputs/ directory:
Synapse ID: syn62787384
URL: https://www.synapse.org/Synapse:syn62787384
After extraction, the directory structure is expected to be:
```
MolecularProfiling/
├── inputs/
│   └── <data files and subdirectories from Synapse>
├── outputs/
├── main.R
├── DEG.R
├── DAC.R
├── remacor.R
├── gene_modules.R
├── qtl_analysis/
└── helper_functions.R
```
All analysis scripts assume the exact filenames and subdirectory structure provided in the Synapse archive and resolve paths relative to ROOT/inputs.
Failure to download and extract these files correctly will result in missing-file errors and incomplete figure generation.
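If the Synapse download arrives as one or more tar archives, the extraction step can be done from R. This is a minimal sketch under assumptions: `extract_archives` is a hypothetical helper, the archive names and extensions are not specified in this README, and the archives are assumed to have been placed directly in `inputs/`.

```r
# Hypothetical helper (not part of the repository): extract every tar archive
# found directly in `inputs_dir` into that same directory.
extract_archives <- function(inputs_dir) {
  archives <- list.files(
    inputs_dir,
    pattern = "\\.(tar|tar\\.gz|tgz)$",  # assumed extensions
    full.names = TRUE
  )
  for (archive in archives) {
    # utils::untar detects gzip compression on its own
    utils::untar(archive, exdir = inputs_dir)
  }
  invisible(archives)
}

# Demonstration on an empty temporary directory: nothing to extract
demo_inputs <- tempfile("inputs_")
dir.create(demo_inputs)
extracted <- extract_archives(demo_inputs)
print(length(extracted))
```

With a configured ROOT, the call would be `extract_archives(file.path(ROOT, "inputs"))`.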
DEG.R and DAC.R perform the differential gene expression and differential chromatin accessibility analyses, respectively, including covariate exploration and selection (BIC, variance partitioning, t-SNE) and downstream analysis of dysregulated genes and OCRs. The workflow is described in the Methods sections "Differential chromatin accessibility analysis" and "Differential gene expression analysis".
remacor.R contains the code and helper functions used to carry out the remacor analysis in Figure 5. The workflow is described in the Methods section "Differential transcript analysis".
The qtl_analysis/ folder contains scripts for QTL detection with MMQTL and the analyses relevant to Figure 6. The workflow is described in the Methods sections "QTL identification" and "Colocalization of eQTLs with SCZ risk loci":
- 1_peer_factors.R: generates PEER factors from the expression matrix
- 2_prep_mmqtl_files.R: generates the residualized expression matrix using the PEER factors, and a BED file containing gene information
- 3_generate_mmqtl_run_script.sh: generates scripts for running MMQTL per chromosome
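To illustrate what residualizing an expression matrix on latent factors means, here is a minimal base-R sketch on simulated data. It is not the code in 2_prep_mmqtl_files.R: the `residualize` function, the toy matrices, and the factor names `PEER1`/`PEER2` are all assumptions made for illustration, and the factors are random numbers rather than real PEER output.

```r
# Toy data: 5 genes x 20 samples, plus 2 simulated "factors" per sample
set.seed(1)
n_samples <- 20
expr <- matrix(
  rnorm(5 * n_samples), nrow = 5,
  dimnames = list(paste0("gene", 1:5), paste0("s", 1:n_samples))
)
factors <- matrix(
  rnorm(n_samples * 2), ncol = 2,
  dimnames = list(colnames(expr), c("PEER1", "PEER2"))
)

# Hypothetical helper: regress each gene on the covariates (with intercept)
# and return the residuals as a genes x samples matrix.
residualize <- function(expr, covariates) {
  stopifnot(identical(colnames(expr), rownames(covariates)))
  fit <- lm(t(expr) ~ covariates)  # multi-response lm: one model per gene
  t(residuals(fit))
}

resid_expr <- residualize(expr, factors)
print(dim(resid_expr))
```

By construction, the residuals are uncorrelated with the supplied factors, which is the property that makes them suitable as input for downstream association testing.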
gene_modules.R performs the weighted gene co-expression network analysis and its integration with chromatin accessibility, generating Figs. S7–S8. The workflow is described in the Methods section "Weighted gene co-expression network analysis and integration with chromatin accessibility".
qPCR_validation.R contains the analysis of the RT-qPCR validation experiment, generating Fig. S10. Procedures are detailed in the Methods section "RT-qPCR validation experiment".
helper_functions.R contains utility functions sourced by the analysis scripts, including routines for gene set enrichment analysis (GSEA) and for differential analysis (covariate exploration, including BIC).
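As a rough illustration of BIC-based covariate exploration, the sketch below compares a base model against models that each add one candidate covariate. It uses simulated data and base R only; the variable names (`dx`, `age`, `batch`) are assumptions, and this is not the implementation in helper_functions.R.

```r
# Simulated metadata and a single simulated outcome (e.g. one gene)
set.seed(42)
n <- 60
meta <- data.frame(
  dx    = factor(rep(c("control", "SCZ"), each = n / 2)),
  age   = rnorm(n, mean = 60, sd = 10),
  batch = factor(sample(c("A", "B"), n, replace = TRUE))
)
meta$y <- 0.5 * (meta$dx == "SCZ") + 0.05 * meta$age + rnorm(n)

# Base model with the variable of interest only
base_fit <- lm(y ~ dx, data = meta)

# Add each candidate covariate in turn and record the BIC of the larger model
candidates <- c("age", "batch")
bic <- sapply(candidates, function(v) {
  BIC(update(base_fit, as.formula(paste(". ~ . +", v))))
})

# A negative difference means the covariate improves on the base model
bic_delta <- bic - BIC(base_fit)
print(bic_delta)
```

In an omics setting the same comparison would typically be repeated across many features and summarized, rather than judged on a single outcome.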