DiseaseNeuroGenomics/molecular_profiling

Molecular Profiling

This repository relates to the publication:
The cell type specific molecular landscape of schizophrenia

doi: TBA (will be updated upon publication)

Links between figure numbers and corresponding analytical scripts

Here is a breakdown of where the analyses for the individual figure panels are located.
Several panels (Figs. 1a, 3d, 3e, 4c, 6a, and S7a) are schematic illustrations created manually using graphics software or a genome viewer and are therefore not generated by code.

| Figure | Code | Output (in outputs/) | Description |
|---|---|---|---|
| Fig. 1 | main.R, DEG.R, DAC.R | Fig_1_*.pdf | Study design and characterization of generated RNA-seq and ATAC-seq data. |
| Fig. 2 | main.R, DAC.R | Fig_2_*.pdf | SCZ-associated changes in chromatin accessibility. |
| Fig. 3 | main.R | Fig_3_*.pdf | Linking enhancer-promoter interactions and epigenetic dysregulation to SCZ GWAS signal. |
| Fig. 4 | main.R, DEG.R | Fig_4_*.pdf | SCZ-associated changes in gene expression. |
| Fig. 5 | main.R, remacor.R | Fig_5_*.pdf | SCZ-associated changes in transcript expression. |
| Fig. 6 | main.R, qtl_analysis/ | Fig_6_*.pdf | Cell-type-specific eQTL analysis and GWAS-eQTL colocalization. |
| Fig. S1 | main.R | Fig_S1_*.pdf | Demographic and clinical characteristics of SCZ cases and controls. |
| Fig. S2 | main.R | Fig_S2_*.pdf | Quality control for RNA-seq and ATAC-seq data. |
| Fig. S3 | main.R, DEG.R, DAC.R | Fig_S3_*.pdf | Analysis of variance in the RNA-seq and ATAC-seq data. |
| Fig. S4 | main.R | Fig_S4_*.pdf | Comparison of our RNA-seq and ATAC-seq data with datasets previously generated using nuclei isolated by FANS. |
| Fig. S5 | main.R | Fig_S5_*.pdf | Comparison of signal for marker genes in RNA-seq (a) and promoter OCRs of marker genes in ATAC-seq (b) across cell types. |
| Fig. S6 | main.R | Fig_S6_*.pdf | Estimated cell type composition of SCZ cases and controls using the dTangle algorithm in RNA-seq (a) and ATAC-seq (b). |
| Fig. S7 | main.R, gene_modules.R | Fig_S7_*.pdf | Integrated co-expression and chromatin regulatory coupling across cell types. |
| Fig. S8 | main.R, gene_modules.R | Fig_S8_*.pdf | WGCNA module eigengene networks within each cell type. |
| Fig. S9 | main.R | Fig_S9_*.pdf | Concordance between differential gene expression results from this study and those from PsychAD. |
| Fig. S10 | qPCR_validation.R | Fig_S10_*.pdf | Validation of transcript-level RNA-seq findings by RT-qPCR in oligodendrocytes. |

Repository configuration (important)

⚠️ Required configuration step

Most analysis scripts in this repository require manual configuration of the ROOT directory, which must point to the local path of the downloaded GitHub repository. This is necessary for correct resolution of input data, intermediate files, and output directories.

In each script, ROOT is defined at the very beginning in the CONFIG section and is clearly marked with comments such as:

# !!! FIXME: SET TO YOUR CUSTOM DIRECTORY !!!
ROOT <- "/path/to/MolecularProfiling"

Before running any script, users must update ROOT to their local repository path. Failure to do so will result in missing-file or path errors. All scripts assume a consistent directory structure relative to ROOT, including inputs/, outputs/, and auxiliary resource folders.
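
As an illustration of how paths can be resolved relative to ROOT, the following minimal sketch fails early when ROOT is wrong. The function name resolve_dirs is hypothetical and is not part of the repository scripts; only the inputs/ and outputs/ subfolders named above are assumed.

```r
# !!! FIXME: SET TO YOUR CUSTOM DIRECTORY !!!
ROOT <- "/path/to/MolecularProfiling"

# Hypothetical helper (not part of the repository): stop with a clear
# message if ROOT is wrong, then build the paths the scripts rely on.
resolve_dirs <- function(root) {
  if (!dir.exists(root)) {
    stop("ROOT does not exist: ", root, call. = FALSE)
  }
  root <- normalizePath(root)
  list(
    root    = root,
    inputs  = file.path(root, "inputs"),
    outputs = file.path(root, "outputs")
  )
}

# Usage, once ROOT has been set to a real directory:
# dirs <- resolve_dirs(ROOT)
# dirs$inputs
```

Calling such a check at the top of a session turns a later, harder-to-trace missing-file error into an immediate message about ROOT.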

Input data availability and setup (required)

⚠️ Required data download

This repository does not ship with the raw or processed input data used for the analyses. Before running any script, the inputs/ directory must be populated manually by downloading the study data from Synapse. Please download and untar the full contents of the following Synapse folder directly into the inputs/ directory:

Synapse ID: syn62787384
URL: https://www.synapse.org/Synapse:syn62787384

After extraction, the directory structure is expected to be:

MolecularProfiling/
├── inputs/
│   ├── <data files and subdirectories from Synapse>
├── outputs/
├── main.R
├── DEG.R
├── DAC.R
├── remacor.R
├── gene_modules.R
├── qtl_analysis/
└── helper_functions.R

All analysis scripts assume the exact filenames and subdirectory structure provided in the Synapse archive and resolve paths relative to ROOT/inputs. Failure to download and extract these files correctly will result in missing-file errors and incomplete figure generation.
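
The expected layout can be checked before running anything. The sketch below is an assumption-laden illustration: the helper names missing_entries and inputs_populated are hypothetical, and the list of entries is copied from the directory tree above rather than from the Synapse archive, whose contents are not enumerated here.

```r
# Entries from the directory tree above; TRUE marks a directory.
expected <- c(
  "inputs" = TRUE, "outputs" = TRUE, "qtl_analysis" = TRUE,
  "main.R" = FALSE, "DEG.R" = FALSE, "DAC.R" = FALSE,
  "remacor.R" = FALSE, "gene_modules.R" = FALSE,
  "helper_functions.R" = FALSE
)

# Hypothetical helper: names of expected entries that are absent under root.
missing_entries <- function(root, expected) {
  paths <- file.path(root, names(expected))
  is_dir <- dir.exists(paths)
  is_file <- file.exists(paths) & !is_dir
  present <- ifelse(expected, is_dir, is_file)
  names(expected)[!present]
}

# Hypothetical helper: TRUE if inputs/ contains at least one file.
inputs_populated <- function(root) {
  length(list.files(file.path(root, "inputs"), recursive = TRUE)) > 0
}

# Usage, with ROOT set as described in the configuration section:
# missing_entries(ROOT, expected)
# inputs_populated(ROOT)
```

An empty result from missing_entries together with TRUE from inputs_populated only shows that the skeleton is in place; it does not verify that every file from the Synapse folder was extracted.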

Description of analytical scripts

DEG.R & DAC.R

Differential gene expression (DEG.R) and differential chromatin accessibility (DAC.R) analyses, including covariate exploration and selection (BIC, variance partitioning, t-SNE) and downstream analysis of dysregulated genes and OCRs. The workflows are described in the Methods sections Differential gene expression analysis and Differential chromatin accessibility analysis.
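
To illustrate what BIC-based covariate selection means in general, here is a minimal forward-selection sketch on simulated data for a single outcome. It is not the procedure implemented in DEG.R or DAC.R: the function forward_bic, the covariate names, and the simulated effect sizes are all assumptions made for the example.

```r
set.seed(1)
n <- 200

# Simulated covariates (hypothetical names) and one simulated outcome
covs <- data.frame(
  age   = rnorm(n, mean = 60, sd = 10),
  sex   = factor(sample(c("F", "M"), n, replace = TRUE)),
  pmi   = rnorm(n, mean = 12, sd = 4),
  noise = rnorm(n)
)
y <- 0.05 * covs$age + 1.0 * (covs$sex == "M") + rnorm(n, sd = 0.5)

# Greedy forward selection: at each step add the covariate that lowers
# BIC the most, and stop when no addition lowers BIC.
forward_bic <- function(y, covs) {
  dat <- data.frame(y = y, covs)
  selected <- character(0)
  best_bic <- BIC(lm(y ~ 1, data = dat))
  repeat {
    candidates <- setdiff(names(covs), selected)
    if (length(candidates) == 0) break
    bics <- sapply(candidates, function(v) {
      BIC(lm(reformulate(c(selected, v), response = "y"), data = dat))
    })
    if (min(bics) >= best_bic) break
    best_bic <- min(bics)
    selected <- c(selected, names(which.min(bics)))
  }
  selected
}

selected <- forward_bic(y, covs)
print(selected)
```

In a genome-wide setting the criterion would have to be aggregated across many genes or OCRs; how the repository does that is described in the Methods and is not reproduced here.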

remacor.R

This file contains code and helper functions used to carry out the remacor analysis in Figure 5. The workflow is described in the Methods section Differential transcript analysis.

qtl_analysis

This folder contains scripts for QTL detection with MMQTL and the analyses relevant to Figure 6. The workflow is described in the Methods sections QTL identification and Colocalization of eQTLs with SCZ risk loci:

  • 1_peer_factors.R: generates PEER factors from the expression matrix
  • 2_prep_mmqtl_files.R: generates a residualized expression matrix using the PEER factors, and a BED file containing gene information
  • 3_generate_mmqtl_run_script.sh: generates scripts for running MMQTL per chromosome
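
The residualization step can be pictured as regressing each gene on the hidden factors and keeping the residuals. The sketch below uses simulated factors as a stand-in for PEER output and a hypothetical function residualize; it illustrates the general technique and is not the code in 2_prep_mmqtl_files.R.

```r
set.seed(1)
n_genes <- 5
n_samples <- 50
k <- 3

# Stand-in for PEER factors (samples x factors); simulated, not PEER output
factors <- matrix(
  rnorm(n_samples * k), n_samples, k,
  dimnames = list(paste0("S", seq_len(n_samples)), paste0("PEER", seq_len(k)))
)

# Simulated expression (genes x samples) driven by the factors plus noise
loadings <- matrix(rnorm(k * n_genes), k, n_genes)
expr <- t(factors %*% loadings +
            matrix(rnorm(n_samples * n_genes, sd = 0.1), n_samples, n_genes))
rownames(expr) <- paste0("gene", seq_len(n_genes))

# Regress every gene on intercept + factors in one least-squares fit
# and return the residuals in the original genes x samples orientation.
residualize <- function(expr, factors) {
  stopifnot(ncol(expr) == nrow(factors))
  fit <- lm.fit(x = cbind(Intercept = 1, factors), y = t(expr))
  res <- t(fit$residuals)
  dimnames(res) <- dimnames(expr)
  res
}

expr_resid <- residualize(expr, factors)
```

By construction the residuals are uncorrelated with every factor, which is the property a downstream QTL model relies on when the factors are meant to absorb hidden confounders.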

gene_modules.R

Weighted gene co-expression network analysis and its integration with chromatin accessibility, generating Figs. S7–S8. The workflow is described in the Methods section Weighted gene co-expression network analysis and integration with chromatin accessibility.
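
For readers unfamiliar with module eigengenes, the sketch below computes one in base R as the first principal component of the standardized expression of a module's genes. It is a conceptual illustration on simulated data with a hypothetical function module_eigengene; it does not call the WGCNA package and is not taken from gene_modules.R.

```r
# Hypothetical helper: eigengene of one module.
# expr is a genes x samples matrix; genes selects the module's rows.
module_eigengene <- function(expr, genes) {
  # samples x genes, each gene centered and scaled
  x <- scale(t(expr[genes, , drop = FALSE]))
  # first left singular vector = first principal component (unit length)
  pc1 <- svd(x, nu = 1, nv = 0)$u[, 1]
  # orient the eigengene so it tracks the module's average expression
  if (cor(pc1, rowMeans(x)) < 0) pc1 <- -pc1
  setNames(pc1, colnames(expr))
}

# Simulated module: 10 genes sharing one signal across 40 samples
set.seed(1)
signal <- rnorm(40)
expr <- t(sapply(1:10, function(i) signal + rnorm(40, sd = 0.3)))
rownames(expr) <- paste0("g", 1:10)

me <- module_eigengene(expr, rownames(expr))
```

Correlating such eigengenes with each other, or with traits, is the usual basis for the eigengene networks summarized in Fig. S8.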

qPCR_validation.R

Analysis of the RT-qPCR validation experiment, generating Fig. S10. Procedures are detailed in the Methods section RT-qPCR validation experiment.

helper_functions.R

Utility functions sourced by the analysis scripts, including routines for gene set enrichment analysis (GSEA) and for differential analysis (covariate exploration including BIC).

About

Code for Molecular Profiling manuscript
