Bayesian framework for identifying expression-driven genetic dependencies across cancer cell lineages. BEACON integrates JAGS-based MCMC inference with pan-cancer expression and CRISPR dependency data to reveal precision oncology targets.

Huang-lab/BEACON

BEACON

Overview

BEACON is a computational framework for Bayesian analysis of expression and gene dependency data across cell lineages. The model treats expression as the dependent variable and gene dependency as the independent variable.

The framework uses JAGS (Just Another Gibbs Sampler) for Bayesian inference and MCMC (Markov Chain Monte Carlo) simulations to estimate parameters of interest, including the correlation between gene expression and dependency.
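
To make the idea of a posterior estimate for the expression-dependency relationship concrete, here is a minimal sketch in base R. It is not BEACON's JAGS model: it uses simulated data, a flat prior, and direct Monte Carlo draws from the closed-form posterior of a simple linear regression, and it summarizes the slope rather than a correlation. All variable names and values are assumptions for illustration.

```r
# Illustrative only: simulated data, not DepMap data, and not BEACON's JAGS model.
set.seed(1)
n <- 200
dependency <- rnorm(n)                                   # independent variable
expression <- 1 + 0.8 * dependency + rnorm(n, sd = 0.5)  # dependent variable

X <- cbind(1, dependency)
XtX_inv <- solve(crossprod(X))
beta_hat <- drop(XtX_inv %*% crossprod(X, expression))   # least-squares estimate
resid <- expression - drop(X %*% beta_hat)
s2 <- sum(resid^2) / (n - 2)                             # residual variance

# Draw from the posterior under the flat prior p(beta, sigma^2) proportional
# to 1 / sigma^2:
#   sigma^2 | y       ~ scaled inverse chi-square(n - 2, s2)
#   beta | sigma^2, y ~ Normal(beta_hat, sigma^2 * (X'X)^-1)
n_draws <- 2000
sigma2 <- (n - 2) * s2 / rchisq(n_draws, df = n - 2)
L <- t(chol(XtX_inv))                                    # L %*% t(L) equals XtX_inv
z <- matrix(rnorm(2 * n_draws), nrow = 2)
beta_draws <- beta_hat + (L %*% z) * rep(sqrt(sigma2), each = 2)

# Posterior mean and 95% credible interval of the slope.
slope <- beta_draws[2, ]
posterior_summary <- c(mean = mean(slope), quantile(slope, c(0.025, 0.975)))
print(posterior_summary)
```

A sampler such as JAGS becomes necessary once the model has no closed-form posterior, which is why the real scripts rely on MCMC rather than direct draws like these.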

Citation

Elmas A, Layden HM, Ellis JD, Bartlett LN, Zhao X, Kawabata-Iwakawa R, Obinata H, Hiebert SW, Huang KL. Expression-Driven Genetic Dependency Reveals Targets for Precision Medicine. bioRxiv [Preprint]. 2024 Oct 21:2024.10.17.618926. doi: 10.1101/2024.10.17.618926. PMID: 39484404; PMCID: PMC11527036.

Features

  • Flexible Data Input: Supports multiple data types such as mRNA, Protein, and RNA transcripts.
  • Customizable Parameters: Users can adjust parameters like number of iterations, adaptation steps, and lineages of interest.
  • Reproducibility: Setting random seeds ("set.seed" in R and ".RNG.seed" in JAGS) lets the code reproduce earlier results or calculate false discovery rates (FDR) based on user inputs.
  • Detailed Output: Generates detailed output files with Bayesian analysis results for each lineage.
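
The two seed settings named above act on different random number generators: "set.seed" fixes R's generator, while ".RNG.seed" is passed to JAGS with each chain's initial values. The sketch below shows both in base R. The helper `make_inits`, the seed values, and the number of chains are assumptions for illustration, not the values used in the BEACON scripts.

```r
# Fixing R's own generator makes repeated draws identical.
set.seed(123)
first <- rnorm(3)
set.seed(123)
second <- rnorm(3)
stopifnot(identical(first, second))

# Per-chain initial values in the list-of-lists form that
# rjags::jags.model(inits = ...) accepts. Hypothetical helper; seeds are arbitrary.
make_inits <- function(chain_seeds) {
  lapply(chain_seeds, function(s) {
    list(.RNG.name = "base::Mersenne-Twister", .RNG.seed = s)
  })
}

inits <- make_inits(c(101, 102))  # two chains, one seed each
str(inits)
```

Giving each chain its own seed keeps the chains distinct while still making the whole run repeatable.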

Installation

  1. Clone the repository:

  2. Install the necessary R packages:

    install.packages(c("openxlsx", "rjags"))
  3. Install JAGS:

  4. Download the required files and set up the folders and file names (with suffixes indicating the data release, e.g., "22Q2"):

    BEACON-main/
    ├── LineageMCMC.R
    ├── PanLineageMCMC.R
    ├── DepMap_data/
    │   ├── sample_info_22Q2.csv  
    │   ├── CCLE_expression_22Q2.csv.gz
    │   └── CRISPR_gene_effect_22Q2.csv.gz
    ├── QuantProtCCLE_Nusinow_Cell2020/  
    │   └── mmc2.xlsx  
    └── out/
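
Before starting a long run, it can help to confirm that the layout above is in place. The following is a small sketch in base R; `check_beacon_layout` is a hypothetical helper (not part of BEACON) that builds the expected paths from the tree above and a release suffix, then reports which ones exist.

```r
# Hypothetical helper: list the paths expected under `root` for a given
# release suffix and report which ones are present.
check_beacon_layout <- function(root = ".", release = "22Q2") {
  expected <- c(
    "LineageMCMC.R",
    "PanLineageMCMC.R",
    file.path("DepMap_data", paste0("sample_info_", release, ".csv")),
    file.path("DepMap_data", paste0("CCLE_expression_", release, ".csv.gz")),
    file.path("DepMap_data", paste0("CRISPR_gene_effect_", release, ".csv.gz")),
    file.path("QuantProtCCLE_Nusinow_Cell2020", "mmc2.xlsx"),
    "out"
  )
  data.frame(path = expected,
             exists = file.exists(file.path(root, expected)),
             stringsAsFactors = FALSE)
}

status <- check_beacon_layout(".", "22Q2")
print(status[!status$exists, , drop = FALSE])  # anything listed here is missing
```

Run it from the BEACON-main/ directory, or pass that directory as `root`.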
    

Computational Requirements

Runtime: Calculating panlineage mRNA correlations for 12619 genes takes approximately 50.7 hours (about 14.4 seconds per gene) on an 8-core processor with 32 GB of memory (OS: x86_64-pc-linux-gnu, 64-bit). A single-lineage run takes 9.3 hours on average.
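
The per-gene figure can be used for a rough estimate of how long a smaller gene set would take. This sketch assumes that runtime scales linearly with the number of genes and that the hardware is comparable; `estimate_hours` is a hypothetical helper. With the rounded 14.4 seconds per gene, the full run comes out at about 50.5 hours, close to the reported 50.7.

```r
# Back-of-envelope runtime estimate from the per-gene figure quoted above.
# Assumes linear scaling in the number of genes.
estimate_hours <- function(n_genes, seconds_per_gene = 14.4) {
  n_genes * seconds_per_gene / 3600
}

estimate_hours(12619)  # full panlineage mRNA run
estimate_hours(500)    # a hypothetical 500-gene subset
```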

System Requirements:

  • R version 4.2.0 (2022-04-22) or later
  • See requirements.txt for complete package versions and dependencies

Usage

To run the analysis, modify the R scripts according to your data and parameters. The primary script performs the following steps:

  1. Data Preparation:

    • Load mRNA/protein expression and CRISPR dependency data.
    • Compress the expression and the dependency data (gzip CCLE_expression.csv; gzip CRISPR_gene_effect.csv).
    • Map and filter data based on lineage and gene selection.
  2. Model Initialization:

    • Initialize the Bayesian model with uninformative priors.
  3. Run MCMC:

    • Perform MCMC simulations for each lineage.
    • Save results to an output directory.
  4. Reproducibility:

    • Optionally reproduce previous results by loading existing data and recalculating FDR.
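
The "map and filter" part of the data preparation step can be sketched with toy data in base R. The tables, column names (`DepMap_ID`, `lineage`), cell line IDs, and the helper `filter_lineage` are assumptions for illustration; the real scripts may organize the data differently.

```r
# Toy stand-ins for the DepMap tables; names and values are made up.
sample_info <- data.frame(
  DepMap_ID = c("ACH-1", "ACH-2", "ACH-3", "ACH-4", "ACH-5"),
  lineage   = c("SOFT.TISSUE", "SOFT.TISSUE", "LUNG", "SOFT.TISSUE", "LUNG"),
  stringsAsFactors = FALSE
)
expression <- matrix(c(2.1, 3.4, 0.5, 4.0, 1.2,
                       5.0, 4.1, 6.3, 3.9, 5.5), ncol = 2,
                     dimnames = list(sample_info$DepMap_ID, c("GENE1", "GENE2")))
dependency <- matrix(c(-0.2, -0.9, 0.1, -1.1,
                       -0.5, -0.4, -0.6, -0.3), ncol = 2,
                     dimnames = list(c("ACH-1", "ACH-2", "ACH-3", "ACH-4"),
                                     c("GENE1", "GENE2")))

# Keep the cell lines of one lineage that appear in both tables,
# and the genes the two tables share.
filter_lineage <- function(expression, dependency, sample_info, lineage) {
  ids <- sample_info$DepMap_ID[sample_info$lineage == lineage]
  ids <- Reduce(intersect, list(ids, rownames(expression), rownames(dependency)))
  genes <- intersect(colnames(expression), colnames(dependency))
  list(expression = expression[ids, genes, drop = FALSE],
       dependency = dependency[ids, genes, drop = FALSE])
}

prepared <- filter_lineage(expression, dependency, sample_info, "SOFT.TISSUE")
rownames(prepared$expression)
```

Aligning both matrices to the same rows and columns means each gene's expression and dependency vectors can be paired cell line by cell line before they are handed to the model.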

Example

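# Example of running the analysis

# MCMC settings
n.adapt = 200             # adaptation steps
n.update = 200            # update steps run before sampling (presumably burn-in)
n.iter = 1000             # sampling iterations
reproduce.results = TRUE  # reproduce previous results and recalculate FDR

# Load and prepare mRNA data
data = 'mRNA'
cell.type = 'All'
panel = ''

# Run the Bayesian analysis for a specific lineage
lineage = 'SOFT.TISSUE'
# Modify other parameters as necessary
# ...

# Run the analysis
source('LineageMCMC.R')     # per-lineage analysis
source('PanLineageMCMC.R')  # panlineage analysis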

Output

The analysis generates the following output files:

  • Table.<data>.dependency.Bayesian.lineage.<lineage>.<panel>.xlsx: Summary of Bayesian analysis for each lineage.
  • Table.<data>.dependency.Bayesian.pancancer.<panel>.xlsx: Summary of Bayesian panlineage analysis.
  • Log files and intermediate results saved in the specified output directory.
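
Once a run has finished, the per-lineage tables can be located by their file name pattern. This is a base R sketch; `find_lineage_tables` is a hypothetical helper, and the output directory name "out" is taken from the folder layout in the Installation section.

```r
# List per-lineage result tables in an output directory by file name pattern.
# Hypothetical helper; `data` is the data type used in the run (e.g. "mRNA").
find_lineage_tables <- function(out_dir = "out", data = "mRNA") {
  pattern <- paste0("^Table\\.", data,
                    "\\.dependency\\.Bayesian\\.lineage\\..*\\.xlsx$")
  list.files(out_dir, pattern = pattern)
}

find_lineage_tables("out", "mRNA")
```

Each file found this way can then be read with `openxlsx::read.xlsx`, since openxlsx is already among the required packages.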
