cheeseman-lab/nebo-analysis

Brieflow Analysis Template

Template repository for processing optical pooled screen data with Brieflow.

Set Up

This repository is designed to work with Brieflow to analyze optical pooled screens. Follow these steps to get set up for a screen analysis!

1. Screen Analysis Repository Setup

Brieflow-analysis is a template for each screen analysis. Create a new repository for each screen to get started.

  1. Create a new screen repository with the "Use this template" button.

  2. Clone the newly created repository to your local machine:
git clone https://github.com/your-username/your-screen-repository.git
cd your-screen-repository
  3. Optional: Add the brieflow-analysis template as an upstream reference in the screen repository:
git remote add template https://github.com/cheeseman-lab/brieflow-analysis

See the GitHub documentation for using a template for more information.
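
The optional `template` remote is what lets you see later template changes from inside a screen repository. The sketch below uses two throwaway local repositories as stand-ins (so it runs offline); the directory names and the demo commit are hypothetical, and only the `git remote add template ...` step comes from the instructions above.

```shell
work=$(mktemp -d)

# Stand-in for the brieflow-analysis template (local, so no network is needed)
git init -q "$work/template"
git -C "$work/template" -c user.name=demo -c user.email=demo@example.com \
  commit -q --allow-empty -m "template commit"

# Stand-in for your screen repository
git init -q "$work/screen"
git -C "$work/screen" remote add template "$work/template"

# Download the template's branches without changing your own files
git -C "$work/screen" fetch -q template

remotes=$(git -C "$work/screen" remote)
template_branches=$(git -C "$work/screen" branch -r)
echo "remotes: $remotes"
echo "template branches: $template_branches"
```

In a real screen repository, `git fetch template` followed by `git branch -r` shows which template branches are available to compare against or merge from. Because a repository created from a template does not share commit history with it, a merge will likely need `--allow-unrelated-histories`.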

2. Brieflow Setup

We use Brieflow to process data on a very large scale from each screen. Note: We use Brieflow as a git submodule in this repository. Please see the Git Submodules basic explanation for information on how to best install, use, and update this submodule.

  1. Clone the Brieflow package into this repo using the following git submodule commands:
# init git submodule
git submodule init
# clone brieflow
git submodule update
  2. Set up Brieflow following the setup instructions. To set up the Brieflow Conda environment (~20 min):
# enter brieflow directory
cd brieflow/
# create brieflow_main_env conda environment
conda env create --file=brieflow_main_env.yml
# activate brieflow_main_env conda environment
conda activate brieflow_main_env
# set conda installation to use strict channel priorities
conda config --set channel_priority strict

Note: We recommend making a custom Brieflow environment if you need other packages for Brieflow modifications. Simply change the name of the brieflow_main_env Conda environment and track your added packages in brieflow/brieflow_main_env.yml.

We use the HPC integration for Slurm as detailed in the setup instructions. To use the Slurm integration for Brieflow, configure the Slurm resources in analysis/slurm/config.yaml.

  3. Optional: Track changes to computational processing in a new branch on your fork. Contribute these changes to the main version of Brieflow with a PR as described in the Brieflow contribution notes.
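
Before creating the Conda environment, it can help to confirm that the submodule was actually cloned. `git submodule status` prefixes each line with `-` when a submodule is not initialized, `+` when the checked-out commit differs from the one recorded in the parent repository, and `U` when it has merge conflicts. The helper below is a minimal sketch of reading that prefix; `submodule_state` is a hypothetical name and the hashes are placeholders, not real Brieflow commits.

```shell
# Classify one line of `git submodule status` output by its first character.
submodule_state() {
  case "$1" in
    -*) echo "uninitialized" ;;   # fix with: git submodule update --init
    +*) echo "out-of-sync" ;;     # checked-out commit differs from the recorded one
    U*) echo "conflict" ;;        # submodule has merge conflicts
    *)  echo "ok" ;;
  esac
}

# Placeholder sample lines in the format git prints
submodule_state "-0123456789abcdef0123456789abcdef01234567 brieflow"
submodule_state " 0123456789abcdef0123456789abcdef01234567 brieflow (heads/main)"
```

In the repository itself, the same check could be run as `git submodule status | while IFS= read -r line; do submodule_state "$line"; done`.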

3. Start Analysis

analysis/ contains the notebooks used to configure processes and the Slurm scripts used to run full modules. By default, results are output to analysis/analysis_root and organized by analysis module (preprocess, sbs, phenotype, etc.).
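
Since results are organized by module, a quick way to see how far an analysis has progressed is to check which module folders exist. This is a sketch only: it assumes the folder names match the module names, and the names beyond preprocess, sbs, and phenotype are inferred from the step titles in this README rather than confirmed. `report_modules` is a hypothetical helper.

```shell
# Report which module output folders exist under an analysis root.
# Folder names are assumed to match the module names.
report_modules() {
  root="$1"
  for module in preprocess sbs phenotype merge aggregate cluster; do
    if [ -d "$root/$module" ]; then
      echo "$module: present"
    else
      echo "$module: missing"
    fi
  done
}

# Demo against a throwaway directory rather than a real analysis_root
demo_root=$(mktemp -d)
mkdir "$demo_root/preprocess" "$demo_root/sbs"
report_modules "$demo_root"
```

From inside `analysis/`, the equivalent call would be `report_modules analysis_root`.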

Follow the full instructions below to run an analysis.

Analysis Steps

Follow the instructions below to configure parameters and run modules. All of these steps are done in the example analysis. Use the following commands to enter this folder and activate the conda env:

# enter analysis directory
cd analysis/
# activate brieflow_main_env conda environment
conda activate brieflow_main_env

Notes:

  • Use the brieflow_main_env Conda environment for each configuration notebook.
  • How you use Brieflow should depend on your workload.
    • Runs that fit on local compute can use the .sh scripts, which are set up to run all rules for a module. These scripts currently do a dry run via the -n parameter, which must be removed for a real local run.
    • Runs that need HPC compute should use the _slurm.sh scripts. These currently log run information and break the larger steps (preprocessing, sbs, phenotype) into plate-level runs. The local .sh scripts can still be used for a dry-run preview with -n (already set up).

Step 0: Configure preprocess parameters

Follow the steps in 0.configure_preprocess_params.ipynb to configure preprocess params.

Note: This step determines where ND2 data is loaded from (can be from anywhere) and where intermediate/output data is saved (can also be anywhere). By default, results are output to analysis/analysis_root.
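
Because the ND2 data can live anywhere, it is worth confirming that the input location actually contains ND2 files before working through the notebook. A minimal sketch, using a throwaway directory in place of your real data path; `count_nd2` is a hypothetical helper and the file names are made up.

```shell
# Count .nd2 files anywhere below a directory.
count_nd2() {
  find "$1" -type f -name '*.nd2' | wc -l | tr -d ' '
}

# Demo with a throwaway directory; point this at your real data location instead
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/plate_1"
touch "$demo_dir/plate_1/a.nd2" "$demo_dir/plate_1/b.nd2" "$demo_dir/notes.txt"
echo "ND2 files found: $(count_nd2 "$demo_dir")"
```

A count of zero usually means the path is wrong or the files use a different extension or case.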

Step 1: Run preprocessing module

Local:

sh 1.run_preprocessing.sh

Slurm:

Change NUM_PLATES in 1.run_preprocessing_slurm.sh to the number of plates you are processing (to process each plate separately).

# start a tmux session: 
tmux new-session -s preprocessing
# in the tmux session:
bash 1.run_preprocessing_slurm.sh
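
Setting NUM_PLATES can be done by hand in an editor, or scripted. The sketch below assumes the Slurm script contains a line of the form `NUM_PLATES=<number>` at the start of a line; that layout is an assumption, so check the script first. It runs against a stand-in file, and `set_num_plates` is a hypothetical helper.

```shell
# Rewrite the NUM_PLATES assignment in a script, keeping a .bak copy of the original.
set_num_plates() {
  script="$1"
  plates="$2"
  cp "$script" "$script.bak"
  sed "s/^NUM_PLATES=.*/NUM_PLATES=$plates/" "$script.bak" > "$script"
}

# Demo on a stand-in file, since the real script's layout is not shown here
demo_script="$(mktemp -d)/demo_slurm.sh"
printf '#!/bin/bash\nNUM_PLATES=1\necho "plates: $NUM_PLATES"\n' > "$demo_script"
set_num_plates "$demo_script" 4
grep '^NUM_PLATES=' "$demo_script"
```

Writing through a `.bak` copy avoids `sed -i`, whose syntax differs between GNU and BSD systems.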

Note: For testing purposes, users may have generated only SBS or only phenotype images. It is possible to test only SBS or only phenotype preprocessing in this notebook. See notebook instructions for more details.

Step 2: Configure SBS parameters

Follow the steps in 2.configure_sbs_params.ipynb to configure SBS module parameters.

Step 3: Configure phenotype parameters

Follow the steps in 3.configure_phenotype_params.ipynb to configure phenotype module parameters.

Step 4: Run SBS/phenotype modules

Local:

sh 4.run_sbs_phenotype.sh

Slurm:

Change NUM_PLATES in 4a.run_sbs_slurm.sh and 4b.run_phenotype_slurm.sh to the number of plates you are processing (to process each plate separately). These two modules can be run simultaneously or separately.

# start a tmux session: 
tmux new-session -s sbs_phenotype
# in the tmux session:
bash 4a.run_sbs_slurm.sh
bash 4b.run_phenotype_slurm.sh
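
Typed one after the other in a single session, the two commands above run sequentially. To run them simultaneously, you can either open one tmux session per module or start both in the background and wait for them. The second option is sketched below with stand-in scripts; `run_both` is a hypothetical helper, and it assumes each script blocks until its module finishes.

```shell
# Run two scripts concurrently and report failure if either one fails.
run_both() {
  bash "$1" & pid_a=$!
  bash "$2" & pid_b=$!
  status=0
  wait "$pid_a" || status=1
  wait "$pid_b" || status=1
  return "$status"
}

# Demo with stand-in scripts; substitute 4a.run_sbs_slurm.sh and 4b.run_phenotype_slurm.sh
demo=$(mktemp -d)
echo 'echo "sbs done"' > "$demo/a.sh"
echo 'echo "phenotype done"' > "$demo/b.sh"
run_both "$demo/a.sh" "$demo/b.sh" && echo "both finished"
```

Waiting on each process ID separately means a failure in one module is not masked by the other succeeding.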

Step 5: Configure merge process params

Follow the steps in 5.configure_merge_params.ipynb to configure merge process params.

Step 6: Run merge process

Local:

sh 6.run_merge.sh

Slurm:

# start a tmux session: 
tmux new-session -s merge
# in the tmux session:
bash 6.run_merge_slurm.sh

Step 7: Configure aggregate process params

Follow the steps in 7.configure_aggregate_params.ipynb to configure aggregate process params.

Step 8: Run aggregate process

Local:

sh 8.run_aggregate.sh

Slurm:

# start a tmux session: 
tmux new-session -s aggregate
# in the tmux session:
bash 8.run_aggregate_slurm.sh

Step 9: Configure cluster process params

Follow the steps in 9.configure_cluster_params.ipynb to configure cluster process params.

Step 10: Run cluster process

Local:

bash 10.run_cluster.sh

Slurm:

# start a tmux session: 
tmux new-session -s cluster
# in the tmux session:
bash 10.run_cluster_slurm.sh

Note: Many users will want to run only SBS or only phenotype processing. It is possible to restrict the SBS/phenotype processing in either of the following ways:

  1. If either of the sample dataframes defined in 0.configure_preprocess_params.ipynb is empty, no samples of that type will be processed. See the notebook for more details.
  2. By varying the tags in the 4.run_sbs_phenotype sh files (--until all_sbs or --until all_phenotype), you can run only the analysis of interest.
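
The second option can be sketched as a small selector that yields the right `--until` tag for the module you want. Only the two tags themselves come from the note above; the `until_flags` helper and its `sbs`, `phenotype`, and `both` arguments are hypothetical.

```shell
# Map a module choice to the --until tag named in the note above.
until_flags() {
  case "$1" in
    sbs)       echo "--until all_sbs" ;;
    phenotype) echo "--until all_phenotype" ;;
    both)      echo "" ;;   # no restriction: run everything
    *)         echo "unknown module: $1" >&2; return 1 ;;
  esac
}

echo "SBS only:       $(until_flags sbs)"
echo "phenotype only: $(until_flags phenotype)"
```

The returned text would be appended to the run command inside the relevant 4.run_sbs_phenotype script; how that command is laid out is not shown in this README, so check the script before editing it.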

Generate Rulegraph

Run the following script to generate a rulegraph of Brieflow:

sh generate_rulegraph.sh

Contributing

  • Core improvements should be contributed back to Brieflow
  • If you have analyzed any of your optical pooled screening data using brieflow-analysis, please reach out and we will include you in the table below!

Examples of brieflow-analysis usage:

Study | Description | Analysis Repository | Publication
Coming soon | | |
