This workflow is designed to align, perform basic QC and call peaks for peak-based methods such as ChIP-seq, CUT&RUN and ATAC-seq.
-
git clone <repo> <new_directory_name> -
Put the 'fastq.gz' files in the
raw_data/subdirectory. You may use symlinks. -
Look over the config file to ensure that the correct indexes and other species-specific settings are used.
-
Set up the sample sheet (samples.tsv). You may find it helpful to run
./make_samples_template.shfrom inside thebin/subdirectory to get a template file based on the files inraw_data/. The columns are:i.
sample-- Name of the sample. Fastqs will be renamed to this. You may use the same 'sample' name in multiple rows in this file to represent fastqs that should becattogether.ii.
control-- 'sample' name for the control to be used, for example, as control for MACS2. If not applicable, use 'NA'. Leaving it blank should also be fine.iii.
fq1-- R1 fileiv.
fq2-- R2 file; fill in with NA if SE data.v.
sample_group-- Grouping variable for replicates.vi.
enriched_factor-- Samples with the sameenriched_factorwill be normalized together with CSAW, if alternative normalization is requested in the config file. -
From the root directory of the project, run
sbatch bin/run_snakemake.sh.