A ChIP-seq peak calling and downstream analysis workflow implemented in Nextflow. It combines peak detection, annotation, motif discovery, and functional enrichment with RNA-seq differential expression results to relate transcription factor binding to changes in gene expression.
- Adapter trimming with Trimmomatic
- Quality control with FastQC and MultiQC
- Read alignment using Bowtie2
- Peak calling with MACS3
- Blacklist filtering using the ENCODE blacklist regions
- Peak annotation and motif discovery with HOMER
- Signal quantification and visualization using deepTools
- Differential expression analysis integration using a custom Python script
- Nextflow
- Installed tools and reference data (genome FASTA, annotation files, adapters, etc.)
- Clone the repo
git clone https://github.com/rboz1/chipseq_nextflow.git
- Create and activate conda environment
conda env create -f base_env.yml
conda activate nextflow_base
- Run pipeline
nextflow run main.nf -profile conda,cluster
Note: if your cluster scheduler is not SGE, update the cluster settings in nextflow.config before running.
- Enrichment analysis:
  Protein-coding genes associated with high-scoring peaks (score > 1000) were submitted to EnrichR for functional enrichment. The analysis focused on:
  - GO Biological Processes: revealed significant enrichment in pathways related to positive regulation of signal transduction and the Notch signaling pathway, a critical developmental signaling cascade often dysregulated in cancer.
  - ChEA 2022 Transcription Factor Targets: identified transcription factors potentially regulating these genes, with RUNX1 prominently enriched.
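The gene selection described above (protein-coding genes with a peak score above 1000) could be reproduced along the following lines. This is a minimal sketch, not the pipeline's own code: the column names (`Peak Score`, `Gene Name`, `Gene Type`) are assumed to match HOMER's annotatePeaks.pl output, the function name `high_scoring_genes` is hypothetical, and the table is toy data rather than real results.

```python
import csv
import io

# Toy stand-in for an annotated peak table (tab-separated).
# Assumption: columns are named as in HOMER annotatePeaks.pl output.
ANNOTATED_PEAKS = (
    "PeakID\tPeak Score\tGene Name\tGene Type\n"
    "peak_1\t1500\tGENE_A\tprotein-coding\n"
    "peak_2\t800\tGENE_B\tprotein-coding\n"
    "peak_3\t2200\tGENE_C\tncRNA\n"
    "peak_4\t1200\tGENE_A\tprotein-coding\n"
    "peak_5\t1001\tGENE_D\tprotein-coding\n"
)


def high_scoring_genes(handle, min_score=1000.0, gene_type="protein-coding"):
    """Return sorted unique gene names having at least one peak scoring above min_score."""
    genes = set()
    for row in csv.DictReader(handle, delimiter="\t"):
        # Keep a gene only if the peak passes the score cutoff and the gene type matches.
        if float(row["Peak Score"]) > min_score and row["Gene Type"] == gene_type:
            genes.add(row["Gene Name"])
    return sorted(genes)


# One gene symbol per line is the format EnrichR accepts as pasted input.
gene_list = high_scoring_genes(io.StringIO(ANNOTATED_PEAKS))
print("\n".join(gene_list))
```

To use it on a real annotation file, replace the `io.StringIO(...)` handle with `open(path, newline="")` and adjust the column names to match the file's header.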
- Motif discovery results:
  Motif discovery identified sequence motifs and candidate transcription factors in peak regions. RUNX1 motifs were frequently observed, which is consistent with the enrichment analysis results and reinforces the biological relevance of RUNX1.
- Integration with RNA-seq data:
  By combining RNA-seq differential expression data with RUNX1 ChIP-seq peak annotations, we analyzed the proximity of RUNX1 binding sites to the transcription start sites (TSS) of upregulated and downregulated genes.
  - A larger proportion of upregulated genes than downregulated genes show RUNX1 binding within ±5 kb and ±20 kb of their TSS.
  - This suggests that RUNX1 binding near the TSS correlates with gene activation, supporting its role as a key regulator in this context.
  - The data were visualized as a stacked bar plot showing the percentage of genes bound or not bound by RUNX1 near their TSS in each gene set.
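The TSS proximity comparison can be sketched as follows. This is an illustration under stated assumptions, not the repository's script: the function name `percent_bound`, the gene names, and the distances are all hypothetical toy values, and a gene is counted as bound if any peak assigned to it lies within the window on either side of the TSS.

```python
def percent_bound(genes, peak_distances, window):
    """Percentage of genes with at least one peak within +/- window bp of the TSS.

    peak_distances maps gene name -> list of signed peak-to-TSS distances in bp.
    """
    if not genes:
        return 0.0
    bound = sum(
        1
        for gene in genes
        if any(abs(d) <= window for d in peak_distances.get(gene, []))
    )
    return 100.0 * bound / len(genes)


# Toy inputs (assumed values for illustration only, not results from the pipeline).
peak_distances = {
    "GENE_A": [-1200, 30000],
    "GENE_B": [15000],
    "GENE_C": [-45000],
    "GENE_E": [60000],
}
gene_sets = {
    "up": ["GENE_A", "GENE_B", "GENE_C"],
    "down": ["GENE_D", "GENE_E"],
}

# Compute the bound percentage for each gene set at the two windows used in the text.
summary = {}
for label, genes in gene_sets.items():
    for window in (5_000, 20_000):
        summary[(label, window)] = percent_bound(genes, peak_distances, window)

for (label, window), pct in sorted(summary.items()):
    print(f"{label:>4} genes, +/-{window // 1000} kb: {pct:.1f}% bound, {100 - pct:.1f}% not bound")
```

The bound and not-bound percentages for each gene set and window are the two segments of each bar in a stacked bar plot.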
- Integrating ChIP-seq and RNA-seq data specifically for Notch pathway genes to better understand the regulatory impact of RUNX1 binding.
- Expanding analysis to assess co-binding patterns between RUNX1 and other transcription factors to uncover potential regulatory complexes.
Rachel - rbozadjian@gmail.com