rboz1/chipseq_nextflow

Multi-omic Analysis of Gene Regulation in Breast Cancer

A comprehensive ChIP-seq peak calling and downstream analysis workflow implemented in Nextflow, with differential expression analysis integration.

Table of Contents
  1. About The Project
  2. Getting Started
  3. Results
  4. Contact

About The Project

A comprehensive ChIP-seq peak calling and downstream analysis workflow implemented in Nextflow, integrating peak detection, annotation, motif discovery, and functional enrichment with RNA-seq differential expression analysis for biological insights.

Key Features

  • Adapter trimming with Trimmomatic
  • Quality control with FastQC and MultiQC
  • Read alignment using Bowtie2
  • Peak calling with MACS3
  • Blacklist filtering using ENCODE blacklist regions
  • Peak annotation and motif discovery with HOMER
  • Signal quantification and visualization using deepTools
  • Integration of differential expression analysis using a custom Python script
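
To illustrate the blacklist-filtering step listed above, here is a minimal Python sketch that drops peaks overlapping any blacklisted region. It is not the pipeline's own implementation, which is not shown in this README. The function names (`load_intervals`, `overlaps`, `filter_blacklisted`) are hypothetical, and the sketch assumes both inputs are BED-like lines with chromosome, start, and end in the first three columns.

```python
from collections import defaultdict


def load_intervals(lines):
    """Parse BED-like lines into {chrom: [(start, end), ...]}."""
    intervals = defaultdict(list)
    for line in lines:
        # Skip blank lines and common BED header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        intervals[chrom].append((int(start), int(end)))
    return intervals


def overlaps(start, end, regions):
    """True if half-open interval [start, end) overlaps any region."""
    return any(start < r_end and r_start < end for r_start, r_end in regions)


def filter_blacklisted(peak_lines, blacklist_lines):
    """Return the peak lines that do not overlap a blacklisted region."""
    blacklist = load_intervals(blacklist_lines)
    kept = []
    for line in peak_lines:
        if not line.strip():
            continue
        fields = line.split()
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        if not overlaps(start, end, blacklist.get(chrom, [])):
            kept.append(line.rstrip("\n"))
    return kept
```

The linear scan over blacklist regions is simple and adequate for a few hundred regions per chromosome; for larger interval sets a dedicated interval tool would be a better fit.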

Getting Started

Prerequisites

  • Nextflow
  • Installed tools and reference data (genome FASTA, annotation files, adapters, etc.)

Installation

  1. Clone the repo
    git clone https://github.com/rboz1/chipseq_nextflow.git
  2. Create and activate the conda environment
    conda env create -f base_env.yml
    conda activate nextflow_base
    
  3. Run pipeline
    nextflow run main.nf -profile conda,cluster
    
    Note: the cluster profile assumes an SGE scheduler. If your cluster uses a different scheduler, update the cluster settings in nextflow.config before running.
    



Results

  • Enrichment analysis:
    Protein-coding genes associated with high-scoring peaks (peak score > 1000) were submitted to EnrichR for functional enrichment. The analysis focused on:

    • GO Biological Processes: Revealed significant enrichment in pathways related to positive regulation of signal transduction and the Notch signaling pathway, a critical developmental signaling cascade often dysregulated in cancer.
    • ChEA 2022 Transcription Factor Targets: Identified key transcription factors potentially regulating these genes, with RUNX1 prominently enriched.
    [Figure: EnrichR bar chart]
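
The gene selection described for the enrichment analysis can be sketched in Python. This is a minimal sketch under stated assumptions, not the repository's own script: it assumes the peaks were annotated with HOMER's `annotatePeaks.pl`, that the output has `Peak Score`, `Gene Name`, and `Gene Type` columns, and that protein-coding genes are labeled `protein-coding`. The function names and the default threshold argument are hypothetical.

```python
import csv


def read_annotation(path):
    """Read a tab-delimited annotation table into a list of row dicts."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def high_scoring_protein_coding_genes(rows, min_score=1000.0):
    """Return sorted, de-duplicated gene names for peaks scoring above min_score.

    Assumed column names: "Peak Score", "Gene Name", "Gene Type".
    """
    genes = set()
    for row in rows:
        try:
            score = float(row["Peak Score"])
        except (KeyError, ValueError):
            # Skip rows with a missing or non-numeric score.
            continue
        if (
            score > min_score
            and row.get("Gene Type") == "protein-coding"
            and row.get("Gene Name")
        ):
            genes.add(row["Gene Name"])
    return sorted(genes)
```

The resulting list, one gene symbol per line, is the kind of input that can be pasted into EnrichR. Calling `high_scoring_protein_coding_genes(read_annotation("annotated_peaks.txt"))` would produce it, where `annotated_peaks.txt` is a placeholder path.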
  • Motif discovery results:
    Motif discovery identified sequence motifs and candidate transcription factors from peak regions. RUNX1 motifs were frequently observed, which is consistent with the enrichment analysis results and reinforces the biological relevance of RUNX1.

    [Figure: motif discovery results screenshot]
  • Integration with RNA-seq data:
    By combining RNA-seq differential expression data with RUNX1 ChIP-seq peak annotations, we analyzed the proximity of RUNX1 binding sites relative to the transcription start site (TSS) of upregulated and downregulated genes.

    • A higher proportion of upregulated genes than downregulated genes show RUNX1 binding within ±5 kb and ±20 kb of their TSS.
    • This suggests RUNX1 binding near TSS correlates with gene activation, supporting its role as a key regulator in this context.
    • The data were visualized as a stacked bar plot illustrating the percentage of genes bound or not bound by RUNX1 near their TSS in these gene sets.
    [Figure: stacked bar plot of the percentage of upregulated and downregulated genes bound or not bound by RUNX1 near their TSS]
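
The TSS-proximity comparison described above can be sketched in Python as follows. This is an illustrative sketch, not the repository's custom script. It assumes each annotated peak has already been assigned to a gene with a signed distance to that gene's TSS in base pairs (for example from a peak annotation table), and that the upregulated and downregulated gene lists come from the RNA-seq differential expression results. All function names are hypothetical.

```python
def nearest_peak_distance(annotations):
    """Map each gene to its smallest absolute peak-to-TSS distance (bp).

    `annotations` is an iterable of (gene_name, signed_distance_to_tss) pairs.
    """
    nearest = {}
    for gene, distance in annotations:
        d = abs(int(distance))
        if gene not in nearest or d < nearest[gene]:
            nearest[gene] = d
    return nearest


def percent_bound(genes, nearest, window_bp):
    """Percentage of `genes` with a peak within +/- window_bp of the TSS."""
    genes = list(genes)
    if not genes:
        return 0.0
    bound = sum(1 for g in genes if nearest.get(g, float("inf")) <= window_bp)
    return 100.0 * bound / len(genes)


def summarize(up_genes, down_genes, nearest, windows=(5_000, 20_000)):
    """Percent of up- and downregulated genes bound, for each window size."""
    return {
        window: {
            "up": percent_bound(up_genes, nearest, window),
            "down": percent_bound(down_genes, nearest, window),
        }
        for window in windows
    }
```

The per-window percentages returned by `summarize` are the bound fractions; the unbound fraction for a stacked bar plot is 100 minus each value. Genes without any assigned peak are treated as not bound.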



Next steps:

  • Integrating ChIP-seq and RNA-seq data specifically for Notch pathway genes to better understand the regulatory impact of RUNX1 binding.
  • Expanding analysis to assess co-binding patterns between RUNX1 and other transcription factors to uncover potential regulatory complexes.

Contact

Rachel - rbozadjian@gmail.com

