Pipeline for raw long-read data

This project implements a reproducible Nextflow pipeline for quality control and statistical analysis of long-read sequencing data.

The pipeline takes a FASTQ file as input, performs QC analysis, calculates per-read statistics, and generates visualizations and summary statistics.

Features

  • FastQC for general quality control
  • NanoPlot as a long-read-specific QC tool
  • Per-read calculation of:
    • GC content (%)
    • read length
    • mean read quality score
  • CSV output
  • Distribution plots
  • Summary statistics
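
The per-read metrics listed above can be sketched in plain Python. This is a minimal illustration, not the contents of scripts/read_stats.py: the function names `parse_fastq` and `read_stats` are hypothetical, the input is assumed to be a four-line-per-record FASTQ with Phred+33 quality encoding, and the mean quality is assumed to be the arithmetic mean of the per-base Phred scores (some long-read tools average error probabilities instead, which gives lower values).

```python
import io
from statistics import mean


def parse_fastq(handle):
    """Yield (read_id, sequence, quality_string) from a 4-line-per-record FASTQ."""
    while True:
        header = handle.readline().strip()
        if not header:
            break  # end of file (or a blank line) stops the iteration
        seq = handle.readline().strip()
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().strip()
        # Drop the leading '@' and keep only the ID before the first space.
        yield header[1:].split()[0], seq, qual


def read_stats(seq, qual, phred_offset=33):
    """Return (length, GC percent, mean Phred score) for a single read."""
    length = len(seq)
    gc_count = sum(base in "GCgc" for base in seq)
    gc_percent = 100.0 * gc_count / length if length else 0.0
    # Assumption: arithmetic mean of per-base Phred scores.
    scores = [ord(ch) - phred_offset for ch in qual]
    mean_quality = mean(scores) if scores else 0.0
    return length, gc_percent, mean_quality


if __name__ == "__main__":
    example = io.StringIO("@read1 demo\nGGCCAT\n+\nIIIIII\n")
    for read_id, seq, qual in parse_fastq(example):
        print(read_id, *read_stats(seq, qual))
```

Each returned tuple corresponds to one row of a per-read table, so writing the CSV amounts to passing the rows to `csv.writer`.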

Project structure

.
├── README.md
├── email_draft.md
├── environment.yml
├── files
│   └── barcode77.fastq
├── main.nf
├── nextflow.config
├── out
├── process
│   ├── QC
│   │   ├── fastqc.nf
│   │   └── nanoplot.nf
│   ├── STATS
│   │   └── read_stats.nf
│   └── VISUALIZATION
│       └── visualize.nf
├── scripts
│   ├── read_stats.py
│   └── visualize_stats.py
└── workflow
    └── QC
        └── qc_workflow.nf

10 directories, 13 files

Installation

Clone the repository:

git clone https://github.com/realcann/pipeline_long_read.git
cd pipeline_long_read


Create the Conda environment

conda env create -f environment.yml
conda activate qc_pipeline_env

Run the pipeline

nextflow run main.nf

Output

All results are written to the out/ directory.
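
The README does not itemise which summary statistics are produced. As a rough sketch of what a read-length summary for long-read data often contains, the snippet below computes count, total bases, mean, median, minimum, maximum, and N50 from a list of read lengths. The function names `n50` and `summarize`, the dictionary keys, and the choice of N50 are assumptions for illustration, not the pipeline's actual output.

```python
from statistics import mean, median


def n50(lengths):
    """Return the length L such that reads of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty input


def summarize(lengths):
    """Summarise a collection of read lengths (hypothetical field names)."""
    lengths = list(lengths)
    if not lengths:
        return {"reads": 0}
    return {
        "reads": len(lengths),
        "total_bases": sum(lengths),
        "mean_length": mean(lengths),
        "median_length": median(lengths),
        "min_length": min(lengths),
        "max_length": max(lengths),
        "n50": n50(lengths),
    }


if __name__ == "__main__":
    print(summarize([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

N50 is included because the mean alone can be misleading for long-read data, where a few very long reads may carry a large share of the bases.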
