seqwell/seqWell-ONT-plasmid-assembly

seqWell-ONT-plasmid-assembly

This Nextflow pipeline assembles plasmids from Oxford Nanopore reads. It computes per-sample quality metrics to automatically derive plasmid length estimates, runs Autocycler on each sample in parallel, and then annotates, maps, and quality-checks assembled sequences to produce final plasmid maps and coverage reports.

Pipeline Overview

The pipeline processes ONT sequencing data through the following key steps:

NANOSTAT: computes the read-length N50 for each sample, which is used as the plasmid length estimate for assembly.

AUTOCYCLER: assembles each sample using the N50-derived plasmid length; samples are processed in parallel. This process is a lightweight Nextflow DSL2 wrapper around the Autocycler tool for long-read plasmid assembly.

MINIMAP2: aligns input FASTQ reads to the assembled consensus FASTA and produces a BAM file and a position-by-position coverage depth file.

PYSAMSTATS: computes per-base read-depth and nucleotide statistics from the BAM file.

PLOT_COVERAGE: generates a coverage plot from the position-by-position depth file produced by the Minimap2 step.

PLANNOTATE: generates a GenBank (.gbk) annotation file for each assembled plasmid FASTA.

PLASMIDMAP: creates plasmid maps from the collected GenBank annotation files.
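
To make the NANOSTAT step more concrete, the sketch below computes a read-length N50 using the conventional definition: the length L such that reads of length L or longer contain at least half of all sequenced bases. It is an illustration only and is not taken from the pipeline; NanoStat's own implementation may differ in detail, and the function name `n50` and the toy read lengths are assumptions made for this example.

```python
def n50(read_lengths):
    """Return the N50 of a collection of read lengths.

    Conventional definition: walk the reads from longest to shortest and
    return the length at which the running base count first reaches half
    of the total number of bases.
    """
    lengths = sorted(read_lengths, reverse=True)
    if not lengths:
        raise ValueError("no read lengths supplied")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Toy data (hypothetical): mostly near-full-length reads plus two fragments.
toy_lengths = [5000, 5000, 4900, 5100, 800, 300]
print(n50(toy_lengths))  # 5000
```

When most bases come from near-full-length reads, as in the toy data, short fragments barely move the N50, which is one reason it can serve as a rough length estimate.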

Dependencies

This pipeline requires installation of:

  • Nextflow: Workflow management system
  • Docker: Containerization platform for running pipeline processes

Docker Containers

All Docker containers used in this pipeline are publicly available:

Process        Container
NANOSTAT       quay.io/biocontainers/nanostat:1.6.0--pyhdfd78af_0
AUTOCYCLER     seqwell/autocycler:0.5.2
MINIMAP2       quay.io/biocontainers/minimap2:2.28--he4a0461_2
PYSAMSTATS     quay.io/biocontainers/pysamstats:1.1.2--py311h384fd50_15
PLANNOTATE     seqwell/fq_assemble:v1.0
PLASMIDMAP     cautree/plasmidmap:latest
PLOT_COVERAGE  seqwell/python:v2.0

How to Run the Pipeline

Required Parameters

The pipeline requires the following parameters:

--input Path to the directory containing .fastq.gz files.

--outdir The output directory path where results will be saved.

Both --input and --outdir accept either a local absolute path or an AWS S3 URI. If either is an S3 URI, make sure your AWS security credentials are set in the nextflow.config file.
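
Before launching a run, it can help to confirm that a local input directory actually contains the .fastq.gz files the pipeline expects. The following is a minimal pre-flight sketch, not part of the pipeline. The helper names `is_s3_uri` and `list_samples` are hypothetical, and the rule that a sample name is the file name minus `.fastq.gz` is an assumption inferred from output names such as `BC_01.trimmed.gbk`.

```python
from pathlib import Path

FASTQ_SUFFIX = ".fastq.gz"


def is_s3_uri(location):
    """True if the location looks like an AWS S3 URI rather than a local path."""
    return location.startswith("s3://")


def list_samples(input_dir):
    """Map sample name -> FASTQ path for a local, absolute input directory.

    Assumption: the sample name is the file name with '.fastq.gz' removed.
    """
    directory = Path(input_dir)
    if not directory.is_absolute():
        raise ValueError(f"expected an absolute path, got {input_dir!r}")
    if not directory.is_dir():
        raise FileNotFoundError(f"input directory not found: {input_dir}")
    return {
        path.name[: -len(FASTQ_SUFFIX)]: path
        for path in sorted(directory.glob("*" + FASTQ_SUFFIX))
    }


# S3 locations cannot be listed with pathlib, so this sketch only tells them apart.
print(is_s3_uri("s3://example-bucket/run1/"))  # hypothetical bucket name
print(is_s3_uri("/data/run1/fastq"))
```

An empty dictionary from `list_samples` would indicate that no file in the directory matches the `.fastq.gz` suffix.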

Profiles

Several profiles are available and can be selected with the -profile option:

  • test: Run pipeline using test data.
  • docker: Run pipeline using Docker containers. This is the default profile, so -profile docker does not need to be specified.
  • awsbatch: Run pipeline on AWS Batch

Example Command

nextflow run main.nf \
    --input "${PWD}/path/to/fastq_dir" \
    --outdir "${PWD}/path/to/output" \
    -resume -bg
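
If runs are launched from a script rather than typed by hand, the same command can be assembled programmatically. The sketch below only builds the argument list and prints it; it uses the options shown in this README (`--input`, `--outdir`, `-profile`, `-resume`, `-bg`). The function name `build_command` and the example paths are hypothetical.

```python
import shlex


def build_command(input_path, outdir, profile=None, resume=True, background=True):
    """Return the argument list for a pipeline run, mirroring the example command."""
    cmd = ["nextflow", "run", "main.nf", "--input", input_path, "--outdir", outdir]
    if profile:
        # Optional: for example "test" or "awsbatch"; docker is already the default.
        cmd += ["-profile", profile]
    if resume:
        cmd.append("-resume")
    if background:
        cmd.append("-bg")
    return cmd


# Hypothetical paths, shown as a shell-quoted string for inspection.
print(shlex.join(build_command("/data/fastq_dir", "/data/output")))
```

The returned list can be passed to `subprocess.run` as-is, which avoids shell quoting problems with paths that contain spaces.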

Running Test Data

The pipeline can be run using test data with:

nextflow run main.nf \
    --input "${PWD}/tests/example_fastqs" \
    --outdir "${PWD}/test_output" \
    -resume -bg

Expected Outputs

├── assembled_fasta
│   ├── BC_01.trimmed_consensus_assembly.fasta    ## Assembled consensus FASTA
│   └── BC_02.trimmed_consensus_assembly.fasta
├── COVERAGE
│   ├── BC_01.trimmed_depth_plot.png              ## Coverage plot
│   └── BC_02.trimmed_depth_plot.png
├── GBK
│   ├── BC_01.trimmed.gbk                         ## GenBank annotation file
│   └── BC_02.trimmed.gbk
├── nanostat
│   ├── BC_01.trimmed_n50.csv                     ## N50 from NanoStat, used as plasmid length estimate
│   ├── BC_01.trimmed_nanostat.txt                ## Full NanoStat report
│   ├── BC_02.trimmed_n50.csv
│   └── BC_02.trimmed_nanostat.txt
├── per_base_data_out
│   ├── BC_01.trimmed.per.base.csv                ## Per-base report from pysamstats
│   └── BC_02.trimmed.per.base.csv
└── plasmidmap
    ├── BC_01.trimmed.plasmidMap.png              ## Plasmid map
    └── BC_02.trimmed.plasmidMap.png
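
After a run finishes, the tree above can be used as a checklist. The sketch below reports which expected files are missing for one sample in a local output directory. The path templates are copied from the tree; the function name `missing_outputs` and the example sample name are assumptions for illustration, and the check does not cover S3 output locations.

```python
from pathlib import Path

# Relative paths taken from the expected output tree; "{sample}" is a placeholder
# for a sample name such as "BC_01.trimmed".
EXPECTED_OUTPUTS = [
    "assembled_fasta/{sample}_consensus_assembly.fasta",
    "COVERAGE/{sample}_depth_plot.png",
    "GBK/{sample}.gbk",
    "nanostat/{sample}_n50.csv",
    "nanostat/{sample}_nanostat.txt",
    "per_base_data_out/{sample}.per.base.csv",
    "plasmidmap/{sample}.plasmidMap.png",
]


def missing_outputs(outdir, sample):
    """Return the expected relative paths that are absent for one sample."""
    root = Path(outdir)
    expected = [template.format(sample=sample) for template in EXPECTED_OUTPUTS]
    return [relative for relative in expected if not (root / relative).is_file()]


# Example with a hypothetical output directory that does not exist:
# every expected file is reported as missing.
for relative in missing_outputs("/path/that/does/not/exist", "BC_01.trimmed"):
    print("missing:", relative)
```

An empty list means every file listed in the tree was found for that sample.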

About

Assembles plasmids from Oxford Nanopore reads generated using SeqWell-384 kits
