msc_dissertationproject

Fungal WGS ONT bioinformatics pipeline

Bioinformatics Pipeline - Fungal WGS ONT Bioinformatics Pipeline

#Overview

This repository contains scripts for running a comprehensive bioinformatics pipeline for fungal whole-genome sequencing using Oxford Nanopore Technology (ONT). The pipeline performs quality control&filtering, whole genome assembly, genome annotation, phylogenetic analysis, and functional annotations.

#Python Script

pipeline.py: The main Python script that sequentially executes the entire bioinformatics pipeline that can be run using Python

pipeline.slurm: The integrated SLURM script that executes all steps of the bioinformatics pipeline that can be run in terminal

Quality Control: Perform quality control using NanoPlot.
Data Preprocessing: Preprocess raw data using Chopper.
Genome Assembly: Assemble genomes using Flye.
Repeat Identification and Masking: Identify and mask repeats using RepeatModeler and RepeatMasker.
Genome Annotation: Annotate genomes using BRAKER2.
Quality Assessment: Assess genome assembly quality using QUAST.
Genome Completeness: Evaluate genome completeness using BUSCO.
Orthologous Gene Identification: Identify orthologous genes using OrthoFinder.
Phylogenetic Analysis: Perform phylogenetic analysis using MUSCLE, trimAl, and IQ-TREE.
Functional Annotations: Conduct functional annotations using AntiSMASH and dbCAN.
Processing Publicly Available FASTA Files: Process downloaded FASTA files, including quality assessment, repeat masking, annotation, and completeness evaluation.

#Usage

Ensure you have the required software and dependencies installed. This pipeline is designed to run on a SLURM-managed cluster or using a Python environment with the necessary bioinformatics tools installed.

#Prerequisites

SLURM
Python 3.x
Conda environments for specific bioinformatics tools
Bioinformatics tools: NanoPlot, Chopper, Flye, QUAST, BUSCO, BRAKER2, RepeatModeler, RepeatMasker, OrthoFinder, MUSCLE, trimAl, IQ-TREE, AntiSMASH, dbCAN

Running the Pipeline

Set Up Directories: Ensure all necessary directories are created. The script will handle this automatically.
Prepare Input Files: Place your input files in the appropriate directories. Running the Python Script

To run the entire pipeline using Python:

$python3 pipeline.py Running the SLURM Script

To run the entire pipeline, submit the integrated SLURM script:

$sbatch scripts/pipeline.slurm This README.md file provides a clear overview of the pipeline, the steps involved, and instructions on how to run the scripts.

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
.DS_Store		.DS_Store
README.md		README.md
pipeline.py		pipeline.py
pipeline.slurm		pipeline.slurm

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

msc_dissertationproject

Running the Pipeline

About

Uh oh!

Releases

Packages

Languages

pwwongaa/msc_dissertationproject

Folders and files

Latest commit

History

Repository files navigation

msc_dissertationproject

Running the Pipeline

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages