
Quantitative analysis of genomic features with nucleotide BLAST

sofya-d/QBLAST

QBLAST: Quantitative analysis of genomic features with NCBI BLAST

What does this pipeline do?

This pipeline aligns query sequences of genomic features against input genomes with nucleotide BLAST, then produces tables and a plot showing how many times each query sequence occurs in each genome.

Output data

  • BLAST output files produced with the default filtering parameter values.
  • BED files filtered with every combination of BLAST e-value (0.01, 1e-15) and alignment length (0.75, 0.90, 0.95) parameter values.
  • Summary TSV tables, one per combination of parameter values, where rows are query sequences, columns are scaffolds/contigs, and each cell holds the number of occurrences of that query sequence in that scaffold/contig.
  • A heat map plot in SVG format where rows are input genomes, columns are input query sequences, and each cell holds the number of occurrences of that query sequence in the whole genome.
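The filtering and counting behind the BED files and summary tables can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: the `Hit` record, the function names, and the sample hits are hypothetical, and it assumes that the alignment length values (0.75, 0.90, 0.95) are minimum fractions of the query length and that the e-value values are upper bounds.

```python
from collections import Counter
from itertools import product
from typing import NamedTuple


class Hit(NamedTuple):
    """One BLAST alignment (hypothetical record layout, for illustration only)."""
    query: str
    scaffold: str
    evalue: float
    align_len: int
    query_len: int


# Parameter values listed in "Output data".
EVALUES = (0.01, 1e-15)
LENGTH_FRACTIONS = (0.75, 0.90, 0.95)


def count_hits(hits, max_evalue, min_fraction):
    """Count hits per (query, scaffold) that pass both thresholds.

    Assumption: alignment length is compared as a fraction of query length.
    """
    counts = Counter()
    for hit in hits:
        long_enough = hit.align_len / hit.query_len >= min_fraction
        if hit.evalue <= max_evalue and long_enough:
            counts[(hit.query, hit.scaffold)] += 1
    return counts


def summarize(hits):
    """Build one count table per (e-value, length fraction) combination."""
    return {
        (evalue, fraction): count_hits(hits, evalue, fraction)
        for evalue, fraction in product(EVALUES, LENGTH_FRACTIONS)
    }


# Made-up example hits, not real BLAST output.
hits = [
    Hit("featA", "scaffold_1", 1e-30, 98, 100),
    Hit("featA", "scaffold_1", 1e-5, 80, 100),
    Hit("featB", "scaffold_2", 1e-20, 91, 100),
]
tables = summarize(hits)
```

Two e-value values and three alignment length values give six tables in total. Each table maps a (query, scaffold) pair to a count, which corresponds to one cell of a summary TSV table.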

Prerequisites

  • Unix OS
    This pipeline was tested with Ubuntu 20.04.3 LTS.
  • NCBI BLAST command line tool
    This pipeline was tested with version 2.13.0+. The NCBI BLAST executables must be on the PATH.
  • Python
    This pipeline was tested with version 3.10.10.
  • Biopython Python module
    This pipeline was tested with version 1.79.
  • Matplotlib Python module
    This pipeline was tested with version 3.7.1.
  • NumPy Python module
    This pipeline was tested with version 1.24.3.
  • pandas Python module
    This pipeline was tested with version 1.5.3.
  • seaborn Python module
    This pipeline was tested with version 0.12.2.
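Before running the pipeline, the prerequisites above can be checked with a short script. This is a sketch under assumptions: `check_environment` is a hypothetical helper that is not part of the repository, and the executable names `blastn` and `makeblastdb` are assumed to be the BLAST+ programs the pipeline calls.

```python
import shutil
from importlib import metadata

# Assumed BLAST+ executables; the pipeline may call others.
REQUIRED_TOOLS = ("blastn", "makeblastdb")
# Distribution names of the Python modules listed in the prerequisites.
REQUIRED_PACKAGES = ("biopython", "matplotlib", "numpy", "pandas", "seaborn")


def check_environment(tools=REQUIRED_TOOLS, packages=REQUIRED_PACKAGES):
    """Return a dict mapping each tool to its path and each package to its version.

    A value of None means the tool is not on PATH or the package is not installed.
    """
    report = {}
    for tool in tools:
        report[tool] = shutil.which(tool)
    for package in packages:
        try:
            report[package] = metadata.version(package)
        except metadata.PackageNotFoundError:
            report[package] = None
    return report


for name, found in check_environment().items():
    print(f"{name}: {found if found is not None else 'MISSING'}")
```

Any entry printed as `MISSING` points to a prerequisite that still has to be installed or added to PATH.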

Installation

Download and extract the pipeline archive from the GitHub web interface or, if you have Git installed, run the following command in a shell:

git clone https://github.com/sofya-d/QBLAST

Input

  1. Path to the directory with genome FASTA files. Files must have the '.fasta' extension.
  2. Path to the directory where BLAST databases will be written.
  3. Path to the directory with query sequence FASTA files. Files must have the '.fasta' extension.
  4. Path to the directory with the script files of this pipeline.
  5. Path to the output directory.
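Because only files with the '.fasta' extension are accepted, it can help to check the genome and query directories before launching a run. The sketch below is illustrative only: `list_fasta` is a hypothetical helper, not a function of the pipeline, and it assumes the extension match is case-sensitive.

```python
from pathlib import Path


def list_fasta(directory):
    """Return the sorted '.fasta' files found directly inside `directory`.

    Raises an error if the directory is missing or holds no '.fasta' files.
    Assumption: the extension is matched case-sensitively.
    """
    path = Path(directory)
    if not path.is_dir():
        raise NotADirectoryError(f"not a directory: {path}")
    files = sorted(
        p for p in path.iterdir() if p.is_file() and p.suffix == ".fasta"
    )
    if not files:
        raise FileNotFoundError(f"no '.fasta' files in {path}")
    return files
```

For example, `list_fasta("./genomes")` and `list_fasta("./queries")` would confirm that the first and third arguments point to usable directories. Files named with other extensions such as '.fa' or '.fna' are not returned and would need renaming.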

Pipeline running example

./1_run_blast.sh ./genomes ./databases ./queries ./ ./output
