uniortho

A pipeline to make strain-specific primers for your strain of interest, based on unique orthogroups.

Uniortho is a standalone bash script built with bashly. It uses the pangenome analysis tool SCARAP, and skani for average nucleotide identity (ANI) calculation.

Installing uniortho

The easiest way to install the dependencies needed for uniortho is to clone this repository and make use of a conda environment. Make sure that you have conda installed.

Clone this repository

git clone https://github.com/TomEile/uniortho.git
cd uniortho

A script that installs the dependencies is included in the setup directory. Running it as shown below creates a new conda environment called 'scarap':

chmod +x ./setup/install_conda_env.sh
./setup/install_conda_env.sh

Once the dependencies are installed, the script can be used while the environment is active:

conda activate scarap
uniortho run

For a relatively quick check that all dependencies of the tool work as expected, run:

uniortho test

This should create a test_out directory with the following files and directories:

test_out/
├── ani
├── faas
├── fetch
├── ffns
├── fnas
├── pan
├── pangenome_count.tsv
└── uniquegenes.tsv
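
To confirm that layout without inspecting it by hand, a small check along these lines could be used. This is only a sketch: `check_test_out` is a hypothetical helper that is not part of uniortho, and it only tests that each entry from the tree above exists, not that its contents are valid.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with uniortho): report any entry from the
# expected test_out layout that is missing. Returns 0 if all are present.
check_test_out() {
  local out_dir="${1:-test_out}"
  local missing=0
  local name
  # Entry names are taken from the tree listed in this README.
  for name in ani faas fetch ffns fnas pan pangenome_count.tsv uniquegenes.tsv; do
    if [ ! -e "$out_dir/$name" ]; then
      echo "missing: $out_dir/$name"
      missing=1
    fi
  done
  return "$missing"
}

# Example call after 'uniortho test' has finished:
# check_test_out test_out && echo "test output looks complete"
```

The function prints one line per missing entry, so an incomplete run shows which step did not produce output.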

Usage

To get a detailed overview of the parameters run:

uniortho run -h

uniortho run - running the pipeline

Alias: r

Usage:
  uniortho run SPECIES [OPTIONS]
  uniortho run --help | -h

Options:
  --outfolder, -o OUTFOLDER
    Supply your output folder; by default, it will be in ./results
    Default: ./results

  --ani, -a
    Use the fastANI approach on the samples to remove highly similar samples

  --input_files, -i GENOMES_LIST
    Supply the file with paths to fna files that need to be included
    Default: 

  --threads, -t THREADS
    Supply number of threads used
    Default: 16

  --gtdb_version_url, -g URL
    to use a different version of gtdb, supply the URL here
    Default: https://data.gtdb.ecogenomic.org/releases/latest/bac120_metadata.tsv.gz

  --completeness, -c COMPLETENESS
    Supply the minimum completeness threshold to select genomes on
    Default: 95

  --contamination, -C CONTAMINATION
    Supply the maximum contamination threshold to select genomes on
    Default: 5

  --blast, -b
    check the unique orthogroups by blasting them against the NCBI nt database.
    This takes a long time to run.

  --help, -h
    Show this help

Arguments:
  SPECIES
    name of the species, leading with the s__ (using gtdb taxonomy)
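
As an illustration of how the options above might be combined, the sketch below builds a genome list for `--input_files` and shows a possible invocation in comments. Several things here are assumptions rather than documented behaviour: `make_genome_list` is a hypothetical helper, the list format (one path per line) is inferred from the option description, and the directory, file, and species names are placeholders.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of uniortho): write the path of every .fna
# file under a directory to a list file, one path per line, sorted.
# Assumption: --input_files accepts one path per line.
make_genome_list() {
  local genome_dir="$1"
  local list_file="$2"
  find "$genome_dir" -type f -name '*.fna' | sort > "$list_file"
}

# Possible usage (placeholder paths and species name):
# make_genome_list ./my_genomes genomes.txt
# conda activate scarap
# uniortho run "s__Genus species" -i genomes.txt -o ./results -t 8 -a
```

The species name is quoted because GTDB species names contain a space; whether uniortho requires this quoting is not stated in the help text, so check `uniortho run -h` against your installed version.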
