Currently available Phanta databases

| Name/Link | Prokaryotic Portion | Viral Portion | Prophage-masked? | Taxonomy for Prokaryotic Portion | Comments |
| --- | --- | --- | --- | --- | --- |
| Default database | HumGut | MGV + RefSeq viral | N | NCBI | Default database (as described in our manuscript) |
| Masked version of default database | HumGut | MGV + RefSeq viral | Y | NCBI | Prophage-masked version of the default database (as described in our manuscript) |
| Default database - GTDB | HumGut | MGV + RefSeq viral | N | GTDB | Default database with GTDB taxonomy for the prokaryotic portion |
| UHGGv2 + MGV | UHGGv2 | MGV + RefSeq viral | N | GTDB | Default database with UHGGv2 replacing HumGut. UHGGv2 includes low-prevalence prokaryotes that HumGut filters out |
| HumGut + UHGV "MQ+" | HumGut | UHGV ($\ge$ medium-quality genomes) | N | NCBI | Same as the default database, but with the viral portion replaced by the newer viral genome catalog UHGV. Includes UHGV genomes of $\ge$ medium quality |
| HumGut + UHGV "HQ+" | HumGut | UHGV ($\ge$ high-quality genomes) | N | NCBI | Same as the previous row, but using only UHGV genomes of $\ge$ high quality |
| UHGGv2 + UHGV "MQ+" | UHGGv2 | UHGV ($\ge$ medium-quality genomes) | N | GTDB | UHGGv2 for the prokaryotic portion; UHGV for the viral portion ($\ge$ medium-quality genomes) |

Each database should include the following files:

Kraken2 database:

hash.k2d
taxo.k2d
opts.k2d
seqid2taxid.map

Bracken databases (built for use with various read lengths N):

databaseNmers.kmer_distrib

Additional files required for the pipeline to run:

inspect.out
taxonomy/nodes.dmp
taxonomy/names.dmp
library/species_genome_size.txt

For use with post-processing scripts:

host_prediction_to_genus.tsv
species_name_to_vir_score.txt
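
As a quick sanity check after downloading, the file list above can be turned into a small script that reports which expected files are absent from a database directory. This is only a minimal sketch and not part of Phanta: the function name `missing_database_files`, its parameters, and the example path and read lengths are hypothetical, and the Bracken file name is assumed to follow the `databaseNmers.kmer_distrib` pattern with N replaced by the read length.

```python
from pathlib import Path

# File names copied from the list above (paths relative to the database directory).
REQUIRED_FILES = [
    "hash.k2d",
    "taxo.k2d",
    "opts.k2d",
    "seqid2taxid.map",
    "inspect.out",
    "taxonomy/nodes.dmp",
    "taxonomy/names.dmp",
    "library/species_genome_size.txt",
]

# Only needed for the post-processing scripts.
POSTPROCESSING_FILES = [
    "host_prediction_to_genus.tsv",
    "species_name_to_vir_score.txt",
]


def missing_database_files(db_dir, read_lengths=(), include_postprocessing=True):
    """Return the expected files (relative paths) that are not present in db_dir.

    read_lengths: read lengths N for which a Bracken file
    databaseNmers.kmer_distrib is expected (assumption: N is substituted
    literally into the file name).
    """
    db_dir = Path(db_dir)
    expected = list(REQUIRED_FILES)
    expected += [f"database{n}mers.kmer_distrib" for n in read_lengths]
    if include_postprocessing:
        expected += POSTPROCESSING_FILES
    return [rel for rel in expected if not (db_dir / rel).is_file()]


if __name__ == "__main__":
    # Hypothetical path and read lengths; replace with your own.
    for rel in missing_database_files("/path/to/phanta_db", read_lengths=[100, 150]):
        print("missing:", rel)
```

An empty result means every listed file was found; the check only looks at file presence, not at file contents.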

Note: Phanta was developed with human gut metagenomes in mind, and its default database was built from human-gut viral and bacterial genomes. If you wish to apply Phanta to metagenomes from other environments, you will probably need to supply a custom database. In that case, please open a new discussion so we can work out the best way to help or collaborate.

The tar.gz archive for a database should be about 20-25 GB in total, depending on the exact version.
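
Once an archive has been downloaded, it needs to be unpacked before use. The sketch below shows one way to do that with Python's standard `tarfile` module; it is an illustration under stated assumptions, not Phanta's own tooling. The function name `extract_database` and the example archive and destination paths are hypothetical, and the layout inside the archive is not assumed.

```python
import tarfile
from pathlib import Path


def extract_database(archive_path, dest_dir):
    """Unpack a .tar.gz archive into dest_dir and return the sorted names
    of the entries found directly inside dest_dir afterwards."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        # Use the safer "data" extraction filter where this Python version
        # supports it; older versions fall back to the default behaviour.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        tar.extractall(dest, **kwargs)
    return sorted(p.name for p in dest.iterdir())


if __name__ == "__main__":
    # Hypothetical file and directory names; replace with your own.
    print(extract_database("phanta_db.tar.gz", "phanta_db"))
```

Given the archive size, make sure the destination has enough free space for both the archive and its unpacked contents.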
