Further details about the construction of new phanta databases #54
efratmuller announced in Announcements
Hi Efrat, For point 1:
For point 2:
For point 3:
Dear @yipinto and team!
I was wondering whether you could share some more details about how you constructed the UHGGv2 + UHGV "MQ+" database that you have kindly provided.
Specifically, I have three questions:
(1) The viral portion, based on UHGV, seems to also include viruses that do not meet the "MQ" (medium-quality) criteria as defined on the UHGV GitHub. For example, vOTU-085841 is included in the Phanta database but has an "uncertain" "viral-confidence" value (as reported in UHGV's metadata) and therefore does not meet their "MQ" criteria. As another example, vOTU-018648 is only 49% complete (and therefore not "MQ"), yet I see it in the database.
(2) In the UHGGv2 portion, it seems that not all genomes were included (>280K genomes are listed in MGnify-UHGGv2). Could you please explain which genomes were included and how they were defined as strains/species in the database?
(3) Where were the genomes that are not bacterial, archaeal, or viral sourced from?
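To make question (1) concrete, here is a minimal sketch of the kind of check that flags such entries. It is illustrative only: the field names `viral_confidence` and `completeness`, the placeholder IDs, and the example values are hypothetical and are not taken from the real UHGV metadata file, and the 50% completeness cutoff is an assumption inferred from the 49% example above rather than a confirmed UHGV definition.

```python
# Assumed MQ cutoff, inferred from "49% complete, therefore not MQ" in the post.
MIN_COMPLETENESS = 50.0


def fails_mq(record, min_completeness=MIN_COMPLETENESS):
    """Return the reasons a vOTU record misses the assumed MQ criteria."""
    reasons = []
    # Hypothetical field name; adjust to the actual metadata column.
    if record["viral_confidence"].lower() == "uncertain":
        reasons.append("uncertain viral confidence")
    # Hypothetical field name; completeness is expected as a percentage.
    if record["completeness"] < min_completeness:
        reasons.append(
            f"completeness {record['completeness']}% < {min_completeness}%"
        )
    return reasons


def audit(database_ids, metadata):
    """Map each database vOTU that fails the check (or lacks metadata) to its reasons."""
    flagged = {}
    for votu in database_ids:
        record = metadata.get(votu)
        if record is None:
            flagged[votu] = ["no metadata found"]
            continue
        reasons = fails_mq(record)
        if reasons:
            flagged[votu] = reasons
    return flagged


# Invented placeholder records, not real UHGV values.
example_metadata = {
    "vOTU-A": {"viral_confidence": "confident", "completeness": 92.0},
    "vOTU-B": {"viral_confidence": "uncertain", "completeness": 75.0},
    "vOTU-C": {"viral_confidence": "confident", "completeness": 49.0},
}
example_database_ids = ["vOTU-A", "vOTU-B", "vOTU-C", "vOTU-D"]

flagged = audit(example_database_ids, example_metadata)
for votu, reasons in flagged.items():
    print(votu, "->", "; ".join(reasons))
```

Running the same check over the full list of vOTU IDs in the database would show whether the two examples above are isolated cases or part of a broader pattern.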
Many many thanks in advance!
Efrat