Multiple query genes support for SHOOT (with EPA) #7

Open
guignonv wants to merge 4 commits into davidemms:master from guignonv:multi
guignonv commented May 26, 2023

This version supports multiple query genes and provides outputs by matching OGs.

There are small changes in the output files:

  • added a ".sh.map" file giving the correspondence between original query names and processed names (when a ".sh.cleaned" file is created)
  • ".assign.txt" now has an additional column containing the comma-separated list of query genes matching the OG (the corresponding scores are all included as well, also comma-separated)
  • ".fa.sh.msa.fa", ".fa.sh.msa.fa.query.fa", ".sh.msa.fa.ref.fa", ".fa.shoot.tree", ".sh.orthologs.tsv" and ".fa.sh.msa.fa_epa" are now prefixed with their corresponding OG names
  • ".sh.orthologs.tsv" has a new "Query" column, added as the first column, to report the corresponding query gene
  • ".jplace" files include all the genes matching a given OG
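
For anyone consuming the new ".sh.orthologs.tsv" layout, here is a minimal sketch of splitting its rows by the new "Query" column. Only the "Query" column name and the tab separator come from the description above. The other column names ("Species", "Orthologs") and the sample rows are placeholders made up for illustration, so check them against a real output file.

```python
import csv
import io
from collections import defaultdict


def rows_by_query(handle):
    """Group the rows of an orthologs TSV by the value of its "Query" column."""
    reader = csv.DictReader(handle, delimiter="\t")
    by_query = defaultdict(list)
    for row in reader:
        by_query[row["Query"]].append(row)
    return dict(by_query)


# Hypothetical file content: only the "Query" header is taken from the PR text.
sample = io.StringIO(
    "Query\tSpecies\tOrthologs\n"
    "geneA\tsp1\tx1, x2\n"
    "geneC\tsp1\tx3\n"
    "geneA\tsp2\ty1\n"
)
result = rows_by_query(sample)
```

With a real file, the same function can be called on `open(path, newline="")` instead of the in-memory sample.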

Basically, all the query genes are grouped by matching OG, and each group is then placed into its OG together (rather than one gene at a time).
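
To make the grouping idea concrete, here is a rough sketch, not the code in this PR. It assumes the OG assignment step yields (query gene, OG name, score) tuples; that shape, the function names and the sample values are all hypothetical.

```python
from collections import defaultdict


def group_queries_by_og(hits):
    """Group (query, og, score) tuples by OG, keeping input order within each OG."""
    grouped = defaultdict(list)
    for query, og, score in hits:
        grouped[og].append((query, score))
    return dict(grouped)


def format_assign_fields(grouped):
    """Build the comma-separated query list and score list for each OG."""
    return {
        og: (
            ",".join(query for query, _ in pairs),
            ",".join(str(score) for _, score in pairs),
        )
        for og, pairs in grouped.items()
    }


# Hypothetical hits: two queries land in the same OG, one in another.
hits = [
    ("geneA", "OG0000001", 120.5),
    ("geneB", "OG0000002", 80.0),
    ("geneC", "OG0000001", 95.2),
]
grouped = group_queries_by_og(hits)
fields = format_assign_fields(grouped)
```

Each key of `grouped` would then correspond to one alignment and one placement run covering all of that OG's queries, instead of one run per query.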

It may need to be tested a bit more extensively by others, with more datasets than mine.
