Merge BLAST XML tool #1893
pvanheus wants to merge 2 commits into galaxyproject:master from pvanheus:merge_blastxml
|
I don't mind putting this on the IUC repository, but equally it might have a home at https://github.com/peterjc/galaxy_blast/ instead. What's the use case for this tool? Does Galaxy's collection support not suffice? |
|
@peterjc I have a little workflow where I split a FASTA query, run a bunch of BLAST jobs and then get BLAST XMLs. I know this could be done using the Galaxy tasks support - effectively I was reproducing that behaviour, but using collections. Then finally the collection needs a "reduce" step to get to a single XML. Is that possible with Galaxy collections right now? If so I've probably written a tool for nothing. As to its location - the galaxy_blast collection is probably a more natural home. |
|
@jmchilton can collections do what @pvanheus wants to do? e.g. Reduce a collection of files of type XXX to a single file of type XXX, via the existing merge methods defined on the XXX datatype's Python (base) class (here BLAST XML, but equally FASTA, FASTQ, etc) |
|
Not yet @peterjc - it is a good idea though - galaxyproject/galaxy#5464. |
|
Given @jmchilton's answer, can I propose that I re-PR this as a tool over at https://github.com/peterjc/galaxy_blast/ and we move the discussion there? |
|
fyi: Galaxy has native support for this; see https://github.com/galaxyproject/galaxy/blob/30e3658b8b0e2f6b975dc6ccccb0cc8cc040247c/lib/galaxy/datatypes/blast.py#L91 for the code. The bad news is just that this feature is not really well maintained and, afaik, currently broken. |
|
@bgruening That's the code which has been turned into a tool here - as far as I know it is working fine, but yes the task splitting is not a top tier Galaxy feature (not used at http://usegalaxy.org and not enabled by default). |
|
@bgruening yes, it does have that support (that this code is based on, with a few minor tweaks) but it is not exposed as a tool and is, as @peterjc says, part of the "task splitting" parallelisation framework that has been languishing in the code for a while. This tool splits that functionality out into an independent tool and the map / reduce can be achieved using FASTA split -> BLAST -> merge BLAST XML. |
|
I'm happy to close this in favour of peterjc/galaxy_blast#105 |
A simple tool to merge BLAST XML datasets into a single BLAST XML dataset (based on the code in Galaxy itself).
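For readers following the thread, the "reduce" step discussed above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR and not Galaxy's own merge method in `blast.py`: it assumes the usual BLAST XML layout (a `BlastOutput` root containing a `BlastOutput_iterations` element with `Iteration` children), and the function name `merge_blast_xml` is hypothetical. It keeps the header fields of the first document and appends the iterations of the others; note that `ElementTree` does not preserve the DOCTYPE declaration and that iteration numbers are not renumbered here.

```python
import xml.etree.ElementTree as ET


def merge_blast_xml(xml_docs):
    """Merge BLAST XML documents (given as strings) into one XML string.

    Hypothetical helper for illustration only. The header of the first
    document is kept; <Iteration> elements of the rest are appended.
    """
    docs = list(xml_docs)
    if not docs:
        raise ValueError("need at least one BLAST XML document")

    merged_root = ET.fromstring(docs[0])
    merged_iters = merged_root.find("BlastOutput_iterations")
    if merged_iters is None:
        raise ValueError("first document has no BlastOutput_iterations element")

    for doc in docs[1:]:
        iters = ET.fromstring(doc).find("BlastOutput_iterations")
        if iters is None:
            # No iterations in this document: nothing to append.
            continue
        for iteration in iters.findall("Iteration"):
            merged_iters.append(iteration)

    return ET.tostring(merged_root, encoding="unicode")
```

In the FASTA split -> BLAST -> merge workflow described in the thread, each string passed in would be the XML output of one BLAST job run on one chunk of the query.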