Database for MMseqs2 search #2
Open
Labels
improvement (New features to setup)
Description
State of the art
Currently, a user can pass three kinds of value to --mmseqs-search-db for the reference database:
1. A path to a valid MMseqs2 database that already exists somewhere on the system.
2. A FastA/Q[.gz] file. KRYPTON sets up the database from this file and stores it in the current project, given by --out.
3. The name of a database that can be downloaded with mmseqs databases. The database is stored in the current project too.
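As an illustration only, the three cases could be told apart roughly as follows. This is a sketch, not KRYPTON's actual code: the function name `classify_search_db`, the accepted file suffixes, and the use of a `.dbtype` companion file as the test for "valid MMseqs2 database" are all assumptions, and `known_names` stands in for whatever list of downloadable names KRYPTON would take from MMseqs2.

```python
from pathlib import Path

# Assumed set of FastA/Q suffixes; the real list may differ.
FASTX_SUFFIXES = (".fa", ".fasta", ".fq", ".fastq")


def classify_search_db(value, known_names):
    """Return 'fastx', 'mmseqs_db' or 'download' for a --mmseqs-search-db value."""
    path = Path(value)
    name = path.name.lower()
    if name.endswith(".gz"):
        name = name[:-3]  # ignore optional gzip compression
    # Case 2: an existing FastA/Q[.gz] file
    if path.is_file() and name.endswith(FASTX_SUFFIXES):
        return "fastx"
    # Case 1: heuristic, an MMseqs2 database prefix has a '<prefix>.dbtype' file
    if Path(str(value) + ".dbtype").is_file():
        return "mmseqs_db"
    # Case 3: a name that MMseqs2 can download
    if value in known_names:
        return "download"
    raise ValueError(f"Cannot interpret --mmseqs-search-db value: {value!r}")
```

For example, `classify_search_db("UniRef90", {"UniRef90", "UniProtKB/Swiss-Prot"})` would fall through to the third case, provided no local file of that name exists.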
Problem
If the user provides option 2 or 3 for X samples, KRYPTON runs the database setup (and download) steps X times, once per sample. This can become a bottleneck and should be avoided.
Solution
- Add another parameter, e.g. --user-db-path, which accepts a path on the system where the user wants to store and save a given database.
It would work as follows:
- A valid database is passed to --mmseqs-search-db -> ignore --user-db-path
- A FastA/Q file is passed to --mmseqs-search-db -> does --user-db-path contain a valid path?
  - No: set up the database within the project (--out)
  - Yes: does a database with this name already exist in the directory --user-db-path?
    - No: set up the database there
    - Yes: perfect, use this path for the reference database
- A valid name of a database downloadable by MMseqs2 is passed to --mmseqs-search-db -> same as for a FastA/Q file
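The decision tree above could be sketched in Python along these lines. Everything here is hypothetical and only meant to make the proposal concrete: the function names, the `kind` labels ("mmseqs_db", "fastx", "download"), the way a database name is derived from the input, the reading of "valid path" as "existing directory", and the `.dbtype` check for an existing database are assumptions rather than existing KRYPTON behaviour.

```python
from pathlib import Path


def db_exists(prefix):
    """Heuristic: treat '<prefix>.dbtype' as proof that the database exists."""
    return Path(str(prefix) + ".dbtype").is_file()


def derive_db_name(kind, value):
    """Hypothetical naming rule for the database built from `value`."""
    if kind == "download":
        return value.replace("/", "_")  # e.g. UniProtKB/Swiss-Prot
    name = Path(value).name
    for suffix in (".gz", ".fasta", ".fastq", ".fa", ".fq"):
        if name.lower().endswith(suffix):
            name = name[: -len(suffix)]
    return name


def resolve_reference_db(kind, value, out_dir, user_db_path=None):
    """Return (database_prefix, needs_setup) following the proposed rules."""
    # A ready-made database: --user-db-path is ignored.
    if kind == "mmseqs_db":
        return Path(value), False
    name = derive_db_name(kind, value)
    # No usable --user-db-path: set up the database within the project.
    if user_db_path is None or not Path(user_db_path).is_dir():
        return Path(out_dir) / name, True
    # Usable --user-db-path: reuse the database if present, else build it there.
    candidate = Path(user_db_path) / name
    return candidate, not db_exists(candidate)
```

With such a helper, the first sample would build the database under --user-db-path and every later sample would get `needs_setup == False` for the same input, which is the behaviour the issue is asking for.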