Skip to content

A command-line application designed to identify the taxonomy of a plant from a sample. GenoRobotics project for the 2025 Lemanic Life Sciences Hackathon.

Notifications You must be signed in to change notification settings

GenoRobotics-EPFL/Myrio-Hackathon

Repository files navigation

Warning

This repository is a prototype, the application has been fully rewritten from scratch here

Myrio is a command-line application designed to identify the taxonomy of a plant from a sample.

The name Myrio is inspired by the scientific name Myriophyllum spicatum, commonly known as Eurasian watermilfoil, an aquatic plant frequently found in the Léman.

If you are a contributor, please refer to this document.

demo

Long half-finished list of things left to do

  • Extend the database to vascular plants (Tracheophyta) or even further if needed
  • Complement the (BOLD) database with other sources (e.g. Genebank)
  • Compress and make a release of the database(s) used (IMPORTANT: MUST BE LICENSED UNDER CC BY-NC-SA, ATTRIBUTION SHOULD BE ADDED TO THE HEADER LINE OF THE FASTA SEQUENCES WHERE IT IS REQUIERED, WE CANNOT USE THE DATABASE ITSELF FOR COMMERCIAL PURPOSES)
  • License this codebase
  • Implement BLASTN into the main cli-pipeline
  • Implement a robust testing/scoring pipeline
  • Use a ./bin/ solution for executables used
  • Train a model to recognize the species instead of the genus
  • Make a logo
  • And much more ...

Installation (stub, for Linux)

You'll need the following tools installed beforehand: uv, rust toolchain, seqkit

References

  • Wei Shen, Botond Sipos, and Liuyang Zhao. 2024. SeqKit2: A Swiss Army Knife for Sequence and Alignment Processing. iMeta e191. doi:10.1002/imt2.191.
  • raxtax: A k-mer-based non-Bayesian Taxonomic Classifier; Noah A. Wahl, Georgios Koutsovoulos, Ben Bettisworth, Alexandros Stamatakis; bioRxiv 2025.03.11.642618; doi: https://doi.org/10.1101/2025.03.11.642618
  • Alexander J. Petri, Kristoffer Sahlin. De novo clustering of extensive long-read transcriptome datasets with isONclust3. bioRxiv 2024.10.29.620862; doi: https://doi.org/10.1101/2024.10.29.620862
  • Ratnasingham, Sujeevan, and Paul D N Hebert. “bold: The Barcode of Life Data System (http://www.barcodinglife.org).” Molecular ecology notes vol. 7,3 (2007): 355-364. doi:10.1111/j.1471-8286.2007.01678.x

About

A command-line application designed to identify the taxonomy of a plant from a sample. GenoRobotics project for the 2025 Lemanic Life Sciences Hackathon.

Resources

Contributing

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •