Releases · steineggerlab/Metabuli

06 Apr 02:58

1.2.0

ce1098c

v1.2.0 Latest

Version used in "Sensitive and scalable metagenomic classification using spaced metamers, reduced alphabets, and syncmers"

From v1.2.0 onward, the documentation has moved to https://jaebeom-kim.github.io/metabuli-doc/

Metabuli v1.2.0

Improved sensitivity via spaced k-mers and reduced amino acid alphabet

Three layers of mismatch tolerance:
1. Amino acid-level k-mer search allows for synonymous mutations (original feature).
2. Reduced amino acid alphabet groups similar amino acids together, allowing for conservative substitutions (NEW).
3. Spaced k-mers allow for mismatches at specific positions in the k-mer (NEW).
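
The three layers can be illustrated with a toy sketch. It is not Metabuli's implementation: the partial codon table, the amino acid groups in `REDUCED_GROUPS`, and the mask `"101"` are hypothetical values chosen only to show how each layer lets differing sequences map to the same key.

```python
# Toy illustration only. The groups and the mask are hypothetical, not
# Metabuli's actual reduced alphabet or spaced k-mer mask.

# Partial standard genetic code: just the codons used in the example.
CODON_TABLE = {
    "GCT": "A", "GCC": "A",  # alanine
    "AAA": "K", "AAG": "K",  # lysine
    "GAT": "D", "GAC": "D",  # aspartate
}

# Hypothetical groups of similar amino acids; the first letter represents the group.
REDUCED_GROUPS = ["IV", "KR", "DE"]
REDUCE = {aa: group[0] for group in REDUCED_GROUPS for aa in group}


def translate(dna: str) -> str:
    """Layer 1: translate codons, so synonymous mutations disappear."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def reduce_alphabet(peptide: str) -> str:
    """Layer 2: replace each amino acid with its group representative."""
    return "".join(REDUCE.get(aa, aa) for aa in peptide)


def apply_mask(kmer: str, mask: str) -> str:
    """Layer 3: keep positions where the mask is '1', ignore those marked '0'."""
    if len(kmer) != len(mask):
        raise ValueError("k-mer and mask must have the same length")
    return "".join(c for c, m in zip(kmer, mask) if m == "1")


if __name__ == "__main__":
    # Synonymous codons give the same peptide.
    print(translate("GCTAAAGAT"), translate("GCCAAGGAC"))
    # Conservative substitutions (K/R, D/E) collapse in the reduced alphabet.
    print(reduce_alphabet("AKD"), reduce_alphabet("ARE"))
    # A mismatch at a masked position is ignored.
    print(apply_mask("AKD", "101"), apply_mask("AWD", "101"))
```

Each layer widens the set of query sequences that can match a reference k-mer, which is the source of the added sensitivity.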

Improved scalability via syncmers

Syncmers are a subsampling technique that selects a subset of k-mers based on the presence of specific s-mers. This reduces the database size and classification time at the cost of a small decrease in sensitivity in remote homology detection.
The database is two times smaller and classification is two times faster when c = (k - s + 1) / 2 = 2.
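
As a rough sketch of the idea, the code below selects closed syncmers: a k-mer is kept when its smallest s-mer sits at the first or last position. Treating Metabuli's syncmers as closed syncmers is an assumption (it is consistent with the (k - s + 1) / 2 reduction rate, but the release notes do not say so), and plain lexicographic order stands in for whatever s-mer ordering the tool actually uses. The values k = 8 and s = 5 are illustrative only.

```python
def is_closed_syncmer(kmer: str, s: int) -> bool:
    """Return True if the smallest s-mer is the first or the last s-mer of the k-mer.

    Assumption: lexicographic order is used here for simplicity.
    """
    smers = [kmer[i:i + s] for i in range(len(kmer) - s + 1)]
    smallest = min(smers)
    return smers[0] == smallest or smers[-1] == smallest


def select_syncmers(sequence: str, k: int, s: int) -> list[str]:
    """Keep only the k-mers of a sequence that are closed syncmers."""
    kmers = (sequence[i:i + k] for i in range(len(sequence) - k + 1))
    return [kmer for kmer in kmers if is_closed_syncmer(kmer, s)]


def compression_factor(k: int, s: int) -> float:
    """Expected reduction rate c = (k - s + 1) / 2 from the release notes."""
    return (k - s + 1) / 2


if __name__ == "__main__":
    print(select_syncmers("BACDAB", k=4, s=2))
    print(compression_factor(k=8, s=5))
```

Because selection depends only on the k-mer itself, the same k-mers are chosen in the reference and in the query, so subsampled k-mers can still match each other.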

Related options in `build` module

--syncmer: Use syncmers instead of all k-mers.
- It reduces the database size and classification time.
- The reduction rate (k - s + 1) / 2 can be set by using --smer-len to specify s.
- Because k-mers are subsampled, sensitivity in remote homology detection decreases.
--space-mask: Use spaced k-mers instead of contiguous ones.
--custom-metamer: Specify k-mer length and customize a translation table.

Related options in `classify` module

--precise : preset for more precise but less sensitive classification
-e : maximum E-value

07 Jul 03:22

jaebeom-kim

1.1.1

a65c014

Metabuli v1.1.1

Version packaged in Metabuli App
Imported the latest MMseqs2 as a git submodule
Added FASTA/Q format validators: fastq_utils and fasta_validator
Added database validation function: validatedb
Added classifiedRefiner for filtering or manipulating per-read classification result files.
Improved createnewtaxalist.
Improved thread-safety of the database creation process.

10 Feb 08:04

jaebeom-kim

1.1.0

3cd894d

Metabuli v1.1.0

Fixed errors in v1.0.9
Made custom DB creation easier
Improved the updateDB command

27 Dec 06:15

jaebeom-kim

1.0.9.2

1c5aa20

Metabuli v1.0.9 Pre-release

DB creation process improved

Added updateDB module for adding new sequences to an existing database.
Added --cds-info parameter in the build module. Users can provide CDS information to skip Prodigal's gene prediction.
- Currently, only NCBI RefSeq or GenBank CDS files (*cds_from_genomic.fna) are supported.
- For the accessions included in the files, the provided CDS info will be used, skipping Prodigal's gene prediction.
Added --max-ram parameter to the build module.
Added compatibility with taxdump files generated using taxonkit.
1.0.9-2: Fixes for Bioconda

29 Sep 12:20

jaebeom-kim

1.0.8

4716a6f

Metabuli v1.0.8

Added extract module: It extracts reads classified under a specific taxon at any rank. It can be used after running classify.

12 Sep 05:02

jaebeom-kim

1.0.7

b79cb21

Metabuli v1.0.7

Metabuli became faster than v1.0.6

Dataset
- Query: SRR24315757_1.fastq, SRR24315757_2.fastq
  - 22,107,398 paired-end reads
  - 6,632,219,400 nt in total
- DB: GTDB
  - Complete Genome or Chromosome level assemblies
  - CheckM completeness > 90 and contamination < 5
  - 36,203 genomes from 8,465 species
Windows: ~8.3 times faster
- Machine: Intel(R) Core(TM) i9-9900 CPU, 32GB RAM
- --max-ram: 32
- --threads: 8
- v1.0.6: 825s for the first 587,593 reads (2.7% of all). Total time not measured
- v1.0.7: 100s for the first 587,593 reads. 1h 7m 22s in total
MacOS: ~1.7 times faster
- Machine: MacBook Pro 14-inch 2023, M2 Pro chip, 32GB RAM
- --max-ram: 32
- --threads: 8
- v1.0.6: 71m 34s
- v1.0.7: 42m 58s
Linux: ~1.3 times faster
- Machine: A server with 64-core AMD EPYC 7742 CPU and 1 TB of RAM
- --max-ram : 128
- --threads : 32
  - v1.0.6: 13m 34s
  - v1.0.7: 9m 58s
- --threads : 64
  - v1.0.6: 9m 36s
  - v1.0.7: 7m 19s
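
The reported speedups can be rechecked from the timings above with a few lines of arithmetic. The helper names `to_seconds` and `speedup` are illustrative. Note that the Windows comparison covers only the first 587,593 reads, since the total time for v1.0.6 was not measured.

```python
def to_seconds(duration: str) -> int:
    """Parse durations written like '1h 7m 22s', '71m 34s', or '825s'."""
    units = {"h": 3600, "m": 60, "s": 1}
    return sum(int(part[:-1]) * units[part[-1]] for part in duration.split())


def speedup(old: str, new: str) -> float:
    """Ratio of the v1.0.6 time to the v1.0.7 time."""
    return to_seconds(old) / to_seconds(new)


# (v1.0.6, v1.0.7) timings copied from the benchmark above.
TIMINGS = {
    "Windows, first 587,593 reads": ("825s", "100s"),
    "MacOS": ("71m 34s", "42m 58s"),
    "Linux, 32 threads": ("13m 34s", "9m 58s"),
    "Linux, 64 threads": ("9m 36s", "7m 19s"),
}

if __name__ == "__main__":
    for label, (old, new) in TIMINGS.items():
        print(f"{label}: {speedup(old, new):.2f}x")
```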

02 Aug 12:15

jaebeom-kim

1.0.6

ef9723c

Metabuli v1.0.6

Windows OS is supported

18 Apr 13:21

jaebeom-kim

1.0.5

19b33ab

Metabuli v1.0.5

The CMake file was edited to pass the Bioconda PR test.
Otherwise, it is identical to v1.0.4.

26 Mar 03:23

jaebeom-kim

1.0.4

5bbc1fd

Metabuli v1.0.4

Fixed a minor reproducibility issue.
Fixed a performance-harming bug that occurred with sequences containing lowercase bases.
Added automatic adjustment of the --match-per-kmer parameter (resolves issue #20).
Version information is now recorded in db.parameter.

06 Feb 05:02

jaebeom-kim

1.0.3

6fdf834

Metabuli v1.0.3

New parameter: --tie-ratio in classify module. [default 0.95]
When the best-matching species has a score MAX, every species with a score >= (MAX * --tie-ratio) is considered tied with the best match. When tied species occur for a read, the read is classified to their LCA (lowest common ancestor).
