Fast classification of assembled bacterial genomes into ESKAPE pathogen species using k-mer frequency profiling.
Installation:

uv add baclast

Command-line usage:

baclast --predict genome.fna
baclast --predict genomes/ -o results.csv

Python usage:

import src.classifier as baclast
baclast.predict(file="genome.fna")
baclast.to_csv(baclast.predict(file="genome.fna"), "results.csv")

Supported classes:

ESKAPE pathogens (E. faecium, S. aureus, K. pneumoniae, A. baumannii, P. aeruginosa, E. cloacae) plus an "Other" class for non-ESKAPE bacteria. Includes centroid-based out-of-distribution detection.
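The README does not say how the centroid-based out-of-distribution check works. As a rough illustration of the general idea only, the sketch below assumes one centroid per class (the mean of that class's training feature vectors), Euclidean distance, and a fixed distance threshold. The function names `class_centroids` and `is_out_of_distribution` and the `threshold` parameter are hypothetical and are not part of baclast's API.

```python
import math


def class_centroids(vectors_by_class):
    """Return the mean feature vector (centroid) for each class label.

    vectors_by_class maps a label to a non-empty list of equal-length
    feature vectors. Hypothetical helper, not part of baclast.
    """
    centroids = {}
    for label, vectors in vectors_by_class.items():
        n = len(vectors)
        # zip(*vectors) iterates over feature columns.
        centroids[label] = [sum(column) / n for column in zip(*vectors)]
    return centroids


def is_out_of_distribution(vector, centroids, threshold):
    """Flag a sample whose nearest centroid is farther away than threshold.

    Euclidean distance and a single global threshold are assumptions made
    for this sketch.
    """
    nearest = min(math.dist(vector, c) for c in centroids.values())
    return nearest > threshold
```

A sample that sits far from every class centroid is flagged rather than forced into the closest class. In practice the threshold would have to be calibrated on held-out data.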
Computes 4-mer frequency profiles (4^4 = 256 features) from genome assemblies and classifies them with a Random Forest. A pre-trained model is bundled, so no training data or setup is required.
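To make the feature vector concrete, here is a minimal sketch of a 4-mer frequency profile for a single sequence. It assumes a sliding window with step 1, skips windows containing characters other than A, C, G, T, and normalizes counts to relative frequencies. The function name `kmer_profile` is hypothetical, and baclast's own implementation may differ, for example in how it handles multiple contigs or reverse complements.

```python
from itertools import product


def kmer_profile(sequence, k=4):
    """Return relative k-mer frequencies in lexicographic k-mer order.

    For k=4 the result has 4**4 = 256 entries. Hypothetical helper for
    illustration, not part of baclast.
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = [0] * len(kmers)

    seq = sequence.upper()
    for start in range(len(seq) - k + 1):
        position = index.get(seq[start:start + k])
        # Windows with ambiguous bases such as N are not in the index.
        if position is not None:
            counts[position] += 1

    total = sum(counts)
    if total == 0:
        return [0.0] * len(kmers)
    return [count / total for count in counts]
```

Normalizing by the total count keeps profiles comparable across assemblies of different lengths, which fits the idea of feeding one fixed-length vector per genome to the classifier.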
Requirements: Python >= 3.12, biopython, scikit-learn, joblib, numpy.
License: MIT