envision-classifier

SetFit few-shot classifier for identifying eye imaging datasets from scientific metadata.

Part of the EyeACT project by the FAIR Data Innovations Hub.

Installation

pip install envision-classifier

Python API

from envision_classifier import EyeImagingClassifier

# Downloads the model from Hugging Face on first use
clf = EyeImagingClassifier()

# Classify a single record
result = clf.classify("Retinal OCT dataset for diabetic retinopathy")
print(result)
# {'label': 'EYE_IMAGING', 'confidence': 0.999, 'probabilities': {...}}

# Classify a batch; items may be plain strings or dicts with title/description fields
results = clf.classify_batch([
    "Retinal fundus photography dataset for glaucoma screening",
    "COVID-19 genome sequencing data",
    {"title": "OCT images", "description": "Macular degeneration scans"},
])

# Use a local model instead of downloading
clf = EyeImagingClassifier(model_path="./my_model")
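Downstream code often wants only the confident positives. The sketch below filters result dictionaries of the shape printed above. It assumes classify_batch returns a list of such dictionaries in input order, which this README does not confirm. The select_confident helper and the 0.9 threshold are hypothetical choices for illustration, and the sample values are placeholders, not real model output.

```python
from typing import Any


def select_confident(
    results: list[dict[str, Any]],
    label: str = "EYE_IMAGING",
    min_confidence: float = 0.9,
) -> list[int]:
    """Return indices of results with the given label at or above the threshold."""
    return [
        i
        for i, r in enumerate(results)
        if r["label"] == label and r["confidence"] >= min_confidence
    ]


# Placeholder results in the documented shape (illustrative values only).
sample_results = [
    {"label": "EYE_IMAGING", "confidence": 0.98},
    {"label": "NEGATIVE", "confidence": 0.97},
    {"label": "EYE_IMAGING", "confidence": 0.55},
]

kept = select_confident(sample_results)
print(kept)
```

Returning indices rather than the dictionaries themselves makes it easy to map each kept result back to the input record it came from.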

CLI

After installing, the envision-classifier command is available:

# Classify a text string
envision-classifier classify --text "Retinal OCT dataset for diabetic retinopathy"

# Classify from a JSON file
envision-classifier classify records.json

# Pipe JSON via stdin
echo '{"title": "Fundus images", "description": "DR screening"}' | envision-classifier classify

# Train a new model from built-in training data
envision-classifier train --output ./my_model

# Show model info and training data counts
envision-classifier info
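The layout of records.json is not spelled out here. One plausible layout, assuming the file holds a JSON array of objects with the same title and description keys as the stdin example, could be produced as follows. The array layout and the write_records helper are assumptions; check the CLI's own help output or source before relying on them.

```python
import json
from pathlib import Path


def write_records(path, records):
    """Write records as a JSON array (assumed layout) and return the path."""
    path = Path(path)
    path.write_text(json.dumps(records, indent=2), encoding="utf-8")
    return path


# Records reuse the title/description keys shown in the examples above.
records = [
    {"title": "Fundus images", "description": "DR screening"},
    {"title": "OCT images", "description": "Macular degeneration scans"},
]

out = write_records("records.json", records)
print(f"wrote {len(records)} records to {out}")
```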

Classification Labels

Label          Description
EYE_IMAGING    Actual eye imaging datasets (fundus, OCT, OCTA, cornea)
EYE_SOFTWARE   Code, tools, models for eye imaging (no actual data)
EDGE_CASE      Eye research papers, reviews, non-imaging data
NEGATIVE       Not eye-related
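When summarizing a batch, it can help to pin the four labels down in code so that an unexpected label fails loudly. The following is a minimal sketch: the Label enum and tally helper are hypothetical names that are not part of the package, and the sample input is a placeholder.

```python
from collections import Counter
from enum import Enum


class Label(str, Enum):
    """The four labels listed in the table above."""
    EYE_IMAGING = "EYE_IMAGING"
    EYE_SOFTWARE = "EYE_SOFTWARE"
    EDGE_CASE = "EDGE_CASE"
    NEGATIVE = "NEGATIVE"


def tally(results):
    """Count results per label; raises ValueError on an unknown label."""
    counts = Counter({label: 0 for label in Label})
    for r in results:
        counts[Label(r["label"])] += 1
    return counts


# Placeholder results; only the "label" key is needed for counting.
sample = [{"label": "EYE_IMAGING"}, {"label": "NEGATIVE"}, {"label": "NEGATIVE"}]
counts = tally(sample)
print({label.value: n for label, n in counts.items()})
```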

Model

  • Base model: sentence-transformers/all-mpnet-base-v2 (768-dim)
  • Training data: 474 curated examples (77 EYE_IMAGING, 48 EYE_SOFTWARE, 79 EDGE_CASE, 270 NEGATIVE)
  • Test accuracy: 0.937, macro F1: 0.902
  • Spot-check: 29/33 (87.9%)
  • Model weights: fairdataihub/envision-eye-imaging-classifier
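The training counts above are imbalanced, with NEGATIVE making up more than half of the examples. That is presumably one reason macro F1, which weights every class equally, is reported alongside accuracy. The short calculation below uses only the numbers stated in this section to show each class's share and to re-derive the spot-check rate.

```python
# Per-class training counts as stated in the Model section.
counts = {"EYE_IMAGING": 77, "EYE_SOFTWARE": 48, "EDGE_CASE": 79, "NEGATIVE": 270}

total = sum(counts.values())
shares = {label: n / total for label, n in counts.items()}

print(f"total examples: {total}")
for label, share in shares.items():
    print(f"{label:<13}{share:6.1%}")

# Spot-check rate from the stated 29 correct out of 33.
spot_check = 29 / 33
print(f"spot-check: {spot_check:.1%}")
```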

License

MIT
