[REQUEST] - Model serving using ONNX + Triton Inference Server #31

@ryanznie

Description

Feature

Use ONNX + Triton for model serving
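As a rough sketch of what this could look like: Triton serves ONNX models from a model repository, where each model directory holds a `config.pbtxt`. The model name, tensor names, and dimensions below are placeholders, not taken from this project.

```protobuf
# model_repository/my_model/config.pbtxt
# Hypothetical example config for an ONNX model served by Triton.
name: "my_model"
platform: "onnxruntime_onnx"
max_batch_size: 32

input [
  {
    name: "input"            # must match the ONNX graph's input name
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]    # placeholder shape (per-sample, batch dim excluded)
  }
]
output [
  {
    name: "output"           # must match the ONNX graph's output name
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]

# Dynamic batching: Triton groups individual requests into server-side
# batches, trading a small queueing delay for higher throughput.
dynamic_batching {
  preferred_batch_size: [ 8, 16 ]
  max_queue_delay_microseconds: 100
}
```

The expected repository layout is `model_repository/my_model/1/model.onnx` alongside the config, with the numeric subdirectory as the model version.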

Reason

Production-level model serving. Triton allows dynamic batching and model switching for increased throughput, and makes model monitoring easy via Prometheus (#30).
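On the monitoring point: Triton exposes Prometheus-format metrics (request counts, latencies, GPU utilization) on port 8002 at `/metrics` by default, so a scrape config is all Prometheus needs. The job name and host below are placeholders.

```yaml
# prometheus.yml (fragment) — hypothetical scrape config for Triton metrics.
scrape_configs:
  - job_name: "triton"
    scrape_interval: 15s
    static_configs:
      # Triton's metrics endpoint defaults to port 8002.
      - targets: ["localhost:8002"]
```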

Metadata

Labels

enhancement (New feature or request)

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests
