This repository provides a complete, end-to-end framework for converting any Ultralytics YOLO model (Detection, Classification, Pose Estimation) into the lightweight OpenVINO format and deploying it as a scalable, low-cost serverless function on AWS Lambda.
Deploying PyTorch models directly into serverless environments like AWS Lambda is often impossible due to:
- Massive Dependencies: PyTorch and related libraries (ultralytics, torchvision, etc.) are large and easily exceed serverless package size limits.
- Cold Start Latency: Loading these heavy frameworks causes slow response times.
- Cost: Larger memory and package sizes result in higher operational costs.
This project demonstrates a core MLOps principle: separating the heavy training/conversion environment from the lightweight, optimized production environment.
```
├── models/                  # Holds exported OpenVINO models
├── notebooks/
│   └── test_inference.ipynb # Notebook for testing production-ready code
├── scripts/
│   └── export_model.py      # (Dev) Convert .pt models to OpenVINO
├── src/
│   ├── lambda_function.py   # Handler code for AWS Lambda
│   └── inference/           # (Prod) Lightweight, PyTorch-free inference code
│       ├── classifier.py
│       ├── detector.py
│       ├── pose_estimator.py
│       └── utils.py
├── Dockerfile               # Defines the serverless container environment
├── requirements_dev.txt     # Heavy dependencies (conversion)
└── requirements_prod.txt    # Lightweight dependencies (inference)
```
This stage requires the heavy libraries (`ultralytics`, `torch`, `openvino-dev`) needed to convert your trained `.pt` models.

Set up the development environment:

```bash
pip install -r requirements_dev.txt
```

Run the export script:
```bash
python scripts/export_model.py \
    --model-path "path/to/your/yolo11n.pt" \
    --task "detect"
```

This will create a `yolo11n_openvino_model` folder inside `models/` containing:

- `.xml` (model topology)
- `.bin` (model weights)
- `metadata.yaml`
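If you want to adapt the export step, here is a minimal sketch of what a script like `scripts/export_model.py` can do. The argument names follow the examples below; the heavy `ultralytics` import is deferred into `main()` so the CLI definition stays importable in a lightweight environment:

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(
        description="Export a trained YOLO .pt model to OpenVINO"
    )
    parser.add_argument("--model-path", required=True,
                        help="Path to the trained .pt weights")
    parser.add_argument("--task", default="detect",
                        choices=["detect", "classify", "pose"],
                        help="Task type of the model being exported")
    return parser


def main():
    args = build_parser().parse_args()
    # Heavy dependency, only needed at export time
    from ultralytics import YOLO
    model = YOLO(args.model_path, task=args.task)
    # Writes a <name>_openvino_model/ folder (.xml, .bin, metadata.yaml)
    model.export(format="openvino")


if __name__ == "__main__":
    main()
```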
Examples:

```bash
python scripts/export_model.py --model-path models/yolo11n.pt --task detect
python scripts/export_model.py --model-path models/yolo11n-cls.pt --task classify
python scripts/export_model.py --model-path models/yolo11n-pose.pt --task pose
```

This stage uses a minimal, lightweight environment; it only needs OpenVINO and a few basic dependencies.
Set up the production environment:

```bash
pip install -r requirements_prod.txt
```

Local testing: before deploying, test your exported models locally.

```bash
# Start Jupyter from the project root
jupyter notebook
```

Then open and run `notebooks/test_inference.ipynb`.
The included `Dockerfile`, `serverless.yml`, and `src/lambda_function.py` allow deployment as a serverless function.
In the `serverless.yml` file:

- Change the name of the service
- Add or rename the Lambda function
- Set Memory → 1024 MB
- Set Timeout → 30 s
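A plausible `serverless.yml` skeleton reflecting those settings (the service, image, and function names here are placeholders, not the repo's actual values):

```yaml
service: yolo-openvino-inference  # rename for your project

provider:
  name: aws
  ecr:
    images:
      yolo-inference:
        path: .  # builds from the repo's Dockerfile

functions:
  inference:  # add or rename functions as needed
    image:
      name: yolo-inference
    memorySize: 1024
    timeout: 30
```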
Add a `.env` file containing the environment variables:

```bash
YOLO11_DET_XML_PATH=/app/models/yolo11n_openvino_model/yolo11n.xml
YOLO11_CLS_XML_PATH=/app/models/yolo11n-cls_openvino_model/yolo11n-cls.xml
YOLO11_POSE_XML_PATH=/app/models/yolo11n-pose_openvino_model/yolo11n-pose.xml
```
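Inside `src/lambda_function.py`, these paths can be picked up with a plain `os.environ` lookup; a sketch (the helper name is illustrative, not taken from the repo):

```python
import os


def resolve_model_paths():
    """Map each task to the OpenVINO .xml path injected via environment variables."""
    return {
        "detect": os.environ["YOLO11_DET_XML_PATH"],
        "classify": os.environ["YOLO11_CLS_XML_PATH"],
        "pose": os.environ["YOLO11_POSE_XML_PATH"],
    }
```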
Deploy:

```bash
sls deploy
```

Test with API Gateway or the Lambda console's "Test" tab. Send JSON with a base64-encoded image:

```json
{
  "image": "<your_base64_encoded_image_string>"
}
```
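For a quick smoke test, that JSON body can be built with the standard library alone; a sketch (the file path is hypothetical):

```python
import base64
import json


def build_payload(image_path):
    """Base64-encode an image file into the JSON body the Lambda function expects."""
    with open(image_path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("utf-8")
    return json.dumps({"image": encoded})
```

Paste the resulting string into the Lambda console's Test tab, or POST it to your API Gateway endpoint.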