Use TensorRT to accelerate the Depth-Anything-V2 model for monocular depth estimation.
Clone this repository and its submodules:

```shell
git clone https://github.com/Stillwtm/depth-anything-tensorrt.git
git submodule init && git submodule update
```

Install dependencies:
```shell
pip install tensorrt==10.2.0.post1
```

Download the Depth-Anything-V2 checkpoints from the official repository and put them under the `checkpoints` folder.
Replace the `third_party/depth_anything_v2/depth_anything_v2/dpt.py` file with `tools/dpt.py`. In `tools/dpt.py`, we remove the squeeze operation from the forward function, which would otherwise hurt the inference performance of the TensorRT models.
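As a hypothetical illustration of why such a squeeze matters (NumPy standing in for the PyTorch tensors, and the `(B, 1, H, W)` head-output shape assumed, not taken from the repo's code): a plain `squeeze()` drops every size-1 axis, so with batch size 1 it also collapses the batch axis and the output rank ends up depending on the batch size, whereas explicit indexing keeps the shape stable.

```python
import numpy as np

# Assumed head output shape (batch, channel, H, W); channel is always 1.
depth = np.zeros((1, 1, 518, 518))
print(depth.squeeze().shape)   # (518, 518)   - batch axis collapsed too
print(depth[:, 0].shape)       # (1, 518, 518) - batch axis preserved

batched = np.zeros((4, 1, 518, 518))
print(batched.squeeze().shape) # (4, 518, 518) - rank differs from the batch-1 case
```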
Export the checkpoint to ONNX:

```shell
python tools/export_onnx.py --checkpoint <path to checkpoint> --onnx <path to save onnx model> --input_size <dpt input size> --encoder <dpt encoder> --batch <batch size> [--dynamic_batch] [--metric] [--max_depth <max depth>]
```

Then convert the ONNX model to a TensorRT engine:

```shell
python tools/onnx2trt.py --onnx <path to onnx model> --engine <path to save trt engine> [--fp16]
```

You can also enable a dynamic batch size for the TensorRT engine (if you want a dynamic batch size here, remember to also enable it in the previous ONNX conversion step):
```shell
python tools/onnx2trt.py --onnx <path to onnx model> --engine <path to save trt engine> [--fp16] --dynamic_batch --min_batch <minimum batch size> --max_batch <maximum batch size> --opt_batch <optimum batch size>
```

Try decreasing `--max_batch` if the engine build fails (possibly due to an out-of-memory error).
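Under the hood, a dynamic-batch engine is built by attaching a TensorRT optimization profile that spans the min/opt/max batch sizes. A minimal sketch of that conversion, assuming the TensorRT 10 Python API (the function and argument names here are assumptions for illustration, not the repo's actual code; see `tools/onnx2trt.py` for the real implementation):

```python
def profile_shapes(min_batch, opt_batch, max_batch, input_size=518):
    """(min, opt, max) NCHW input shapes for a dynamic-batch profile."""
    chw = (3, input_size, input_size)
    return ((min_batch, *chw), (opt_batch, *chw), (max_batch, *chw))

def build_engine(onnx_path, engine_path,
                 min_batch=1, opt_batch=4, max_batch=8, fp16=False):
    import tensorrt as trt  # lazy import: profile_shapes() works without TensorRT
    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    network = builder.create_network(0)  # explicit batch (the default in TensorRT 10)
    parser = trt.OnnxParser(network, logger)
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            raise RuntimeError(str(parser.get_error(0)))
    config = builder.create_builder_config()
    if fp16:
        config.set_flag(trt.BuilderFlag.FP16)
    # The optimization profile is what makes the batch dimension dynamic:
    # TensorRT tunes kernels for opt_batch but accepts min_batch..max_batch.
    profile = builder.create_optimization_profile()
    mn, op, mx = profile_shapes(min_batch, opt_batch, max_batch)
    profile.set_shape(network.get_input(0).name, mn, op, mx)
    config.add_optimization_profile(profile)
    with open(engine_path, "wb") as f:
        f.write(builder.build_serialized_network(network, config))
```

A larger `max_batch` enlarges the activation memory TensorRT has to plan for, which is why lowering it can fix out-of-memory failures at build time.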
After converting the model to TensorRT, you can use the engine file for inference.
For a single image, use:

```shell
python infer.py --img <path to image> --engine <path to trt engine> [--grayscale]
```

For a video, use:
```shell
python infer_video.py --video <path to video> --engine <path to trt engine> [--batch <batch size>] [--grayscale]
```

You can also use a webcam for real-time inference:
```shell
python infer_webcam.py --webcam <path to video> --engine <path to trt engine> [--grayscale]
```
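Whichever script you use, the work around the engine is the same: normalize the frame into an NCHW float tensor, run the engine, and rescale the raw depth map for display. A NumPy sketch of that pre/post-processing (the ImageNet mean/std constants are an assumption based on typical DPT-style pipelines, and `to_grayscale` is a hypothetical helper; check `infer.py` for the scripts' actual pipeline):

```python
import numpy as np

# Assumed normalization constants (standard ImageNet mean/std).
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def preprocess(img_hwc_uint8):
    """HWC uint8 RGB frame -> (1, 3, H, W) float32 engine input."""
    x = img_hwc_uint8.astype(np.float32) / 255.0
    x = (x - MEAN) / STD
    return x.transpose(2, 0, 1)[None]

def to_grayscale(depth):
    """Rescale a raw depth map to a 0-255 uint8 grayscale visualization."""
    d = depth - depth.min()
    d = d / max(d.max(), 1e-8)   # guard against a constant depth map
    return (d * 255.0).astype(np.uint8)
```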