Hi,

Thanks for the great example code!

I'm trying to compare speed between the native models, TF-TRT FP32, and TF-TRT INT8 using tftrt_resnet_example.py. The numbers below are prediction times for ResNet50 on a GTX 1080 Ti.

I find that for the same method, if I add a second inference call, the second call is much faster, since the first one has to load some functions. So I list both times below (first call -> second call).
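Roughly, I measure it like this (a simplified sketch of my timing loop; `fake_infer` is just a stand-in for the real `predict` call, and the numbers it produces are artificial):

```python
import time

def benchmark(infer, batch, warmup_runs=1, timed_runs=10):
    """Time an inference callable, excluding warm-up calls.

    The first call typically pays one-time setup costs (graph/session
    initialization, kernel loading), so it is not averaged in.
    """
    for _ in range(warmup_runs):
        infer(batch)  # first call absorbs the one-time setup cost
    start = time.perf_counter()
    for _ in range(timed_runs):
        infer(batch)
    return (time.perf_counter() - start) / timed_runs

# Toy stand-in for a model: the first call is artificially slow.
state = {"warm": False}

def fake_infer(batch):
    if not state["warm"]:
        time.sleep(0.05)  # simulate one-time initialization
        state["warm"] = True

avg = benchmark(fake_infer, batch=None)
print(avg < 0.05)  # warm-up cost is excluded from the average
```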
Batch size 128:
Keras: 2.9s -> 0.375s
TensorFlow: 1.4s -> 0.297s
TF-TRT FP32: 2.5s -> 0.172s
TF-TRT INT8: 4.3s, but the second inference fails with:
python3: helpers.cpp:56: nvinfer1::DimsCHW nvinfer1::getCHW(const nvinfer1::Dims&): Assertion `d.nbDims >= 3' failed.
Batch size 1:
Keras: 2.5s -> 0.011s
TensorFlow: 0.87s -> 0.007s
TF-TRT FP32: 1.98s -> 0.004s
TF-TRT INT8: fails immediately with the same error:
python3: helpers.cpp:56: nvinfer1::DimsCHW nvinfer1::getCHW(const nvinfer1::Dims&): Assertion `d.nbDims >= 3' failed.

So INT8 does not work for me at all. Do you know how I should modify the code to use INT8?
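For reference, my understanding is that the TF 1.x TF-TRT INT8 path needs an explicit calibration step before the graph can be used for inference, which may be what I'm missing. A sketch of what I've been trying, assuming the `tf.contrib.tensorrt` API (the names `frozen_graph`, `output_names`, the input/output tensor names, and `calibration_batches` are placeholders for whatever the example script actually uses):

```python
import tensorflow as tf
import tensorflow.contrib.tensorrt as trt

# Step 1: build a calibration graph with precision_mode="INT8".
# This graph is NOT the final inference graph; it collects activation
# statistics when representative data is run through it.
calib_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph,       # frozen ResNet50 GraphDef
    outputs=output_names,               # e.g. the softmax node name
    max_batch_size=128,                 # must cover the batch size used later
    max_workspace_size_bytes=1 << 30,
    precision_mode="INT8")

# Step 2: run a few representative batches through the calibration graph.
with tf.Session(graph=tf.Graph()) as sess:
    tf.import_graph_def(calib_graph, name="")
    inp = sess.graph.get_tensor_by_name("input_1:0")        # placeholder name
    out = sess.graph.get_tensor_by_name(output_names[0] + ":0")
    for batch in calibration_batches:   # real input data, same shape as inference
        sess.run(out, feed_dict={inp: batch})

# Step 3: convert the calibrated graph into the actual INT8 inference graph.
int8_graph = trt.calib_graph_to_infer_graph(calib_graph)
```

If the example script skips steps 2-3 and runs inference directly on the graph returned with `precision_mode="INT8"`, that could explain why INT8 fails where FP32 works, but I'm not certain that's the cause of this particular assertion.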