feat: better dynamic axes #28
base: main
Conversation
… refactoring for new datatypes
…ike. Now need to update engine.cpp logic
…ill need to finish the output tensor part of infer
freeman94 left a comment
Mostly notes for future PRs
    pkgs-unstable = import nixpkgs-unstable {
      inherit system;
      config.allowUnfree = true;
Prefer to keep on a stable release
    struct TensorInfo {
        name: String,
        dims: Vec<u32>,
        shape: Vec<i64>, // -1 for dynamic dimensions
It would be nice if we could convert this into an enum at some point, rather than having to check for -1 and then interpret the min/max/opt fields.
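A minimal sketch of that enum idea (hypothetical names, not the current libinfer types): a dimension is either fixed or dynamic, and a dynamic dimension carries its min/opt/max bounds directly instead of a -1 sentinel plus side-channel fields.

```rust
// Sketch only: replaces the `-1` sentinel in `shape: Vec<i64>` with a
// self-describing enum. `Dim`, `Fixed`, and `Dynamic` are assumed names.
#[derive(Debug, Clone, PartialEq)]
enum Dim {
    Fixed(i64),
    Dynamic { min: i64, opt: i64, max: i64 },
}

impl Dim {
    /// Resolve a concrete size for this dimension: a fixed dim must match
    /// exactly; a dynamic dim accepts anything within [min, max].
    fn resolve(&self, requested: i64) -> Option<i64> {
        match self {
            Dim::Fixed(n) => (*n == requested).then_some(*n),
            Dim::Dynamic { min, max, .. } => {
                (*min <= requested && requested <= *max).then_some(requested)
            }
        }
    }
}

fn main() {
    let batch = Dim::Dynamic { min: 1, opt: 8, max: 32 };
    assert_eq!(batch.resolve(16), Some(16));
    assert_eq!(batch.resolve(64), None);
    assert_eq!(Dim::Fixed(3).resolve(3), Some(3));
}
```

Callers then match on the variant instead of comparing against -1, and the optimization-profile bounds travel with the dimension itself.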
        name: String,
        data: Vec<u8>,
        shape: Vec<i64>, // this should always be positive, just i64 for convenience
        dtype: TensorDataType,
This could also be genericized by dtype so that the data Vec is appropriately cast without the user having to do so.
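One way that genericization could look (a sketch under assumed names, not the current libinfer API): make the tensor generic over its element type so the data comes back as a typed `Vec<T>` rather than raw bytes the caller must reinterpret.

```rust
// Sketch only: `Tensor<T>` is a hypothetical replacement for the
// byte-buffer struct above; the dtype is encoded in the type parameter.
#[derive(Debug)]
struct Tensor<T> {
    name: String,
    data: Vec<T>,
    shape: Vec<i64>,
}

impl<T: Copy> Tensor<T> {
    fn new(name: &str, data: Vec<T>, shape: Vec<i64>) -> Self {
        // Sanity-check that the shape describes exactly `data.len()` elements.
        let expected: i64 = shape.iter().product();
        assert_eq!(expected as usize, data.len(), "shape/data length mismatch");
        Self { name: name.into(), data, shape }
    }

    /// Typed view of the data; no manual byte casting needed.
    fn as_slice(&self) -> &[T] {
        &self.data
    }
}

fn main() {
    let t: Tensor<f32> = Tensor::new("boxes", vec![0.0; 6], vec![1, 2, 3]);
    assert_eq!(t.as_slice().len(), 6);
    assert_eq!(t.shape, vec![1, 2, 3]);
}
```

The trade-off is that a generic tensor pushes the dtype decision to compile time, so a runtime `TensorDataType` tag (or an enum over `Tensor<f32>`, `Tensor<u8>`, etc.) would still be needed wherever the engine reports output types dynamically.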
    // ASSUMPTION: we always use optimization profile 0
    // set the optimization profile to 0 so we can query output shapes after setting input shapes
    mContext->setOptimizationProfileAsync(0, stream);
Probably want to make this configurable in the future.
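A minimal sketch of making that configurable (assumed names, not the current API): thread the profile index through an options struct that defaults to 0, rather than hard-coding it at the call site.

```rust
// Sketch only: `EngineOptions` and `optimization_profile` are hypothetical.
#[derive(Debug, Clone)]
struct EngineOptions {
    /// TensorRT optimization profile to activate on load; 0 preserves
    /// today's hard-coded behavior.
    optimization_profile: u32,
}

impl Default for EngineOptions {
    fn default() -> Self {
        Self { optimization_profile: 0 }
    }
}

fn main() {
    // Default matches the current assumption...
    assert_eq!(EngineOptions::default().optimization_profile, 0);
    // ...but callers with multi-profile engines could override it.
    let custom = EngineOptions { optimization_profile: 2 };
    assert_eq!(custom.optimization_profile, 2);
}
```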
…d only effect it on load and unload of the model
Description
Added support for any number of dynamic axes on any dimension of a tensor; it is no longer limited to the first dimension.
Testing
All the examples run.
Also integrated into a Saronic downstream program, where it works.
Notes
Finished adding support for dynamic axes in libinfer, wherever and whenever. Some benchmark comparisons:
- libinfer 0.0.4, DETR, benchmark.rs
- libinfer 0.0.5 (dynamic axes of death), DETR, benchmark.rs
- libinfer 0.0.4, yolov8
- libinfer 0.0.5, yolov8
I also found a major performance bug, introduced in 0.0.4, where we were prematurely synchronizing the CUDA stream. Removing that sync gives a pretty massive performance improvement on larger models. Strangely, I am getting better performance on the new tracker-trained DETR model than on yolov8; the DETR model is quite a bit larger and has two transformers, so I am surprised. Not complaining, though: this is nearly a 2x performance improvement.
We are still I/O bound on f32 output tensors. Will save that for 0.0.6.