[Question] Techniques to optimize model load/inference execution time #7

@ShantanuNair

Description

Great work all, been keeping an eye on what you all are doing!

Curious what techniques, if any, are applied to the original model/script to get it working well in the banana environment.
That is: as a developer adding a new model to be deployed on banana, what could I do to the original inference script/REST endpoint to increase or aid banana's ability to improve model execution and model load times? If I try to run an ONNX model, will that instead slow things down? I understand banana does some additional work on our inference code, but understanding which parts it can and cannot speed up (or slow down) would help us know how, and which, solutions can be customized to run well on it.

Thanks!
