@trivoldus28
Contributor

For TensorRT to work, we need at least 2 features.

  1. A semaphore for TRT compilation. This prevents multiple processes from compiling at the same time and running out of memory (OOM).

  2. A caching mechanism for compiled models. First, this removes the need for each process to perform its own compilation, saving 1-2 minutes per process. Second, it speeds up subsequent runs.
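
A minimal sketch of how the two features could fit together, not the implementation in this PR. It assumes a shared cache directory on a POSIX filesystem, uses an exclusive `fcntl.flock` file lock as a one-slot semaphore, and treats the compiled engine as opaque bytes. The names `cache_key`, `get_or_compile`, and `compile_fn` are hypothetical, and `compile_fn` stands in for whatever call actually runs the TensorRT build.

```python
import fcntl
import hashlib
import os
import tempfile
from pathlib import Path
from typing import Callable


def cache_key(model_bytes: bytes, config: str) -> str:
    """Hypothetical key: hash of the model weights plus the build settings."""
    h = hashlib.sha256()
    h.update(model_bytes)
    h.update(config.encode("utf-8"))
    return h.hexdigest()


def get_or_compile(cache_dir: str, key: str, compile_fn: Callable[[], bytes]) -> bytes:
    """Return the cached engine for `key`, compiling it at most once.

    `compile_fn` is an assumed placeholder for the real TensorRT build step.
    """
    cache = Path(cache_dir)
    cache.mkdir(parents=True, exist_ok=True)
    engine_path = cache / f"{key}.engine"
    lock_path = cache / "compile.lock"

    # Fast path: another process (or an earlier run) already compiled it.
    if engine_path.exists():
        return engine_path.read_bytes()

    with open(lock_path, "w") as lock_file:
        # Blocks until no other process holds the lock, so only one
        # compilation runs at a time.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            # Re-check: the engine may have been written while we waited.
            if engine_path.exists():
                return engine_path.read_bytes()
            engine = compile_fn()
            # Write to a temp file, then rename atomically so readers on the
            # fast path never see a partially written engine.
            fd, tmp_name = tempfile.mkstemp(dir=cache, suffix=".tmp")
            with os.fdopen(fd, "wb") as tmp:
                tmp.write(engine)
            os.replace(tmp_name, engine_path)
            return engine
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

With this layout, the first process to take the lock pays the compilation cost, and every process that was waiting finds the cached engine on the re-check. A single lock serializes all compilations. Allowing N concurrent compilations would need a counting semaphore instead.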

@trivoldus28 trivoldus28 requested a review from dodamih January 5, 2026 04:46
@trivoldus28 trivoldus28 marked this pull request as draft January 5, 2026 04:46