Conversation

@maedtb (Contributor) commented Jul 13, 2024


- Adds a new thread-safe Python class `TrTErrorRecorder` which implements the TensorRT `IErrorRecorder` interface (a rough sketch follows this list). This class captures errors to display to the user, and can optionally terminate TensorRT processing when errors occur.
- We now set the `error_recorder` field on the TensorRT `tensorrt.Builder` and `tensorrt.Runtime` classes to an instance of `TrTErrorRecorder`.
- We now check for errors while initializing TensorRT engines, raising exceptions if TensorRT reports any errors to us.
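
For reference, a minimal sketch of the shape of such a recorder (not the exact code in this PR): it subclasses `trt.IErrorRecorder` from the TensorRT Python bindings and guards a shared error list with a lock for thread safety.

```python
import threading

import tensorrt as trt


class TrTErrorRecorder(trt.IErrorRecorder):
    """Thread-safe collector for errors TensorRT reports through IErrorRecorder."""

    def __init__(self):
        trt.IErrorRecorder.__init__(self)  # required when subclassing TRT interfaces
        self._lock = threading.Lock()
        self._errors = []  # (trt.ErrorCode, str) pairs, in the order reported

    def report_error(self, code, desc):
        # TensorRT calls this from its own threads whenever an error occurs.
        with self._lock:
            self._errors.append((code, desc))
        # Returning True tells TensorRT the error is fatal and processing
        # should stop -- the "optionally terminate" behavior described above.
        return True

    def num_errors(self):
        with self._lock:
            return len(self._errors)

    def get_error_code(self, idx):
        with self._lock:
            return self._errors[idx][0]

    def get_error_desc(self, idx):
        with self._lock:
            return self._errors[idx][1]

    def has_overflowed(self):
        # Backed by an unbounded Python list, so errors are never dropped.
        return False

    def clear(self):
        with self._lock:
            self._errors.clear()
```
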
@maedtb (Contributor, Author) commented Jul 13, 2024

I ran into several errors using TensorRT, and it was unclear why these errors were happening. Adding the TensorRT error recorder cleared up the issues for me entirely. The error messages TensorRT reports to us look something like this:

```
[defaultAllocator.cpp::allocate::31] Error Code 1: Cuda Runtime (out of memory)
[executionContext.cpp::ExecutionContext::565] Error Code 2: OutOfMemory (Requested size was 30152807424 bytes.)
```

It's not the most user-friendly, but it's a lot more helpful than `'NoneType' object has no attribute 'set_input_shape'` :^)
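
If you want to reproduce this pattern yourself, the wiring is roughly this (a sketch, not the PR's exact code; `serialized_engine` is a placeholder for your engine bytes, and `TrTErrorRecorder` refers to the sketch above):

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
runtime = trt.Runtime(logger)
recorder = TrTErrorRecorder()
runtime.error_recorder = recorder  # the field this PR sets

# If TensorRT recorded any errors during deserialization, surface them
# as a Python exception instead of failing later with a cryptic NoneType.
engine = runtime.deserialize_cuda_engine(serialized_engine)
if recorder.num_errors() > 0:
    descs = [recorder.get_error_desc(i) for i in range(recorder.num_errors())]
    raise RuntimeError("TensorRT reported errors:\n" + "\n".join(descs))
```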

@mcmonkey4eva commented

Tested this - it works properly for converting and generating, but an error during generation, while it logs properly, doesn't raise an exception:

[screenshot]

so it just generates a black image while spamming the console with a new error every step
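
If the check only runs at engine-initialization time, one possible shape for a per-step check would be something like this (a sketch only; `context`, `bindings`, `stream_handle`, and `recorder` are placeholders, not names from this PR):

```python
# After each inference call, consult the attached recorder and raise
# instead of silently continuing to the next step.
ok = context.execute_async_v2(bindings=bindings, stream_handle=stream_handle)
if not ok or recorder.num_errors() > 0:
    descs = [recorder.get_error_desc(i) for i in range(recorder.num_errors())]
    recorder.clear()
    raise RuntimeError("TensorRT execution failed:\n" + "\n".join(descs))
```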

@comfyanonymous (Owner) commented

```
[07/15/2024-23:30:09] [TRT] [I] [MemUsageStats] Peak memory usage during Engine building and serialization: CPU: 13075 MiB
[07/15/2024-23:30:09] [TRT] [I] Serialized 16605 bytes of code generator cache.
[07/15/2024-23:30:09] [TRT] [I] Serialized 4704604 bytes of compilation cache.
[07/15/2024-23:30:09] [TRT] [I] Serialized 2218 timing cache entries
Segmentation fault
```

This PR gives me a segfault when I try to convert SD1.5

@maedtb (Contributor, Author) commented Aug 7, 2024

Just leaving a note here that I'm planning on investigating these issues this weekend and haven't abandoned this. Going to pull it back to draft in the meantime.

@maedtb maedtb marked this pull request as draft August 7, 2024 22:56