cpu memory optimization rebased to main #3868
base: main
b9b6aeb to 51f64f0
@needs_refit  # type: ignore[misc]
def _insert_engine_to_cache(self, hash_val: str, serialized_engine: bytes) -> None:
def _insert_engine_to_cache(self, hash_val: str, engine: trt.ICudaEngine) -> None:
@zewenli98 When do these calls run? Will this conflict with the goal of keeping memory usage under 3x?
Should we do the caching in a post-processing step?
For example, we could return the cache entry as one of the Interpreter Result fields.
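A minimal sketch of the suggestion in this thread, with hypothetical names throughout (`CacheEntry`, `InterpreterResultSketch`, `build_engine`, `postprocess_cache`); none of them are the PR's or the library's actual API. The build step only returns the data needed for a cache entry, and a separate post-processing step performs the insertion, so the cache write happens outside the build.

```python
import hashlib
from typing import Dict, NamedTuple, Optional


class CacheEntry(NamedTuple):
    """Hypothetical cache payload: a key plus the bytes to store."""

    hash_val: str
    blob: bytes


class InterpreterResultSketch(NamedTuple):
    """Hypothetical stand-in for an interpreter result with an extra field."""

    serialized_engine: bytes
    cache_entry: Optional[CacheEntry] = None


def build_engine(graph_repr: str, cache_enabled: bool) -> InterpreterResultSketch:
    # Placeholder for the real build: the "engine" here is just encoded text.
    engine = graph_repr.encode("utf-8")
    entry = None
    if cache_enabled:
        # Assumption: the cache key is a hash of the graph representation.
        key = hashlib.sha256(engine).hexdigest()
        entry = CacheEntry(hash_val=key, blob=engine)
    return InterpreterResultSketch(serialized_engine=engine, cache_entry=entry)


def postprocess_cache(result: InterpreterResultSketch, cache: Dict[str, bytes]) -> None:
    # Caching runs after the build has returned, not during it.
    if result.cache_entry is not None:
        cache[result.cache_entry.hash_val] = result.cache_entry.blob
```

Whether this helps with peak memory depends on when the post-processing step runs relative to the build's own allocations, which the sketch does not model.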
py/torch_tensorrt/dynamo/partitioning/_adjacency_partitioner.py
Outdated
51f64f0 to f77df5a
The engine caching part LGTM. Just a minor comment.
4fddd81 to af732b7
((6, 7, 5, 4, 5),),
]
)
@unittest.skip("Skipping prod dim int default test for now")
Can you use xfail instead and provide a reason for the skip?
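One way to act on this request, sketched with the standard library only. With pytest the usual form is `@pytest.mark.xfail(reason="...")`; the stdlib counterpart `unittest.expectedFailure` takes no reason argument, so the reason is recorded in a comment here. The class name, test body, and reason text are placeholders, not the PR's code.

```python
import unittest


class ProdDimIntSketch(unittest.TestCase):
    # Reason (placeholder): state the real failure cause or tracking issue here.
    @unittest.expectedFailure
    def test_prod_dim_int_default(self):
        # Stand-in for the real assertion that currently fails.
        self.assertEqual(1, 2)


def run_xfail_sketch() -> unittest.TestResult:
    # Run the single test case and return the collected result object.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ProdDimIntSketch)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Unlike a plain skip, an expected failure still executes the test, so the suite reports an unexpected success once the underlying problem is fixed.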
# PLATFORM_SUPPORTS_CUDNN_ATTENTION,
# "Platform doesn't support cuDNN attention",
# )
@unittest.skip("Skipping cuDNN attention test for now")
Please provide a reason for the skip.
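A small sketch of a skip that carries its reason, assuming the commented-out lines in the hunk were a conditional skip on `PLATFORM_SUPPORTS_CUDNN_ATTENTION`. The flag is defined locally as a hypothetical stand-in, and the class and test body are placeholders.

```python
import unittest

# Hypothetical flag; the real suite would import a platform capability check.
PLATFORM_SUPPORTS_CUDNN_ATTENTION = False


class CudnnAttentionSketch(unittest.TestCase):
    @unittest.skipIf(
        not PLATFORM_SUPPORTS_CUDNN_ATTENTION,
        "Platform doesn't support cuDNN attention",
    )
    def test_cudnn_attention(self):
        # Only reached when the flag above is True.
        self.fail("placeholder body; replace with the real test")


def run_skip_sketch() -> unittest.TestResult:
    # Run the single test case and return the collected result object.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(CudnnAttentionSketch)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

The reason string then appears in the test report next to the skipped test, which a bare "for now" message does not explain.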
af732b7 to 65565fd
Description
Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.
Fixes # (issue)
Type of change
Please delete options that are not relevant and/or add your own.
Checklist: