Roofline quantized conv3d/2d layer #3419
Conversation
```python
    and not is_sm_at_least_100()
)

if skip_conv_benchmarks:
```
I feel the conditions are a bit convoluted here. Maybe:

```python
if do_benchmarks:
    if op_name in ("conv2d", "conv3d") and not is_sm_at_least_100():
        # print warning
    else:
        # can also move this part to a function to make it clearer
        ...
```
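A runnable version of that restructure might look like the sketch below. This is an illustration only: the `run_benchmarks` helper is hypothetical, and the import path for `is_sm_at_least_100` is an assumption, not necessarily the PR's actual code.

```python
# Sketch only: assumes is_sm_at_least_100 lives in torchao.utils and that a
# hypothetical run_benchmarks helper wraps the existing benchmark body.
from torchao.utils import is_sm_at_least_100


def run_benchmarks(op_name: str) -> None:
    ...  # placeholder for the existing benchmark body


def maybe_run_benchmarks(op_name: str, do_benchmarks: bool) -> None:
    if not do_benchmarks:
        return
    if op_name in ("conv2d", "conv3d") and not is_sm_at_least_100():
        # warn instead of silently skipping conv benchmarks on older GPUs
        print(f"skipping {op_name} benchmarks: requires SM 10.0 or newer")
        return
    run_benchmarks(op_name)
```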
```python
r_speedup = None
# use roofline model to estimate gemm time using equivalent GEMM dims
r_bf16_gemm_time_s = float(
    bf16_gemm_time_sympy.subs(M, gemm_M).subs(K, gemm_K).subs(N, gemm_N)
```
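For context, the `.subs` chain above evaluates a symbolic roofline expression at concrete GEMM dims. A toy, self-contained example of the pattern; the cost formula and numbers here are made up for illustration:

```python
import sympy

M, K, N = sympy.symbols("M K N")
# Toy compute-bound time model: 2*M*K*N FLOPs at 1e15 FLOP/s (illustrative only).
bf16_gemm_time_sympy = 2 * M * K * N / 1e15

gemm_M, gemm_K, gemm_N = 4096, 1024, 2048
r_bf16_gemm_time_s = float(
    bf16_gemm_time_sympy.subs(M, gemm_M).subs(K, gemm_K).subs(N, gemm_N)
)
print(r_bf16_gemm_time_s)  # ~1.7e-5 s under this toy model
```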
Are the memory operations of conv the same as for linear as well?
ao/torchao/testing/training/roofline_utils.py, line 332 in 0975a40:

```python
mem_gemm_time_s = (
```
As conv is an implicit GEMM, I'm assuming the memory operations for gemm and conv should be the same.
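For reference, the standard implicit-GEMM dimension mapping, which is presumably what the equivalent GEMM dims above encode. The NCDHW layout and variable names here are assumptions for illustration, not taken from the PR:

```python
# Sketch of the usual implicit-GEMM dimension mapping for conv3d (NCDHW layout).
def conv3d_equivalent_gemm_dims(n, c_out, c_in, d_out, h_out, w_out, k_d, k_h, k_w):
    gemm_M = n * d_out * h_out * w_out  # one GEMM row per output position
    gemm_K = c_in * k_d * k_h * k_w     # reduction over each input patch
    gemm_N = c_out                      # one GEMM column per output channel
    return gemm_M, gemm_K, gemm_N


# e.g. batch 8, 128 out channels, 64 in channels, 16x56x56 output, 3x3x3 kernel
print(conv3d_equivalent_gemm_dims(8, 128, 64, 16, 56, 56, 3, 3, 3))
```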
```python
    print()
else:
    # TODO: enable roofline analysis for conv
    pass
```
If we share the same logic, we should remove this if/else branch and inline the code from the `if` branch here, I think.
I kept the if/else just in case of an unexpected op_name. We can either make the code verify op_name at the beginning to avoid any errors, or assume the input for op_name will always be correct.
We already verify that at L264, I think.
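For what it's worth, the up-front check being discussed could be as small as the following; the accepted set of op names is an assumption based on this PR's scope:

```python
# Hypothetical early validation of op_name; the valid set is assumed.
VALID_OP_NAMES = ("linear", "conv2d", "conv3d")


def validate_op_name(op_name: str) -> None:
    if op_name not in VALID_OP_NAMES:
        raise ValueError(
            f"unexpected op_name {op_name!r}, expected one of {VALID_OP_NAMES}"
        )
```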
```python
# real gemm benchmark time, also not added yet
# if enabled, also measured observed gemm time
# gemm benchmarks for conv not implemented, as conv uses implicit GEMM
```
We should run the conv ops, I think?
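If conv benchmarks do get added, one way to time the underlying op directly is `torch.utils.benchmark`; the shapes and dtype below are illustrative, not taken from the PR:

```python
# Sketch: timing a bf16 conv3d with torch.utils.benchmark (CUDA assumed).
import torch
import torch.nn.functional as F
from torch.utils import benchmark

x = torch.randn(8, 64, 16, 56, 56, device="cuda", dtype=torch.bfloat16)
w = torch.randn(128, 64, 3, 3, 3, device="cuda", dtype=torch.bfloat16)

timer = benchmark.Timer(
    stmt="F.conv3d(x, w, padding=1)",
    globals={"F": F, "x": x, "w": w},
)
print(timer.timeit(100))  # Measurement object with per-run timing stats
```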