tensors and eisum called in C++ function on already permuted tensors.
implementations of the BWD kernel when the BWD pass is upsampling.
…owers of 2 and remainder
This MR streamlines the DISCO and attention kernels. Both now use the channels-last format for improved performance.
Furthermore, more tests were added for various convolution tensor shapes, along with integrity tests that check compatibility with the new kernel.
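For readers unfamiliar with the term, here is a minimal sketch of what "channels last" means for memory layout in general. It is plain Python, not code from this MR; the helper names `contiguous_strides` and `channels_last_strides` are hypothetical and only illustrate how the element strides of a logical (N, C, H, W) tensor change when the channel dimension is stored innermost.

```python
def contiguous_strides(shape):
    """Element strides of a row-major (C-contiguous) array with the given shape."""
    strides = []
    step = 1
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


def channels_last_strides(n, c, h, w):
    """Strides in logical (N, C, H, W) order when memory is laid out as (N, H, W, C)."""
    sn, sh, sw, sc = contiguous_strides((n, h, w, c))
    return (sn, sc, sh, sw)


shape = (2, 3, 4, 5)  # illustrative sizes only: N=2, C=3, H=4, W=5
nchw = contiguous_strides(shape)
nhwc = channels_last_strides(*shape)

print("channels first:", nchw)
print("channels last: ", nhwc)
```

With channels last, the channel stride is 1, so all channels of one spatial location sit next to each other in memory. That is the usual reason kernels that reduce or vectorize over channels tend to prefer this layout.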
WARNING: this MR breaks backward compatibility.
We changed the normalization of the piecewise linear basis functions. Models that use those basis functions with the mean normalization mode need to be retrained. The mean now includes zero-support values, which is more in line with what is done for Zernike and Morlet. Previously, zero-support values were excluded from the mean in the normalization.
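To make the effect of the change concrete, here is a toy sketch in plain Python. It is not the library's normalization code; the function `support_mean` and the sample values are made up for illustration, and the sketch only shows how the mean changes depending on whether zero-support entries are counted.

```python
def support_mean(values, include_zero_support):
    """Mean of `values`, optionally counting only the non-zero (supported) entries."""
    if include_zero_support:
        count = len(values)
    else:
        count = sum(1 for v in values if v != 0.0)
    return sum(values) / count


# Hypothetical basis function sampled on 8 points, non-zero on 4 of them.
basis = [0.0, 0.0, 0.5, 1.5, 1.5, 0.5, 0.0, 0.0]

old_mean = support_mean(basis, include_zero_support=False)  # previous behaviour
new_mean = support_mean(basis, include_zero_support=True)   # behaviour after this MR

print("mean excluding zero support:", old_mean)
print("mean including zero support:", new_mean)
```

In this toy case the two means differ by a factor of two, so anything scaled by the mean comes out at a different magnitude. Under that assumption, weights trained against the old scaling would no longer match the basis, which is why retraining is needed.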
Holding off on merging until we decide how to handle 128-bit atomics.