Conversation

@giuseppe
Contributor

Mostly mechanical changes, except ROUND, which doesn't map directly to Vulkan because there is no equivalent rounding mode (at least I didn't manage to find one).

@giuseppe giuseppe requested a review from 0cc4m as a code owner November 17, 2025 11:07
@giuseppe giuseppe force-pushed the add-more-ops-vulkan branch from 67689fd to e59509c Compare November 17, 2025 13:04
@giuseppe giuseppe changed the title from "vulkan: implement ADD1, ARANGE, FILL, SOFTPLUS, STEP, ROUND, CEIL, FLOOR" to "vulkan: implement ADD1, ARANGE, FILL, SOFTPLUS, STEP, ROUND, CEIL, FLOOR, TRUNC" Nov 17, 2025
@github-actions github-actions bot added the documentation, Vulkan, and ggml labels Nov 17, 2025
CREATE_UNARY_RTE(exp)
#undef CREATE_UNARY_RTE

ggml_vk_create_pipeline(device, device->pipeline_add1_f16_f16, "add1_f16_f16", add1_f16_f16_len, add1_f16_f16_data, "main", 3, sizeof(vk_op_binary_push_constants), {256, 1, 1}, {}, 1);
Collaborator

Each invocation does 2 iterations and the WG has 256 threads, so I think this should be 512?
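
A minimal sketch of the arithmetic behind this comment, assuming the `{256, 1, 1}` argument is the number of elements one workgroup is expected to cover and is used as a divisor when sizing the dispatch. The helper names (`ceil_div`, `workgroups_for`) and the constants are hypothetical and not taken from the PR.

```cpp
#include <cstdint>

// Round-up integer division.
constexpr uint32_t ceil_div(uint32_t a, uint32_t b) { return (a + b - 1) / b; }

// Values taken from the review comment: 256 threads, 2 iterations each.
constexpr uint32_t kThreadsPerWorkgroup  = 256;
constexpr uint32_t kIterationsPerThread  = 2;
constexpr uint32_t kElementsPerWorkgroup = kThreadsPerWorkgroup * kIterationsPerThread;

// Number of workgroups launched for `ne` elements when each workgroup
// is assumed to cover `denom` elements (hypothetical helper).
constexpr uint32_t workgroups_for(uint32_t ne, uint32_t denom) {
    return ceil_div(ne, denom);
}
```

Under these assumptions, 4096 elements need 16 workgroups with a divisor of 256 but only 8 with a divisor of 512, so the smaller divisor launches twice as many workgroups as the shader needs.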

layout (binding = 0) writeonly buffer D {D_TYPE data_d[];};

void main() {
    const uint i = gl_GlobalInvocationID.z * 262144 + gl_GlobalInvocationID.y * 512 + gl_GlobalInvocationID.x;
Collaborator

I think these index calculations rely on this logic in ggml_vk_op_f32, but arange doesn't go through ggml_vk_op_f32:

            if (ne > 262144) {
                elements = { 512, 512, CEIL_DIV(ne, 262144) };
            } else if (ne > 512) {
                elements = { 512, CEIL_DIV(ne, 512), 1 };
            } else {
                elements = { ne, 1, 1 };
            }
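
To illustrate why the shader index and the host-side split have to agree, here is a hedged C++ sketch. `split_elements` copies the branch quoted above and `flat_index` copies the shader expression; `covers_all` and all function names are hypothetical. The sketch treats each grid cell as one invocation and ignores workgroup granularity.

```cpp
#include <array>
#include <cstdint>

// Round-up integer division.
constexpr uint32_t ceil_div(uint32_t a, uint32_t b) { return (a + b - 1) / b; }

// Host side: split ne elements into up to three dispatch dimensions,
// following the quoted logic.
inline std::array<uint32_t, 3> split_elements(uint32_t ne) {
    if (ne > 262144) {
        return {512, 512, ceil_div(ne, 262144)};
    } else if (ne > 512) {
        return {512, ceil_div(ne, 512), 1};
    }
    return {ne, 1, 1};
}

// Shader side: rebuild the flat element index from the 3D invocation ID.
constexpr uint32_t flat_index(uint32_t x, uint32_t y, uint32_t z) {
    return z * 262144 + y * 512 + x;
}

// Walk the whole grid and count how many in-range indices are produced.
// Each (x, y, z) maps to a distinct index, so a count of ne means
// every element in [0, ne) is visited exactly once.
inline bool covers_all(uint32_t ne) {
    const std::array<uint32_t, 3> e = split_elements(ne);
    uint64_t covered = 0;
    for (uint32_t z = 0; z < e[2]; ++z) {
        for (uint32_t y = 0; y < e[1]; ++y) {
            for (uint32_t x = 0; x < e[0]; ++x) {
                if (flat_index(x, y, z) < ne) {
                    ++covered;
                }
            }
        }
    }
    return covered == ne;
}
```

The index expression is only meaningful when the dispatch was sized with the same 512 and 262144 strides, which is the dependency the comment points out for ops that take a different dispatch path.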

@giuseppe
Contributor Author

@jeffbolznv thanks for the review. Addressed and pushed a new version

layout (binding = 0) writeonly buffer D {D_TYPE data_d[];};

void main() {
    const uint i = gl_GlobalInvocationID.z * 262144 + gl_GlobalInvocationID.y * 512 + gl_GlobalInvocationID.x;
Collaborator

I think this one has the same issue.

Contributor Author

fixed as well! Thanks

@giuseppe giuseppe force-pushed the add-more-ops-vulkan branch from 26eeb8a to d7df09c Compare November 17, 2025 20:17
Collaborator

@jeffbolznv jeffbolznv left a comment

The changes look good to me now. Do you think it makes sense to add a few larger test cases for these to catch those bugs?

@giuseppe
Contributor Author

> The changes look good to me now. Do you think it makes sense to add a few larger test cases for these to catch those bugs?

Do you have any suggestions on what to add? I've tried a few new cases with more elements, but they don't fail with the older version of the PR.

@jeffbolznv
Collaborator

Nothing specific, just that they would need to be more than 256k elements.
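
A small sketch of why the threshold matters, assuming the 512 and 262144 limits from the dispatch logic quoted earlier in the thread (262144 = 512 * 512, i.e. 256k). The enum, the helper names, and the example shape are hypothetical and only illustrate picking a size that reaches the third dispatch dimension.

```cpp
#include <cstdint>

// Which dispatch layout a given element count would select.
enum class DispatchBranch { Flat, TwoD, ThreeD };

constexpr DispatchBranch branch_for(uint64_t ne) {
    if (ne > 262144) {
        return DispatchBranch::ThreeD;  // z dimension becomes larger than 1
    }
    if (ne > 512) {
        return DispatchBranch::TwoD;    // y dimension becomes larger than 1
    }
    return DispatchBranch::Flat;        // everything fits in x
}

// Total element count of a 4D tensor shape (hypothetical helper).
constexpr uint64_t element_count(uint64_t ne0, uint64_t ne1, uint64_t ne2, uint64_t ne3) {
    return ne0 * ne1 * ne2 * ne3;
}
```

Exactly 262144 elements still selects the two-dimensional layout, so a test has to exceed that count, for example a 1024 x 257 shape with 263168 elements, before the `z` term of the index is exercised.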

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
@giuseppe giuseppe force-pushed the add-more-ops-vulkan branch from d7df09c to 2300634 Compare November 18, 2025 09:17
@giuseppe giuseppe requested a review from slaren as a code owner November 18, 2025 09:17
@giuseppe giuseppe force-pushed the add-more-ops-vulkan branch from 2300634 to 15bcb5e Compare November 18, 2025 09:26
@giuseppe
Contributor Author

> Nothing specific, just that they would need to be more than 256k elements.

added new test cases

@github-actions github-actions bot added the testing label Nov 18, 2025