-
Notifications
You must be signed in to change notification settings - Fork 13.8k
Vulkan: Add GGML_OP_GET_REL_POS
#17417
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
* Introduced new Vulkan pipeline for get_rel_pos operation for both float and half-precision types. * Implemented the get_rel_pos compute shader to calculate relative positions based on input tensor dimensions. * Updated shader generation to include new get_rel_pos variants.
* Added a small epsilon to the position calculation in the get_rel_pos compute shader to mitigate floating point precision issues. This change ensures more accurate results when computing relative positions based on input tensor dimensions.
ggml/src/ggml-vulkan/ggml-vulkan.cpp
Outdated
| vk_op_unary_push_constants pc = vk_op_unary_push_constants_init(src0, dst, ggml_nelements(dst)); | ||
| init_pushconst_fastdiv(pc); | ||
|
|
||
| std::array<uint32_t, 3> elements; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason not to use ggml_vk_op_f32?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's because incontiguous input is not supported. The GGML_ASSERT check in ggml_vk_op_f32 will fail.
I think the behavior should be similar to GGML_OP_ARGSORT (while I'm not sure why GGML_OP_ARGSORT appears in ggml_vk_op_f32 but comes with a GGML_ASSERT(0)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ARGSORT used to use ggml_vk_op_f32, it just switched away from it because it needs custom logic and shader invocations for the handling of large input tensors.
I think you should be able to use ggml_vk_op_f32 for this op. Can you be more specific which assertion failed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @0cc4m, thanks for the clarification.
Previously I encountered an assertion failure for the case GET_REL_POS(type=f32, C=1, qh=1, kh=1, v=1) in ggml_vk_op_f32, specifically on this line:
ggml/src/ggml-vulkan/ggml-vulkan.cpp:8764: GGML_ASSERT(ggml_vk_op_supports_incontiguous(op) || ggml_vk_dim01_contiguous(src0)) failedI reviewed the whole process and realized that ggml_is_contiguous returned true here (src0 ne={1, 1, 1, 1}, viewed from ne={2, 1, 1, 1}). Therefore, ggml_backend_vk_device_supports_op(...) returned true, and then the check ggml_vk_dim01_contiguous(src0) later in ggml_vk_op_f32 failed.
I have updated ggml_vk_get_rel_pos to use ggml_vk_op_f32, and ggml_vk_dim01_contiguous has been added to ggml_backend_vk_device_supports_op so GET_REL_POS(type=f32,C=1,qh=1,kh=1,v=1) is no longer considered supported.
To work with incontiguous inputs (v=1), we can simply use ggml_cont.
Description
This PR is a follow-up to PR #17383 and adds support for the
GGML_OP_GET_REL_POSoperator in the Vulkan backend for both F16 and F32 data types.To verify the implementation while keeping it tidy, I patched the changes from PR #17383 onto a separate branch and ran
test-backend-ops -o GET_REL_POSfor testing. Below is a screenshot of the test-run for reference:Notes
test_get_rel_poschanges. Once that PR is merged, I will rebase this branch.posto avoid floating-point precision issues.Without the epsilon, some test cases fail on my Windows AMD GPU (but work fine on a Linux Nvidia GPU):