-
Notifications
You must be signed in to change notification settings - Fork 13.8k
CANN: supports out_prod operator for F32 and F16 #17406
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
noemotiovon
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just a minor issue.
ggml/src/ggml-cann/aclnn_ops.cpp
Outdated
|
|
||
| const int64_t i12 = i2; | ||
| const int64_t i13 = i3; | ||
| aclTensor *accumulator = ggml_cann_create_tensor( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The result of ggml_cann_create_tensor should be acl_tensor_ptr, not aclTensor*.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Edited
ggml/src/ggml-cann/aclnn_ops.h
Outdated
| */ | ||
| void ggml_cann_out_prod(ggml_backend_cann_context & ctx, ggml_tensor * dst); | ||
|
|
||
| void ggml_cann_out_prod_fp(ggml_backend_cann_context & ctx, ggml_tensor * dst); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Edited
ggml/src/ggml-cann/aclnn_ops.cpp
Outdated
| #include <aclnnop/aclnn_index_select.h> | ||
| #include <aclnnop/aclnn_clamp.h> | ||
| #include <aclnnop/aclnn_threshold.h> | ||
| #include <aclnnop/aclnn_ger.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should use
find ggml/src/ggml-cann -iname ".cpp" -o -iname ".h" | xargs clang-format -i
to format code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Edited
ggml/src/ggml-cann/aclnn_ops.cpp
Outdated
| #include <aclnnop/aclnn_index_select.h> | ||
| #include <aclnnop/aclnn_clamp.h> | ||
| #include <aclnnop/aclnn_threshold.h> | ||
| #include <aclnnop/aclnn_ger.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should use
find ggml/src/ggml-cann -iname ".cpp" -o -iname ".h" | xargs clang-format -i
to format code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Edited
ggml/src/ggml-cann/aclnn_ops.cpp
Outdated
| dst->nb, | ||
| 2); | ||
|
|
||
| GGML_CANN_CALL_ACLNN_OP(ctx, InplaceZero, accumulator); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, InplaceZero is being called on each iteration of the for loop. I believe we can just call it once on dst before the loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Edited
815e770 to
5d9578a
Compare
|
Thank you for your contribution! :) |
* DGX Spark: UMA support * Updates from PR feedback * More PR feedback cleanup * Update ggml/src/ggml-cuda/ggml-cuda.cu Co-authored-by: Georgi Gerganov <[email protected]> * Remove trailing whitespace * Update ggml/src/ggml-cuda/ggml-cuda.cu --------- Co-authored-by: Georgi Gerganov <[email protected]>

The CANN backend supports floating-point product calculations.