-
Notifications
You must be signed in to change notification settings - Fork 809
[SYCL] Fix kernel name mangling for CUDA/HIP with Microsoft ABI host #20973
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: sycl
Are you sure you want to change the base?
Conversation
32c9f70 to
55c155c
Compare
55c155c to
307759c
Compare
451b21b to
7445a0a
Compare
520457d to
ede11a7
Compare
82281c9 to
f2a7f1f
Compare
08ed803 to
3a82903
Compare
3a82903 to
1838f7f
Compare
In SYCL offload compilation, when the host uses Microsoft ABI and the device uses Itanium ABI (e.g., Windows host with CUDA device), kernel names were mangled differently between host and device code, causing runtime errors when the host tried to find kernels by name. This fix ensures consistent mangling by using the device's mangling context when compiling for cross-ABI scenarios. The logic is implemented in SYCL-specific code (SemaSYCL.cpp) to avoid adding SYCL-specific code to general Clang infrastructure. Removes XFAIL from Printf tests that were failing due to this issue. Fixes #14733.
|
Graph/RecordReplay/host_task_in_order_dependency.cpp failure is a known issue (#20826), not related to this PR. |
YuriPlyakhin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SYCL E2E tests change LGTM
Fznamznon
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I don't get how it helps.
Both finalizeFreeFunctionKernels and SetSYCLKernelNames are called for device only, meaning the target is device that is always itanium ABI only and host is aux target that can be microsoft OR itanium.
There is a function InitDeviceMC in clang/lib/CodeGen/CGCUDANV.cpp which seems to be doing the right thing, we probably should reuse it.
The question is, if the unxfailed tests pass with this change which seems to be not changing the mangling situation, do we need the change at all?
If I'm all wrong and the patch is actually correct due to me missing some cuda-specific details, this should have LIT tests checking the integration header content.
SYCL kernel names are generated during template instantiation using MangleContext, which produces different encodings based on target ABI. When host compilation uses Microsoft ABI and device compilation targets Itanium ABI (CUDA/HIP), this caused runtime kernel lookup failures:
No kernel named _ZTSZZ21performIncrementationENK... was found
The issue occurred because SYCL kernel name generation in SemaSYCL::SetSYCLKernelNames() and SemaSYCL::finalizeFreeFunctionKernels() used ASTContext::createMangleContext(), which creates a mangling context for the primary target. In cross-ABI scenarios (Microsoft host + Itanium device), this produced Microsoft-style mangling on the host side, while device compilation always used Itanium mangling, resulting in name mismatches.
Solution:
Added createSYCLCrossABIMangleContext() helper in SemaSYCL.cpp that detects Microsoft-to-Itanium ABI cross-compilation scenarios
Checks if primary target uses Microsoft ABI (getCXXABI().isMicrosoft()) and auxiliary target (device) uses Itanium ABI (isItaniumFamily())
When detected, uses createDeviceMangleContext(*AuxTarget) to ensure Itanium mangling on both host and device sides
When same ABI or no offload target, falls back to standard createMangleContext()
Applied in two locations: SetSYCLKernelNames() and finalizeFreeFunctionKernels()
This ensures identical kernel names across host and device compilations, allowing runtime lookup to succeed.
Fixes: #14733