@kekaczma (Contributor) commented Dec 29, 2025

SYCL kernel names are generated during template instantiation using MangleContext, which produces different encodings depending on the target ABI. When host compilation used the Microsoft ABI and device compilation targeted the Itanium ABI (CUDA/HIP), this caused runtime kernel lookup failures:

No kernel named _ZTSZZ21performIncrementationENK... was found

The issue occurred because SYCL kernel name generation in SemaSYCL::SetSYCLKernelNames() and SemaSYCL::finalizeFreeFunctionKernels() used ASTContext::createMangleContext(), which creates a mangling context for the primary target. In cross-ABI scenarios (Microsoft host + Itanium device), this produced Microsoft-style mangling on the host side, while device compilation always used Itanium mangling, resulting in name mismatches.
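
To illustrate why a mangling mismatch turns into a lookup failure, here is a minimal standalone sketch. It is not SYCL runtime code: `DeviceImage`, `findKernel` and `makeDeviceImage` are hypothetical names, and a plain function `void foo(int)` is used for brevity, whereas real SYCL kernel names encode a type (note the `_ZTS` prefix in the error above). The only point is that lookup is an exact string match, so two encodings of the same entity never meet.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a device image: kernel name -> kernel id.
using DeviceImage = std::unordered_map<std::string, int>;

// Returns the kernel id, or -1 when no kernel with that exact name exists.
int findKernel(const DeviceImage &image, const std::string &name) {
  auto it = image.find(name);
  return it == image.end() ? -1 : it->second;
}

// The same declaration, `void foo(int)`, mangled under the two ABIs.
static const char *const ItaniumName = "_Z3fooi";
static const char *const MicrosoftName = "?foo@@YAXH@Z";

// The device side always registers the Itanium spelling.
DeviceImage makeDeviceImage() {
  DeviceImage image;
  image[ItaniumName] = 0;
  return image;
}
```

Calling `findKernel(makeDeviceImage(), MicrosoftName)` returns -1 in this model, which corresponds to the "No kernel named ... was found" failure; the same call with `ItaniumName` succeeds.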

Solution:
- Added createSYCLCrossABIMangleContext() helper in SemaSYCL.cpp that detects Microsoft-to-Itanium ABI cross-compilation scenarios
- Checks if the primary target uses Microsoft ABI (getCXXABI().isMicrosoft()) and the auxiliary target (device) uses Itanium ABI (isItaniumFamily())
- When detected, uses createDeviceMangleContext(*AuxTarget) to ensure Itanium mangling on both host and device sides
- When the ABIs match or there is no offload target, falls back to standard createMangleContext()
- Applied in two locations: SetSYCLKernelNames() and finalizeFreeFunctionKernels()

This ensures identical kernel names across host and device compilations, allowing runtime lookup to succeed.
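
The selection rule described in the list above can be modelled roughly as follows. This is a simplified standalone sketch, not the code from the patch: `CXXABI`, `Target`, `Mangler` and `chooseKernelNameMangler` are hypothetical stand-ins for the Clang types and for createSYCLCrossABIMangleContext(), and it only captures the decision of which mangling scheme ends up being used.

```cpp
// Hypothetical stand-ins for the ABI families; these are not Clang types.
enum class CXXABI { Itanium, Microsoft };
enum class Mangler { Itanium, Microsoft };

struct Target {
  CXXABI abi;
};

// primary: the target currently being compiled for.
// aux: the other side of the offload pair, or nullptr when there is none.
Mangler chooseKernelNameMangler(const Target &primary, const Target *aux) {
  // Cross-ABI case: Microsoft primary target with an Itanium aux target.
  // Use the aux (device) scheme so both sides agree on kernel names.
  if (primary.abi == CXXABI::Microsoft && aux != nullptr &&
      aux->abi == CXXABI::Itanium)
    return Mangler::Itanium;

  // Same ABI on both sides, or no offload target: follow the primary target.
  return primary.abi == CXXABI::Microsoft ? Mangler::Microsoft
                                          : Mangler::Itanium;
}
```

In this model, a Microsoft primary target paired with an Itanium aux target yields Itanium mangling, while a Microsoft primary target with no aux target keeps Microsoft mangling. An Itanium primary target yields Itanium mangling regardless of the aux target.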

Fixes: #14733

@kekaczma kekaczma requested review from a team as code owners December 29, 2025 14:46
@kekaczma kekaczma requested a review from slawekptak December 29, 2025 14:46
@kekaczma kekaczma marked this pull request as draft December 30, 2025 09:59
In SYCL offload compilation, when the host uses Microsoft ABI and the
device uses Itanium ABI (e.g., Windows host with CUDA device), kernel
names were mangled differently between host and device code, causing
runtime errors when the host tried to find kernels by name.

This fix ensures consistent mangling by using the device's mangling
context when compiling for cross-ABI scenarios. The logic is implemented
in SYCL-specific code (SemaSYCL.cpp) to avoid adding SYCL-specific code
to general Clang infrastructure.

Removes XFAIL from Printf tests that were failing due to this issue.
Fixes #14733.
@kekaczma kekaczma changed the title from "[WIP][SYCL] Fix kernel name mangling for CUDA/HIP with Microsoft ABI host" to "[SYCL] Fix kernel name mangling for CUDA/HIP with Microsoft ABI host" Jan 20, 2026
@kekaczma kekaczma marked this pull request as ready for review January 20, 2026 18:28
@kekaczma kekaczma requested a review from a team as a code owner January 20, 2026 18:28
@kekaczma kekaczma marked this pull request as draft January 20, 2026 18:42
@kekaczma kekaczma marked this pull request as ready for review January 20, 2026 18:50
@kekaczma (Contributor, Author) commented

Graph/RecordReplay/host_task_in_order_dependency.cpp failure is a known issue (#20826), not related to this PR.

@YuriPlyakhin (Contributor) left a comment

SYCL E2E tests change LGTM

@Fznamznon (Contributor) left a comment

Hmm, I don't get how it helps.
Both finalizeFreeFunctionKernels and SetSYCLKernelNames are called only during device compilation, meaning the primary target is the device, which always uses the Itanium ABI, and the host is the aux target, which can be Microsoft or Itanium.

There is a function InitDeviceMC in clang/lib/CodeGen/CGCUDANV.cpp which seems to be doing the right thing, we probably should reuse it.
The question is: if the un-XFAILed tests pass with this change, which does not seem to change the mangling situation, do we need the change at all?

If I'm wrong and the patch is actually correct because I'm missing some CUDA-specific details, it should have LIT tests checking the integration header content.

Development

Successfully merging this pull request may close these issues.

Tests SYCL :: Printf/{mixed-address-space, percent-symbol.cpp} failing on CUDA && Windows
