Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Windows ARM64 support #656

Merged
merged 3 commits into from
Apr 10, 2024
Merged

Add Windows ARM64 support #656

merged 3 commits into from
Apr 10, 2024

Conversation

jblazquez
Copy link
Contributor

This PR solves a few issues when building for Windows on ARM:

  1. No detection for the Windows ARM64 architecture, resulting in no AAL being selected.
  2. A compilation error in bits.h due to the use of Intel intrinsics that don't exist on ARM.
  3. A misdetection of the ARM64EC ABI as AMD64, leading to the use of an unsupported Intel intrinsic.

The first two issues manifest as follows. You can repro in an ARM64 Visual Studio Developer Command Prompt:

> cl /nologo /diagnostics:caret /EHsc /std:c++20 /c malloc.cc

ds_core\bits.h(162,14): error C3861: '_tzcnt_u64': identifier not found
      return _tzcnt_u64(static_cast<unsigned __int64>(x));
             ^
ds_core\bits.h(208,21): error C3861: '_umul128': identifier not found
      size_t prod = _umul128(x, y, &high_prod);
                    ^
pal\../aal/aal.h(259,50): error C2065: 'AAL_Arch': undeclared identifier
  using Aal = AAL_Generic<AAL_NoStrictProvenance<AAL_Arch>>;

To fix these two issues, this change adds detection of the Windows ARM64 architecture so the ARM AAL is selected. It also adds implementations of snmalloc::bits::ctz and snmalloc::bits::umul that work on Windows ARM64 under MSVC.

The third issue shows up like this:

> cl /nologo /diagnostics:caret /EHsc /std:c++20 /c /arm64EC malloc.cc

aal\aal_x86.h(82,7): error C3861: '_m_prefetchw': identifier not found
      _m_prefetchw(ptr);
      ^

When building for the ARM64EC ABI, the compiler defines the _M_X64 macro and not the _M_ARM64 macro, to make incremental porting existing X64 Windows code to ARM easier.

This means that the library was detecting ARM64EC as AMD64 and defining PLATFORM_IS_X86 instead of PLATFORM_IS_ARM. This causes the library to compile in the x86 AAL and use several Intel intrinsics, which actually does work in almost all cases because Windows provides an emulated implementation of Intel SIMD instruction sets up to SSE4.2 in the softintrin.lib library. However, it does not provide an implementation of _m_prefetchw.

To fix this issue, the CPU architecture detection in aal.h is now aware of the ARM64EC ABI.

@jblazquez
Copy link
Contributor Author

@microsoft-github-policy-service agree company="Riot Games Inc."

CC @paulgray

Copy link
Member

@mjp41 mjp41 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thank you for the very high quality PR with the explanations of the changes. It is really appreciated.

I'm happy to take this as is, but if you think the slight simplification for umul works, then I would prefer that.

Also, if you have any thoughts on how we might add CI to cover this case then I would be happy to add that. If it isn't built in CI, it will often get broken.

@jblazquez
Copy link
Contributor Author

Added a new commit that fixes the clang-format issues and also adds ARM64/ARM64EC CI workflows. I wasn't able to test those workflows in my fork, even after enabling GitHub Actions, so they might not work just yet.

build-type: Release
arch: ARM64EC
toolchain: ""
extra-cmake-flags: -DCMAKE_SYSTEM_VERSION="10.0.22621.0"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the Windows 11 SDK is required for CMake to build ARM64EC correctly: https://stackoverflow.com/a/77651884

@jblazquez
Copy link
Contributor Author

As expected (since I was unable to test them) the new ARM64 CI workflows aren't working. I'll see if I can iterate on those in my fork, otherwise I'll add a few more commits here to fix them.

@@ -388,6 +388,24 @@ jobs:
build-type: Debug
arch: x64
toolchain: ""
- os: windows-2022
build-type: Release
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe if you don't specify build type it will apply to both.

Copy link
Contributor Author

@jblazquez jblazquez Apr 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot to test whether removing build-type would work. I was following the pattern of the other explicitly included workflows, which manually list all the combinations needed:

          - os: windows-2022
            build-type: Release
            arch: Win32
            toolchain: ""
          - os: windows-2022
            build-type: Debug
            arch: Win32
            toolchain: ""
          - os: windows-2022
            build-type: Release
            arch: x64
            toolchain: ""
          - os: windows-2022
            build-type: Debug
            arch: x64
            toolchain: ""

It may be possible to clean the include list a bit.

@mjp41
Copy link
Member

mjp41 commented Apr 9, 2024

As expected (since I was unable to test them) the new ARM64 CI workflows aren't working. I'll see if I can iterate on those in my fork, otherwise I'll add a few more commits here to fix them.

If it's too painful just say, and I can fix tomorrow and push the changes onto your branch.

The Microsoft org on GitHub has very strict requirements for CI runs from new contributors, so sorry about the pain in getting it through CI.

Copy link
Member

@mjp41 mjp41 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Looks like CI is going to pass. Great job. Thanks. I'll merge when CI passes.

@jblazquez
Copy link
Contributor Author

jblazquez commented Apr 9, 2024

Thanks! By the way, I just ran the test suite in a Windows 11 ARM64 VM and all tests passed.

@mjp41 mjp41 merged commit d9bca64 into microsoft:main Apr 10, 2024
46 of 50 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants