@PaulusParssinen PaulusParssinen commented Jan 22, 2025

This PR refactors the SIMD logic used in bitmap operations to reduce duplication and simplify control flow. The goal is to improve maintainability and set the stage for consistent performance across architectures.

  • Unifies vectorized paths across x86 and ARM64 with the .NET cross-architecture intrinsics.
    • Adds a Vector512 path. Like the other paths, it only lights up when it is supported and hardware accelerated; otherwise it is dead-code-eliminated (DCE'd) during JIT import.
    • These can be consolidated further with ISimdVector<TSelf, T> in the future.
  • Adds/updates tests to cover alignment, tail bytes, and empty/odd-sized inputs
  • No functional behavior changes intended
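
A minimal sketch of what such width-gated, cross-architecture paths can look like for a bitwise AND over byte spans. It is an illustration under stated assumptions, not the code from this PR: `BitwiseAndSketch` and `And` are hypothetical names, and it assumes .NET 8 or later (for `Vector512`).

```csharp
using System;
using System.Runtime.Intrinsics;

// Hypothetical helper for illustration; not the code from this PR.
public static class BitwiseAndSketch
{
    // Computes dst[i] = a[i] & b[i] over the common length of the three spans.
    public static void And(ReadOnlySpan<byte> a, ReadOnlySpan<byte> b, Span<byte> dst)
    {
        int length = Math.Min(Math.Min(a.Length, b.Length), dst.Length);
        int i = 0;

        // IsHardwareAccelerated is treated as a constant by the JIT, so the
        // blocks for widths the hardware lacks are dropped as dead code.
        if (Vector512.IsHardwareAccelerated)
        {
            for (; i <= length - Vector512<byte>.Count; i += Vector512<byte>.Count)
            {
                var v = Vector512.Create(a.Slice(i)) & Vector512.Create(b.Slice(i));
                v.CopyTo(dst.Slice(i));
            }
        }

        if (Vector256.IsHardwareAccelerated)
        {
            for (; i <= length - Vector256<byte>.Count; i += Vector256<byte>.Count)
            {
                var v = Vector256.Create(a.Slice(i)) & Vector256.Create(b.Slice(i));
                v.CopyTo(dst.Slice(i));
            }
        }

        // Vector128 maps to SSE2 on x86 and to AdvSimd (NEON) on ARM64.
        if (Vector128.IsHardwareAccelerated)
        {
            for (; i <= length - Vector128<byte>.Count; i += Vector128<byte>.Count)
            {
                var v = Vector128.Create(a.Slice(i)) & Vector128.Create(b.Slice(i));
                v.CopyTo(dst.Slice(i));
            }
        }

        // Scalar tail for the remaining bytes (and for empty or tiny inputs).
        for (; i < length; i++)
        {
            dst[i] = (byte)(a[i] & b[i]);
        }
    }
}
```

The sketch uses the span-based `Create` and `CopyTo` helpers to stay in safe code; a tuned implementation would more likely use unsafe loads and stores and pay attention to alignment.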
ARM64 demo on Google Pixel 8a: now we can do BITOPs when running Garnet inside Android VMs!
screen-20250316-174515.mp4

Follow-up work:

@PaulusParssinen
Contributor Author

Benchmarks show mixed results on different microarchitectures for some inputs. A secondary goal of this PR is to not regress performance, so I still need to investigate that.

@PaulusParssinen
Contributor Author

This is extremely hard to measure, especially the difference between non-temporal and temporal stores in the TensorPrimitives fast path for binary (two-input) operations. Micro-benchmarks are very different from the saturated server CPU state, and the microarchitectures all behave differently. The code could probably use some more massaging in many places to find a sweet spot for most of them.

If any performance numbers now seem completely off compared to before on important hardware you target, please report it!

@PaulusParssinen PaulusParssinen marked this pull request as ready for review September 17, 2025 20:45
@Copilot Copilot AI review requested due to automatic review settings September 17, 2025 20:45
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR refactors the bitmap operations SIMD logic to improve cross-platform compatibility, particularly enabling BITOP operations in Android VMs. The changes replace architecture-specific x86 SIMD code with modern .NET SIMD approaches using TensorPrimitives and generic vector operations that work across different platforms.

Key changes include:

  • Refactored bitmap operation code to use TensorPrimitives and generic Vector types instead of x86-specific intrinsics
  • Simplified test structure and improved environment variable testing for different SIMD configurations
  • Added new binary operator abstractions for cross-platform bitwise operations
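
To illustrate the idea of a binary operator abstraction, here is a rough sketch using static abstract interface members (C# 11, .NET 7 or later). All names (`IBitwiseOpSketch`, `AndOp`, `XorOp`, `BitOpKernelSketch`) are hypothetical; the actual `IBinaryOperator` in this PR may be shaped differently.

```csharp
using System;
using System.Runtime.Intrinsics;

// Hypothetical interface for illustration; not the PR's IBinaryOperator.
public interface IBitwiseOpSketch
{
    static abstract byte Invoke(byte x, byte y);
    static abstract Vector128<byte> Invoke(Vector128<byte> x, Vector128<byte> y);
}

public readonly struct AndOp : IBitwiseOpSketch
{
    public static byte Invoke(byte x, byte y) => (byte)(x & y);
    public static Vector128<byte> Invoke(Vector128<byte> x, Vector128<byte> y) => x & y;
}

public readonly struct XorOp : IBitwiseOpSketch
{
    public static byte Invoke(byte x, byte y) => (byte)(x ^ y);
    public static Vector128<byte> Invoke(Vector128<byte> x, Vector128<byte> y) => x ^ y;
}

public static class BitOpKernelSketch
{
    // One loop body shared by every operator; the JIT specializes it per TOp
    // because TOp is a struct type argument.
    public static void Apply<TOp>(ReadOnlySpan<byte> a, ReadOnlySpan<byte> b, Span<byte> dst)
        where TOp : struct, IBitwiseOpSketch
    {
        int length = Math.Min(Math.Min(a.Length, b.Length), dst.Length);
        int i = 0;

        if (Vector128.IsHardwareAccelerated)
        {
            for (; i <= length - Vector128<byte>.Count; i += Vector128<byte>.Count)
            {
                var v = TOp.Invoke(Vector128.Create(a.Slice(i)), Vector128.Create(b.Slice(i)));
                v.CopyTo(dst.Slice(i));
            }
        }

        // Scalar tail.
        for (; i < length; i++)
        {
            dst[i] = TOp.Invoke(a[i], b[i]);
        }
    }
}
```

A caller would pick the operation through the type argument, for example `BitOpKernelSketch.Apply<XorOp>(a, b, dst)`, so the vectorized loop is written once instead of once per operation.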

Reviewed Changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.

Show a summary per file
| File | Description |
| --- | --- |
| test/Garnet.test/TestProcess.cs | Refactored process management with cleaner constructor API and improved variable naming |
| test/Garnet.test/GarnetBitmapTests.cs | Streamlined bitmap tests with environment-based SIMD testing and comprehensive multi-size test coverage |
| libs/server/Storage/Session/MainStore/BitmapOps.cs | Updated bitmap operation storage logic to use new cross-platform SIMD manager |
| libs/server/Resp/Bitmap/BitmapManagerBitOp.cs | Complete rewrite using TensorPrimitives and generic Vector types for cross-platform SIMD support |
| libs/common/Numerics/IBinaryOperator.cs | New abstraction for cross-platform binary operations with Vector support |
| benchmark/BDN.benchmark/Bitmap/*.cs | Added benchmark classes for performance testing of new bitmap operations |
| Package files | Added System.Numerics.Tensors dependency for new SIMD implementation |
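
For context on the System.Numerics.Tensors dependency, this is a small sketch of a span-based bitwise AND via `TensorPrimitives`. It assumes a version of the NuGet package that ships the generic `TensorPrimitives.BitwiseAnd<T>` overload (9.0 or later, as far as I know); `TensorBitOpSketch` is a hypothetical wrapper and does not reflect how the PR calls the library.

```csharp
using System;
using System.Numerics.Tensors; // NuGet package: System.Numerics.Tensors

// Hypothetical wrapper for illustration; not the code from this PR.
public static class TensorBitOpSketch
{
    // Returns a new array with result[i] = a[i] & b[i].
    public static byte[] And(byte[] a, byte[] b)
    {
        // BitwiseAnd expects both inputs to have the same length.
        if (a.Length != b.Length)
        {
            throw new ArgumentException("Inputs must have the same length.");
        }

        var result = new byte[a.Length];

        // The library picks a vectorized path internally where one is available.
        TensorPrimitives.BitwiseAnd<byte>(a, b, result);
        return result;
    }
}
```

Inputs of different lengths, as BITOP allows, would still need separate handling for the part where only the longer key has data.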

@PaulusParssinen PaulusParssinen changed the title from "wip: Bitmap operations SIMD logic refactor" to "Bitmap operations SIMD logic refactor" Sep 17, 2025
@PaulusParssinen PaulusParssinen changed the title from "Bitmap operations SIMD logic refactor" to "Support and accelerate bitmap operations on more platforms" Sep 18, 2025
@PaulusParssinen
Contributor Author

PaulusParssinen commented Sep 18, 2025

The last CI test failure is from cluster test flakiness.
