karthikeyann
Collaborator

This PR adds the CudfTopN operator, which runs TopN on the GPU.
Unit tests are added to verify the output.

CudfTopN stores the top N values of each input batch. If the number of stored batches exceeds 5, the batches are concatenated and the top N values of the result are stored back into the batch vector. getOutput concatenates all stored batches and returns the top N values.
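
The batching strategy above can be sketched on the CPU as follows. This is only an illustration, not the cuDF code in this PR: `TopNBuffer`, `kMaxBatches`, and `numBatches` are hypothetical names, the rows are plain `int` values, and "top" is assumed to mean largest first on a single key.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical CPU stand-in for the batching strategy described above.
class TopNBuffer {
 public:
  // Threshold of 5 taken from the PR description; the name is an assumption.
  static constexpr std::size_t kMaxBatches = 5;

  explicit TopNBuffer(std::size_t n) : n_(n) {}

  // Reduce the incoming batch to its top N values and store it. Once more
  // than kMaxBatches batches are held, merge them into a single top-N batch.
  void addInput(std::vector<int> batch) {
    batches_.push_back(topN(std::move(batch)));
    if (batches_.size() > kMaxBatches) {
      std::vector<int> merged = topN(concatenate());
      batches_.clear();
      batches_.push_back(std::move(merged));
    }
    assert(batches_.size() <= kMaxBatches);
  }

  // Concatenate everything still held and return the overall top N.
  std::vector<int> getOutput() const { return topN(concatenate()); }

  std::size_t numBatches() const { return batches_.size(); }

 private:
  std::vector<int> concatenate() const {
    std::vector<int> all;
    for (const auto& batch : batches_) {
      all.insert(all.end(), batch.begin(), batch.end());
    }
    return all;
  }

  // Sort descending and keep at most n_ values.
  std::vector<int> topN(std::vector<int> values) const {
    const std::size_t k = std::min(n_, values.size());
    std::partial_sort(
        values.begin(), values.begin() + k, values.end(), std::greater<int>());
    values.resize(k);
    return values;
  }

  std::size_t n_;
  std::vector<std::vector<int>> batches_;
};
```

Because every stored batch is already cut down to N rows, the sketch never holds more than 6 * N values at once, which is presumably the reason for compacting instead of keeping every input batch until getOutput.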

The tests are the same as the TopN tests, but the setup makes sure they run on the GPU.
This PR also includes a fix (2288ca9) for CudfFilterProject that affects filter results with null values.

@meta-cla meta-cla bot added the CLA Signed label Sep 9, 2025

netlify bot commented Sep 9, 2025

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit 5324637
🔍 Latest deploy log https://app.netlify.com/projects/meta-velox/deploys/68c88d5595999e000828d290

@karthikeyann karthikeyann changed the title from "Add velox-cudf support for TopN" to "feat(cudf): Add velox-cudf support for TopN" Sep 9, 2025
Copilot

This comment was marked as outdated.

Co-authored-by: Copilot <[email protected]>

@Copilot Copilot AI left a comment

Pull Request Overview

This PR adds GPU-accelerated TopN operator support to the velox-cudf integration. The implementation provides a CudfTopN operator that executes TopN on the GPU using cuDF, with tests that verify its output against the CPU implementation.

  • Implements CudfTopN operator with batch-based optimization for managing memory and performance
  • Adds comprehensive test suite covering various TopN scenarios including multi-key sorting, filtering, and edge cases
  • Includes a bug fix for CudfFilterProject to properly handle null values in filter conditions

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| velox/experimental/cudf/exec/CudfTopN.h | Header file defining the CudfTopN operator class with batch management strategy |
| velox/experimental/cudf/exec/CudfTopN.cpp | Implementation of GPU-accelerated TopN operations using cuDF sorting APIs |
| velox/experimental/cudf/exec/ToCudf.cpp | Integration of CudfTopN into the operator replacement framework |
| velox/experimental/cudf/tests/TopNTest.cpp | Comprehensive test suite for TopN functionality verification |
| velox/experimental/cudf/tests/CMakeLists.txt | Build configuration for TopN tests |
| velox/experimental/cudf/exec/CMakeLists.txt | Build configuration including CudfTopN source |
| velox/experimental/cudf/exec/CudfFilterProject.cpp | Bug fix for null handling in filter conditions |
