feat: gpu accelerated pandas with cudf example #33
base: main
Thanks for adding this example. I'm not very familiar with the NVIDIA RAPIDS APIs. Just to confirm: everything used here is on the latest version and still relevant, correct?
It's definitely the latest version, but after making this I do wonder if this is something we're really targeting. I'm not sure how often pandas/dataframe ops are used for LLM workloads 🤷
I'm fine with not including it if we don't think it's something people would be interested in!
I'll let you make the call here. Thanks.
Pull Request Overview
This PR introduces a new example demonstrating GPU-accelerated data processing using NVIDIA's cuDF library with pandas for analyzing NYC taxi data. The example showcases how to leverage GPU acceleration for common data analytics operations like groupby aggregations.
Key changes:
- Creates a complete cuDF/pandas integration example with GPU acceleration
- Implements persistent storage using network volumes to cache downloaded taxi data
- Demonstrates performance-optimized data analytics workflows on GPU infrastructure
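As a rough illustration of the kind of groupby aggregation described above, here is a minimal sketch. It is not the code from this PR: the function name `summarize_fares`, the column names `passenger_count` and `fare_amount`, and the sample values are assumptions made for illustration. It also assumes `cudf.pandas` may not be installed and falls back to CPU pandas in that case.

```python
# Assumption: cudf.pandas is optional. If it is missing or no GPU is usable,
# the same pandas code simply runs on the CPU.
try:
    import cudf.pandas

    cudf.pandas.install()  # has to run before pandas is imported
except Exception:
    pass

import pandas as pd


def summarize_fares(trips: pd.DataFrame) -> pd.DataFrame:
    """Return the mean fare and trip count for each passenger count."""
    return (
        trips.groupby("passenger_count")["fare_amount"]
        .agg(mean_fare="mean", trips="count")
        .reset_index()
    )


# Tiny synthetic stand-in for the taxi data (hypothetical values).
sample = pd.DataFrame(
    {
        "passenger_count": [1, 1, 2, 2, 2],
        "fare_amount": [10.0, 20.0, 6.0, 9.0, 15.0],
    }
)
summary = summarize_fares(sample)
print(summary)
```

The point of the sketch is that the analytics code is written against the ordinary pandas API; whether it runs on the GPU depends only on whether the accelerator was installed before `pandas` was imported.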
# This example shows some simple accelerated data analytics functionality using cudf and pandas.

# [cuDF](https://github.com/rapidsai/cudf) is part of the [NVIDIA RAPIDs](https://rapids.ai/) project.
# RAPIDs provides simple APIs to accelerate common Python data analytics functions with GPUs.
Copilot
AI
Oct 6, 2025
Corrected spelling of 'RAPIDs' to 'RAPIDS'. The official project name is 'RAPIDS' not 'RAPIDs'.
- # RAPIDs provides simple APIs to accelerate common Python data analytics functions with GPUs.
+ # RAPIDS provides simple APIs to accelerate common Python data analytics functions with GPUs.
def __init__(self):
    return
Copilot
AI
Oct 6, 2025
Empty init method with explicit return statement is unnecessary. Either implement initialization logic or remove the method entirely.
def __init__(self):
    return
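To illustrate why the empty initializer is redundant, here is a small sketch. The class names `WithEmptyInit` and `WithoutInit` are hypothetical and do not come from this PR. A class that defines no `__init__` inherits the one from `object`, so it can be instantiated in exactly the same way.

```python
class WithEmptyInit:
    def __init__(self):
        return  # no-op: does nothing beyond what object.__init__ already does


class WithoutInit:
    pass  # no __init__ defined; object.__init__ is inherited


# Both classes can be instantiated with no arguments.
a = WithEmptyInit()
b = WithoutInit()

# The class without an explicit initializer simply reuses object's.
uses_default_init = WithoutInit.__init__ is object.__init__
print(uses_default_init)
```

An explicit `__init__` is only worth keeping once it actually sets up state.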
No description provided.