Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: avoid inflating UnmaskedArrays in broadcasting when you can #3254

Conversation

jpivarski
Copy link
Member

@martindurant
Copy link
Contributor

Thanks, @jpivarski .

Have you considered implementing something like asv for ak, to make sure that new changes don't accidentally hurt performance?

@jpivarski
Copy link
Member Author

Airspeed Velocity? Maybe, although we'd have to define some large samples and a suite of computations to test. We have a huge number of tests, since they were added as features were added and bugs were fixed, but adding a suite of performance tests would be starting from scratch because all of the functionality tests are way too small to be meaningful for performance.

This particular PR, however, won't make anything slower. There's a special case that will get faster. The only cost here is the added complexity for maintenance. I thought this one was worth the cost because it cleanly splits out as a special case and it wouldn't be a very rare special case: conversions from Arrow are likely to introduce UnmaskedArrays (because Arrow nodes are option-type by default, but often don't actually have missing values).

@martindurant
Copy link
Contributor

You mean when mask is None? Makes sense.

About asv, agree that it would be quite an undertaking, maybe a good project for a student, for a select few workflows.

Copy link
Collaborator

@pfackeldey pfackeldey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

If anything you may want to add a dedicated test with a comment to this commit. That may make future maintenance a bit easier.

@jpivarski
Copy link
Member Author

I couldn't think of a test because it doesn't affect functionality, just performance. And the array I was using to test performance is too big to save in scikit-hep-testdata.

Thanks for checking! I think I'll merge this now.

@jpivarski jpivarski merged commit eaa43ff into main Sep 24, 2024
44 checks passed
@jpivarski jpivarski deleted the jpivarski/avoid-inflating-UnmaskedArrays-in-broadcasting-when-you-can branch September 24, 2024 21:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants