@kshyatt
Member

@kshyatt kshyatt commented Nov 24, 2025

No description provided.

@kshyatt kshyatt requested a review from maleadt November 24, 2025 18:34
@kshyatt kshyatt enabled auto-merge (squash) November 24, 2025 20:55
Member

@christiangnrd christiangnrd left a comment

Lgtm

@christiangnrd
Member

Bump version?

@maleadt
Member

maleadt commented Nov 24, 2025

> Bump version?

Bit weird to do this as part of this PR, but OK.


I was going to suggest trimming the test suite a bit, since this is increasing CI load, but then noticed that macOS runs finish in 10 minutes. Really surprising that e.g. the gpuarrays/reductions/sum prod test suite is 3 times faster on the macOS workers (M1s) than it is on gpuci (AMD EPYC 7402 CPU). The M1 does have 1.7x higher single-threaded performance according to PassMark, which surprised me, but that's still a far cry from 3x.

@christiangnrd
Member

christiangnrd commented Nov 24, 2025

> Bit weird to do this as part of this PR, but OK.

Noted for future reviews from me.

Regarding the M1 vs. EPYC: Metal supports fewer types, so it runs fewer tests. I did some math for my local machines, and for the gpuarrays/reductions/sum prod example you mentioned, the tests-per-second ratio between Metal and CUDA is the same as the ratio between the PassMark scores of the respective CPUs. I guess that even though they're running on the GPU, the tests are so small that it's all CPU-limited regardless.

@kshyatt kshyatt merged commit 796cfd8 into master Nov 24, 2025
18 checks passed
@kshyatt kshyatt deleted the ksh/latest branch November 24, 2025 22:54
@luraess
Member

luraess commented Nov 24, 2025

Yay, all CI passes now.
