Fixing tests for compute capability 12 devices. #5284
Review updated until commit 80fbfdd
The fix looks good to me. Thanks!
Blackwell-specific tests use instruction sets for sm_100/104; these tests fail on sm_110+ devices (such as 5090 GPUs).
Calculate or assign shared memory requirements for tests that will fail to execute properly due to shared memory constraints on certain architectures.
…capabilities. These tests will fail on compute capabilities with less shared memory or fewer registers available per SM, e.g. 12.
Due to the lower shared memory available on SM/CC 12 cards, this test fails to schedule static warp reductions properly; reducing the input size allows us to generate the expected pattern.
The FusionCache_CUDA test attempts to reset and get a fresh FusionCache. If the current cache contains more fusions than the new max_fusions limit allows, it will fail to enforce the new max_fusions constraint.
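The cache problem described in the last commit message can be illustrated with a toy model. This is only a sketch: `ToyFusionCache`, `setLimitOnly`, and `reset` are hypothetical names invented for illustration and are not nvFuser's FusionCache API, and clearing the entries before applying the new limit is an assumed remedy, not necessarily the change made in this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a fusion cache; not nvFuser's FusionCache API.
class ToyFusionCache {
 public:
  explicit ToyFusionCache(std::size_t max_fusions) : max_fusions_(max_fusions) {
    assert(max_fusions > 0);
  }

  // Adds an entry; returns false when the cache is already at its limit.
  bool add(const std::string& name) {
    if (fusions_.size() >= max_fusions_) {
      return false;
    }
    fusions_.push_back(name);
    return true;
  }

  // Naive limit change: existing entries are kept, so the cache can end up
  // holding more entries than the new limit allows.
  void setLimitOnly(std::size_t max_fusions) { max_fusions_ = max_fusions; }

  // Assumed remedy: drop existing entries first, then apply the new limit.
  void reset(std::size_t max_fusions) {
    assert(max_fusions > 0);
    fusions_.clear();
    max_fusions_ = max_fusions;
  }

  std::size_t size() const { return fusions_.size(); }
  bool withinLimit() const { return fusions_.size() <= max_fusions_; }

 private:
  std::size_t max_fusions_;
  std::vector<std::string> fusions_;
};
```

In this model, a cache holding three entries that only has its limit lowered to two reports `withinLimit() == false`, while calling `reset(2)` leaves it empty and within the limit.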
Summary
The current ./manual-ci.sh test suite fails on Compute Capability (CC) 12 Blackwell cards (like the RTX 5090). These tests fail because of architectural differences between CC 10 and CC 12 cards, primarily differences in the shared memory available per SM.
1. Architecture-specific Test Skips and Guards:
2. Resource Constraint Handling:
3. Test Maintenance and Cleanup:
4. Fusion Cache Management Fix:
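The first two categories (architecture guards and resource constraint handling) could be sketched as a pair of checks run before a test body. Everything here is an assumption for illustration: `DeviceInfo`, `targetsCc10`, `tileSmemBytes`, and `shouldRunTest` are hypothetical helpers, not functions from this repository, and a real test would fill `DeviceInfo` from the CUDA runtime's device properties rather than from hand-written values.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical device description. In a real test these fields would be
// read from the CUDA runtime instead of being filled in by hand.
struct DeviceInfo {
  int major;                             // compute capability major version
  int minor;                             // compute capability minor version
  std::size_t max_smem_per_block_bytes;  // shared memory usable by one block
};

// Architecture guard: true only for compute capability 10.x devices.
bool targetsCc10(const DeviceInfo& dev) {
  return dev.major == 10;
}

// Shared memory needed to stage a rows x cols tile of elem_size-byte elements.
std::size_t tileSmemBytes(std::size_t rows, std::size_t cols,
                          std::size_t elem_size) {
  assert(elem_size > 0);
  return rows * cols * elem_size;
}

// Resource guard: the requirement must fit in the device's per-block limit.
bool fitsInSharedMemory(const DeviceInfo& dev, std::size_t required_bytes) {
  return required_bytes <= dev.max_smem_per_block_bytes;
}

// A test runs only if both guards pass; otherwise it would be skipped.
bool shouldRunTest(const DeviceInfo& dev, bool needs_cc10_instructions,
                   std::size_t required_smem_bytes) {
  if (needs_cc10_instructions && !targetsCc10(dev)) {
    return false;
  }
  return fitsInSharedMemory(dev, required_smem_bytes);
}
```

With placeholder limits (not vendor specifications) of 200 KiB for a CC 10 device and 100 KiB for a CC 12 device, a 128 x 256 tile of 4-byte elements needs 131072 bytes, so the check passes on the first device and fails on the second. Shrinking the tile to 64 x 256 (65536 bytes) lets it pass on both, which mirrors the "reduce the input size" approach mentioned in the commit messages.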