Fix the PR CI errors #70
Conversation
This pull request was exported from Phabricator. Differential Revision: D66341952

@xuzhao9 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Summary: We move the `static_assert` to the top-level kernel. After the move, the `static_assert` failure is caught by the autotuner:

```python
try:
    return self.do_bench(kernel_call, quantiles=(0.5, 0.2, 0.8))
except (OutOfResources, CompileTimeAssertionFailure, PTXASError):
    return [float("inf"), float("inf"), float("inf")]
```

Prior to this change, `CompileTimeAssertionFailure` was somehow not caught, so it got reported and failed the build.

Verified with:

```
python run.py --op fp8_attention
python run.py --op flash_attention --only triton_tutorial_flash_v2 --num-inputs 1 --metrics tflops
```

Pull Request resolved: #69

Reviewed By: xuzhao9, adamomainz

Differential Revision: D66336174

Pulled By: manman-ren

fbshipit-source-id: 95d29821e6cba45af535b11020aa51424a408789
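As a minimal, dependency-free sketch of the pattern in the snippet above (the exception class and function names here are illustrative stand-ins, not the actual Triton/Tritonbench classes):

```python
# Sketch of the autotuner pattern: treat a compile-time assertion
# failure as an unusable config (infinite latency) rather than a
# hard error that fails the whole run.
# NOTE: CompileTimeAssertionFailure and bench_config are hypothetical
# stand-ins for the real Triton autotuner machinery.

class CompileTimeAssertionFailure(Exception):
    """Stand-in for Triton's compile-time assertion error."""

def bench_config(kernel_call):
    try:
        # Would normally time the kernel and return latency quantiles.
        return kernel_call()
    except CompileTimeAssertionFailure:
        # Invalid config: report inf so the autotuner skips it.
        return [float("inf"), float("inf"), float("inf")]

def bad_kernel():
    # Simulates a kernel whose top-level static_assert fires.
    raise CompileTimeAssertionFailure("static_assert failed")

def good_kernel():
    return [1.0, 0.9, 1.1]
```

With this structure, a config whose `static_assert` fires simply reports infinite latency and loses the autotune race instead of aborting the benchmark.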
Summary: We still need to patch HSTU at runtime; we do not need to patch xformers, as it is already installed. Also, move all compilation artifacts to `$REPO_DIR/.data` so that we do not need to recompile colfax and tk. Differential Revision: D66341952 Pulled By: xuzhao9
We still need to patch HSTU at runtime; we do not need to patch xformers, as it is already installed.

Move all compilation artifacts to `$REPO_DIR/.data` so that we do not need to recompile colfax and tk. We can now enable all flash_attention kernels except TK and _ws. The gemm kernels still hit an out-of-shared-memory error, so they remain disabled.
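A rough sketch of the caching idea described above, assuming artifacts live under `$REPO_DIR/.data` (the artifact file names and the `build_if_missing` helper are hypothetical; the repo's actual install scripts differ):

```shell
#!/bin/sh
# Hypothetical sketch: keep compilation artifacts under $REPO_DIR/.data
# so repeated setup runs can skip recompiling (e.g. colfax, tk).
REPO_DIR="${REPO_DIR:-$PWD}"
ARTIFACT_DIR="$REPO_DIR/.data"
mkdir -p "$ARTIFACT_DIR"

build_if_missing() {
    # $1: artifact file name, $2: description of the build step
    if [ -f "$ARTIFACT_DIR/$1" ]; then
        echo "cache hit: $1 (skipping recompile)"
    else
        echo "cache miss: $1 ($2)"
        touch "$ARTIFACT_DIR/$1"   # placeholder for the real compile step
    fi
}

build_if_missing "colfax.a" "compiling colfax"
build_if_missing "tk.a" "compiling tk"
```

On the first run both branches report a cache miss and create the artifacts; subsequent runs find them in `.data` and skip the compile step.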