Fix the PR CI errors #70

Closed
wants to merge 5 commits into from

xuzhao9
Contributor

@xuzhao9 xuzhao9 commented Nov 21, 2024

We still need to patch HSTU at runtime, but we do not need to patch xformers because it is already installed.
Move all compilation artifacts to $REPO_DIR/.data so that colfax and tk do not need to be recompiled.
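
The idea of keeping build outputs under `$REPO_DIR/.data` so they survive between runs can be sketched roughly as follows. This is only an illustration, not the code in this PR: `build_once`, the `.build_ok` marker file, and the `build_fn` callback are hypothetical names chosen for the example.

```python
from pathlib import Path


def artifact_dir(repo_dir: Path) -> Path:
    """Return the shared artifact directory ($REPO_DIR/.data), creating it if needed."""
    data_dir = repo_dir / ".data"
    data_dir.mkdir(parents=True, exist_ok=True)
    return data_dir


def build_once(repo_dir: Path, name: str, build_fn) -> bool:
    """Run build_fn(out_dir) only when no earlier build left a marker file.

    Returns True if a build was performed and False if it was skipped.
    The marker file name is an assumption made for this sketch.
    """
    out_dir = artifact_dir(repo_dir) / name
    marker = out_dir / ".build_ok"
    if marker.exists():
        return False  # artifacts already present, skip the recompile
    out_dir.mkdir(parents=True, exist_ok=True)
    build_fn(out_dir)
    marker.touch()  # record success only after build_fn returns without raising
    return True
```

Writing the marker after the build finishes means an interrupted compile is retried on the next run rather than being treated as complete.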

We can now enable all flash_attention kernels except TK and _ws. The gemm kernels still hit an out-of-shared-memory error, so they remain disabled.

@xuzhao9 xuzhao9 temporarily deployed to docker-s3-upload November 21, 2024 23:53 — with GitHub Actions Inactive
@xuzhao9 xuzhao9 temporarily deployed to docker-s3-upload November 22, 2024 00:09 — with GitHub Actions Inactive
@xuzhao9 xuzhao9 temporarily deployed to docker-s3-upload November 22, 2024 00:10 — with GitHub Actions Inactive
@xuzhao9 xuzhao9 temporarily deployed to docker-s3-upload November 22, 2024 00:10 — with GitHub Actions Inactive
@xuzhao9 xuzhao9 temporarily deployed to docker-s3-upload November 22, 2024 01:41 — with GitHub Actions Inactive
@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D66341952

@facebook-github-bot
Contributor

@xuzhao9 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

manman-ren and others added 2 commits November 21, 2024 23:53
Summary:
We move the static_assert to the top-level kernel. After the move, a failing static_assert raises CompileTimeAssertionFailure, which the autotuner catches:
        try:
            return self.do_bench(kernel_call, quantiles=(0.5, 0.2, 0.8))
        except (OutOfResources, CompileTimeAssertionFailure, PTXASError):
            return [float("inf"), float("inf"), float("inf")]

Prior to this change, the CompileTimeAssertionFailure was not caught for some reason; it was reported as an error and failed the build.
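
The effect of that except clause can be illustrated with a small standalone sketch. The three exception classes below are local stand-ins for the Triton types named above, and `bench_config`, `pick_best`, and `fake_do_bench` are hypothetical helpers written for this example; none of this is the autotuner's actual code.

```python
# Local stand-ins for the exception types named in the snippet above.
class OutOfResources(Exception):
    pass


class CompileTimeAssertionFailure(Exception):
    pass


class PTXASError(Exception):
    pass


def bench_config(kernel_call, do_bench):
    """Benchmark one config; a config that cannot compile gets infinite timings."""
    try:
        return do_bench(kernel_call, quantiles=(0.5, 0.2, 0.8))
    except (OutOfResources, CompileTimeAssertionFailure, PTXASError):
        return [float("inf"), float("inf"), float("inf")]


def pick_best(configs, do_bench):
    """Return the name of the config with the lowest median timing."""
    medians = {name: bench_config(call, do_bench)[0] for name, call in configs.items()}
    return min(medians, key=medians.get)


def fake_do_bench(kernel_call, quantiles):
    """Toy benchmark: the 'kernel' returns its own median time in milliseconds."""
    median = kernel_call()
    return [median for _ in quantiles]


def failing_config():
    # Models a config whose static_assert fails at compile time.
    raise CompileTimeAssertionFailure("static_assert failed")


def working_config():
    return 1.5


best = pick_best({"bad": failing_config, "ok": working_config}, fake_do_bench)
```

Because the failing config is scored as infinitely slow, it is never selected, and the remaining configs are still benchmarked instead of the whole run aborting.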

Verified with:
        python run.py --op fp8_attention
        python run.py --op flash_attention --only triton_tutorial_flash_v2 --num-inputs 1 --metrics tflops

Pull Request resolved: #69

Reviewed By: xuzhao9, adamomainz

Differential Revision: D66336174

Pulled By: manman-ren

fbshipit-source-id: 95d29821e6cba45af535b11020aa51424a408789
@xuzhao9 xuzhao9 closed this Nov 22, 2024
@xuzhao9 xuzhao9 reopened this Nov 22, 2024
@xuzhao9 xuzhao9 temporarily deployed to docker-s3-upload November 22, 2024 14:12 — with GitHub Actions Inactive
@xuzhao9 xuzhao9 temporarily deployed to docker-s3-upload November 22, 2024 14:12 — with GitHub Actions Inactive
@xuzhao9 xuzhao9 temporarily deployed to docker-s3-upload November 22, 2024 14:12 — with GitHub Actions Inactive
@xuzhao9 xuzhao9 closed this Nov 22, 2024
xuzhao9 added a commit that referenced this pull request Nov 22, 2024
Summary:
We still need to patch HSTU at runtime, but we do not need to patch xformers because it is already installed.

Also, move all compilation artifacts to `$REPO_DIR/.data` so that colfax and tk do not need to be recompiled.


Differential Revision: D66341952

Pulled By: xuzhao9