
Conversation

@MagellaX MagellaX commented Oct 11, 2025

Summary by cubic

Improved masked-row handling in Triton fused online attention so masks and softmax match PyTorch SDPA and avoid NaNs on fully masked rows. Backward now receives the correct tile sizes; tests use a portable SDPA math context for parity.

  • Bug Fixes
    • Added explicit validity tracking in online softmax; cast q/k/v and mask to float32; apply corrections and exp only to valid rows (see the sketch after this list).
    • Convert boolean attention_mask to -inf bias and load mask as qk dtype; keep fully masked rows at zero output with safe denominators.
    • Replaced tl.isfinite with lse > -inf in backward and gated grad_output by row validity.
    • Supplied TILE_M/TILE_N to the backward kernel and removed TILE_K from signatures and call sites.
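
For intuition, here is a minimal plain-PyTorch sketch of the masked-row semantics described above (illustrative only, not the Triton kernel; the function name and variables are hypothetical):

    import torch

    def masked_softmax_attention_reference(q, k, v, attention_mask=None):
        """SDPA-like masked attention where fully masked rows give zero output, not NaN."""
        q, k, v = q.float(), k.float(), v.float()                 # compute in float32
        scale = q.shape[-1] ** -0.5
        scores = torch.matmul(q, k.transpose(-2, -1)) * scale

        if attention_mask is not None:
            if attention_mask.dtype == torch.bool:
                # Boolean mask (True = attend) becomes an additive -inf bias, matching SDPA.
                bias = torch.zeros_like(scores)
                bias.masked_fill_(~attention_mask, float("-inf"))
            else:
                bias = attention_mask.to(scores.dtype)
            scores = scores + bias

        # A row is valid if at least one key position remains attendable.
        valid = torch.isfinite(scores).any(dim=-1, keepdim=True)

        row_max = scores.max(dim=-1, keepdim=True).values
        row_max = torch.where(valid, row_max, torch.zeros_like(row_max))   # avoid -inf - -inf = NaN
        probs = torch.exp(scores - row_max)
        probs = torch.where(valid, probs, torch.zeros_like(probs))
        denom = probs.sum(dim=-1, keepdim=True).clamp_min(1e-20)           # safe denominator
        out = torch.matmul(probs / denom, v)
        return torch.where(valid, out, torch.zeros_like(out))              # fully masked rows -> zeros

The online-softmax kernel applies the same corrections tile by tile; the sketch only shows the end-to-end numerics.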

@MagellaX MagellaX merged commit 03e006a into main Oct 11, 2025
3 checks passed

@cubic-dev-ai cubic-dev-ai bot left a comment


1 issue found across 2 files

Prompt for AI agents (1 issue)

Understand the root cause of the following issue and fix it.


<file name="tests/test_attention.py">

<violation number="1" location="tests/test_attention.py:32">
The test suite in `tests/test_attention.py` is missing a test case for the core bug being fixed: handling fully masked rows. The new logic in `fused_online_attention.py` is designed to prevent NaNs in this scenario, but without a dedicated test, the fix is not verified and could regress.</violation>
</file>


from stream_attention.core.star_attention import StarAttention


def _math_sdpa_ctx():

@cubic-dev-ai cubic-dev-ai bot Oct 11, 2025


The test suite in tests/test_attention.py is missing a test case for the core bug being fixed: handling fully masked rows. The new logic in fused_online_attention.py is designed to prevent NaNs in this scenario, but without a dedicated test, the fix is not verified and could regress.

Prompt for AI agents
Address the following comment on tests/test_attention.py at line 32:

<comment>The test suite in `tests/test_attention.py` is missing a test case for the core bug being fixed: handling fully masked rows. The new logic in `fused_online_attention.py` is designed to prevent NaNs in this scenario, but without a dedicated test, the fix is not verified and could regress.</comment>

<file context>
@@ -20,6 +27,16 @@
 from stream_attention.core.star_attention import StarAttention
+
+
+def _math_sdpa_ctx():
+    if sdpa_kernel_ctx is not None and SDPBackend is not None:
+        return sdpa_kernel_ctx(SDPBackend.MATH)
</file context>
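
A regression test along the lines the review asks for might look like the sketch below (hypothetical: the fused-attention import path, constructor, call signature, and tolerances are assumptions, not taken from the repo):

    import contextlib

    import pytest
    import torch
    import torch.nn.functional as F

    try:  # newer PyTorch exposes sdpa_kernel; fall back gracefully on older releases
        from torch.nn.attention import sdpa_kernel, SDPBackend
    except ImportError:
        sdpa_kernel, SDPBackend = None, None


    def _math_sdpa_ctx():
        """Portable context forcing the math SDPA backend when available."""
        if sdpa_kernel is not None and SDPBackend is not None:
            return sdpa_kernel(SDPBackend.MATH)
        return contextlib.nullcontext()


    @pytest.mark.skipif(not torch.cuda.is_available(), reason="Triton kernel requires CUDA")
    def test_fully_masked_rows_are_zero_and_nan_free():
        from stream_attention.core.fused_online_attention import FusedOnlineAttention  # assumed path

        torch.manual_seed(0)
        batch, heads, seq, dim = 2, 4, 64, 32
        q = torch.randn(batch, heads, seq, dim, device="cuda", dtype=torch.float16)
        k, v = torch.randn_like(q), torch.randn_like(q)

        # Boolean mask whose last 16 query rows cannot attend to any key position.
        mask = torch.ones(batch, heads, seq, seq, device="cuda", dtype=torch.bool)
        mask[..., -16:, :] = False

        attn = FusedOnlineAttention()                      # assumed constructor
        out = attn(q, k, v, attention_mask=mask)           # assumed signature

        assert not torch.isnan(out).any(), "fully masked rows must not produce NaNs"
        assert torch.all(out[..., -16:, :] == 0)           # behavior described in this PR

        # Parity on the remaining rows against math-backend SDPA.
        with _math_sdpa_ctx():
            ref = F.scaled_dot_product_attention(q.float(), k.float(), v.float(), attn_mask=mask)
        torch.testing.assert_close(out[..., :-16, :].float(), ref[..., :-16, :], rtol=2e-2, atol=2e-2)

Only the rows that still attend to something are compared against SDPA, because the math backend itself yields NaN for fully masked rows.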