@naromero77amd naromero77amd commented Nov 12, 2025

In the ROCm fork of PyTorch 2.7, Inductor has codegen support for fast_tanhf. However, it is guarded by the TORCHINDUCTOR_USE_FAST_MATH environment variable because of NaN issues in the original Triton implementation of fast_tanhf.

Upstream Triton now has an improved fast_tanhf in which the NaN issues are fixed. That upstream commit has been backported to the ROCm fork of Triton (see code comments).

The guard is therefore no longer needed. I have removed it, along with the conditionalization on Triton versions. A bump in the Triton commit is also needed.
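For readers unfamiliar with the guard, here is a minimal sketch of the before/after selection logic. It is not the actual Inductor code: the helper names (`env_flag`, `tanh_codegen_guarded`, `tanh_codegen_unguarded`) are hypothetical, and the emitted strings `libdevice.tanh(...)` and `libdevice.fast_tanhf(...)` are assumed spellings used only for illustration.

```python
import os


def env_flag(name: str) -> bool:
    """Hypothetical helper: treat the variable as enabled only when set to "1"."""
    return os.environ.get(name, "0") == "1"


def tanh_codegen_guarded(x: str) -> str:
    """Sketch of the old behavior: emit the fast path only when opted in."""
    if env_flag("TORCHINDUCTOR_USE_FAST_MATH"):
        # Assumed spelling of the emitted call; illustrative only.
        return f"libdevice.fast_tanhf({x})"
    return f"libdevice.tanh({x})"


def tanh_codegen_unguarded(x: str) -> str:
    """Sketch of the new behavior: always emit the fast path."""
    return f"libdevice.fast_tanhf({x})"
```

With the guard in place, the fast path is used only when TORCHINDUCTOR_USE_FAST_MATH is set. Without it, the fast path is always used, which is why the fixed Triton implementation is a prerequisite.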

Other notes:

rocm-repo-management-api bot commented Nov 12, 2025

Jenkins build for 1b1fde5fcc342c2c0d3c69bf95a91501fc39b324 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

@naromero77amd
Author

I have confirmed that it resolves the reproducer in the Jira.
