Skip to content

Conversation

fabianmcg
Copy link
Contributor

@fabianmcg fabianmcg commented Sep 3, 2025

Adds support for configuring denormal floating point math mode for f32 operations
on AMD GPUs. This includes:

  • DenormalFpMath enum with None, PreserveSign, and PositiveZero modes
  • Command line flag --iree-hip-denormal-fp-math-f32 to control the behavior
  • Attribute propagation through to LLVM kernels

I based the options on https://llvm.org/doxygen/structllvm_1_1DenormalMode.html#a29b26e3ae30f3f6ec4106ff181282893

Copy link
Member

@kuhar kuhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having this attribute at the level of an executable target makes sense to me but I wonder if we may end up requiring a finer granularity in the future

(This is a non-blocking review, I need to think about this more...)

@kuhar kuhar requested a review from benvanik September 3, 2025 20:31
@kuhar
Copy link
Member

kuhar commented Sep 3, 2025

@benvanik do you want to give this a quick look?

@fabianmcg
Copy link
Contributor Author

Having this attribute at the level of an executable target makes sense to me but I wonder if we may end up requiring a finer granularity in the future

I talked with @Groverkss about this, as he expressed similar concerns. We definitely need a finer-grained mechanism that has the info per dispatch. However, having a global value is still useful, as any dispatches without a more specialized value can use the global.

I'd also argue that the current mechanism around target attributes, flags, ..., is not very extensible. It likely should be revisited, and we should likely consider using the work Renato and Rolf are doing.

@fabianmcg fabianmcg force-pushed the pr-rocdl-denormal branch 2 times, most recently from 9d86531 to a30060e Compare September 4, 2025 14:40
@fabianmcg fabianmcg requested a review from kuhar September 4, 2025 16:32
@krzysz00
Copy link
Contributor

krzysz00 commented Sep 4, 2025

I'd also like to flag that fp16 and fp64 denormal handling is a separate flag ... but also, in many cases, the same flag

@fabianmcg fabianmcg requested a review from kuhar September 8, 2025 14:31
Copy link
Member

@kuhar kuhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but please wait for a review from Ben before merging

Copy link
Contributor

@qedawkins qedawkins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple comments about naming and code location, otherwise LGTM. Thanks for making it work in IR and not just behind a flag!

Signed-off-by: Fabian Mora <[email protected]>
Signed-off-by: Fabian Mora <[email protected]>
Signed-off-by: Fabian Mora <[email protected]>
Signed-off-by: Fabian Mora <[email protected]>
Signed-off-by: Fabian Mora <[email protected]>
Copy link
Contributor

@MaheshRavishankar MaheshRavishankar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the flag added here for testing purpose? This is going to flush all the denormals right? I thought the point was that we only do this for dispatches that we know this is OK to do?

@fabianmcg
Copy link
Contributor Author

fabianmcg commented Sep 10, 2025

This is to have a fallback option to set a default value, similar to global behavior flags in clang. The other patch, will allow to set it for specific dispatches. If that's not wanted I can drop this patch.

Copy link
Contributor

@MaheshRavishankar MaheshRavishankar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks fine to me. Thanks!

@fabianmcg fabianmcg merged commit ef18fd1 into iree-org:main Sep 11, 2025
46 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants