Skip to content

Conversation

@naoyam
Copy link
Collaborator

@naoyam naoyam commented Oct 2, 2025

ThreadPredicateMap is now used outside of lowering, so it can't assume GpuLower::current() is available. Hit an assertion error while testing #5278

@naoyam
Copy link
Collaborator Author

naoyam commented Oct 2, 2025

!test --diff

@github-actions
Copy link

github-actions bot commented Oct 2, 2025

Review updated until commit 0a1a7de

Description

  • Fixed assertion error when GpuLower::current() is unavailable

  • Made ThreadPredicateMap usable outside of lowering context

  • Improved code clarity with updated comments and TODO note

  • Initialized stride using fusion's oneVal safely


Changes walkthrough 📝

Relevant files
Bug fix
thread_predicate.cpp
Remove GpuLower::current dependency and harden logic         

csrc/device_lower/analysis/thread_predicate.cpp

  • Replaced GpuLower::current()->kernel()->oneVal() with safer
    fusion-based access
  • Added non-empty check for merged_logical_domains using NVF_ERROR
  • Updated comments to clarify function purpose and future improvements
  • Added TODO about optimizing predicate generation preparation
  • +10/-2   

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    🧪 No relevant tests
    ⚡ Recommended focus areas for review

    Possible Issue

    The function getIndexOfBroadcastLogicalDomains now uses merged_logical_domains.front()->fusion()->oneVal() instead of GpuLower::current()->kernel()->oneVal(). This change assumes that merged_logical_domains is non-empty, enforced by NVF_ERROR. However, the error check should be placed earlier to ensure safety, especially since this function may be used in contexts where the input vector could be empty.

    NVF_ERROR(!merged_logical_domains.empty());
    logical_stride.at(ndim - 1) =
        merged_logical_domains.front()->fusion()->oneVal();

    @naoyam naoyam requested a review from jjsjann123 October 3, 2025 03:23
    Copy link
    Collaborator

    @jjsjann123 jjsjann123 left a comment

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    code change looks straightforward.

    Asking some question about the context, but shouldn't block the PR going in.

    // TODO: ThreadPredicateMap may be used only for its anlaysis but
    // not for generating predicates. This function is currently called
    // no matter if it's just for the analysis but should be lazily
    // done.
    Copy link
    Collaborator

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    This function is currently called no matter if it's just for the analysis but should be lazily done.

    I don't follow that comment. This function doesn't seem to be messing with predicates in my monkey eyes.

    Copy link
    Collaborator Author

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    Updated the comment. Hopefully it's clearer.

    @naoyam
    Copy link
    Collaborator Author

    naoyam commented Oct 10, 2025

    !build

    @naoyam naoyam merged commit fc76733 into main Oct 10, 2025
    14 of 15 checks passed
    @naoyam naoyam deleted the thread_pred_fix branch October 10, 2025 08:43
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

    Labels

    None yet

    Projects

    None yet

    Development

    Successfully merging this pull request may close these issues.

    3 participants