Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[libc++] The representation of bounded iterators inhibits Clang vectorization #108600

Open
ldionne opened this issue Sep 13, 2024 · 4 comments
Open
Labels
hardening Issues related to the hardening effort libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.

Comments

@ldionne
Copy link
Member

ldionne commented Sep 13, 2024

With test code like this:

template <class Span>
__attribute__((noinline)) int testForLoop(Span span) {
  int sum = 0;
  for (auto it = span.begin() + 1; it < span.end(); ++it) {
    sum += *it - *(it - 1);
  }
  return sum;
}

Without bounded iterators, Clang vectorizes this code. When we enable bounded iterators though, Clang isn't able to vectorize anymore.

For information, the current representation of __bounded_iter is as follows:

_Iterator __current_;       // current iterator
_Iterator __begin_, __end_; // valid range represented as [begin, end]

I played around with changing the representation of __bounded_iter to see if that could make any difference, and it looks like the following representation instead allows Clang to vectorize this code:

size_t __current_;       // current index inside [begin, end]
_Iterator __begin_, __end_; // valid range represented as [begin, end]

Funnily enough, it seems that this representation leads to better code than the version that doesn't use bounded iterators!

Godbolt with a pointer (status quo): https://godbolt.org/z/efa77anqK
Godbolt with the index: https://godbolt.org/z/f9n5zs8sv

@ldionne ldionne added libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. hardening Issues related to the hardening effort labels Sep 13, 2024
@ldionne
Copy link
Member Author

ldionne commented Sep 13, 2024

CC @var-const @davidben

@davidben
Copy link
Contributor

Interesting. If you write a much simpler loop, it vectorizes just fine:

  for (auto it = span.begin(); it < span.end(); ++it) {
    sum += *it;
  }

Looking at the generated code, I think the lack of vectorization is a red herring. Notice how there are branches inside the loop that jump to a ud2 instruction. One of the safety checks is not getting optimized out. That is probably what is impeding the vectorization.

I think the issue is that the compiler can't tell that *(it - 1) is okay. Here's a slightly more minimized version with the same problem:
https://godbolt.org/z/c67shTf13
(To ensure an apples-to-apples comparison, I added a span.empty() check because otherwise the safety check in setting up the loop is correct.)

That gives an even more minimized version:
https://godbolt.org/z/YxT9dK1qT

The compiler should delete the safety check, but it's still emitting one. Looks like it's checking for the length being negative, so we might have another variation of #78784? Not positive.

Funnily enough, it seems that this representation leads to better code than the version that doesn't use bounded iterators!

Hmm. #101372 revealed an issue where the compiler's pointer reasoning is worse than the compiler's integer reasoning due to some alignment issues. Perhaps something similar is happening?

@davidben
Copy link
Contributor

(@danakj FYI)

@davidben
Copy link
Contributor

davidben commented Sep 13, 2024

Actually my last minimization might not be correct. I think that's a different issue, which is that !span.empty() => span.begin() + 1 isn't getting optimized:
https://godbolt.org/z/EsjW1Y35G

*(it - 1) might be something else, like a loop invariant not being discovered.

One way or another, there seem to be a lot of gaps in the compiler's ability to do bounds-related analysis! :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hardening Issues related to the hardening effort libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.
Projects
None yet
Development

No branches or pull requests

2 participants