fix g++ warnings and always_inline usage #10
Signed-off-by: Joshua Henderson <[email protected]>
layout.h: In function ‘lay_vec4 lay_vec4_xyzw(lay_scalar, lay_scalar, lay_scalar, lay_scalar)’:
layout.h:231:33: warning: ISO C++ forbids compound-literals [-Wpedantic]
return (lay_vec4){x, y, z, w};
Signed-off-by: Joshua Henderson <[email protected]>
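For reference, one way to avoid the compound-literal warning is to build the value through a named local. This is only a sketch: `demo_scalar`, `demo_vec4`, and `demo_vec4_xyzw` are hypothetical stand-ins, and the struct layout is an assumption rather than the library's actual type.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the library's scalar and 4-component types. */
typedef int16_t demo_scalar;
typedef struct { demo_scalar x, y, z, w; } demo_vec4;

/* Builds the value through a named local instead of a compound literal,
   so the same source is accepted as C and as pedantic C++98. */
demo_vec4 demo_vec4_xyzw(demo_scalar x, demo_scalar y, demo_scalar z, demo_scalar w)
{
    demo_vec4 result;
    result.x = x;
    result.y = y;
    result.z = z;
    result.w = w;
    return result;
}

/* Small usage check: the components come back in the order they went in. */
void demo_vec4_check(void)
{
    demo_vec4 v = demo_vec4_xyzw(1, 2, 3, 4);
    assert(v.x == 1 && v.y == 2 && v.z == 3 && v.w == 4);
}
```

The trade-off is a few extra lines per constructor in exchange for not depending on a C99 feature that g++ only accepts as an extension.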
On ARM9, telling gcc to always inline with the gcc attribute is causing some random behavior with layout. Just defaulting to the normal inline keyword seems to be fine.
Signed-off-by: Joshua Henderson <[email protected]>
Hi! Sorry this took me so long to get to.
Does this happen with Clang as well, or only GCC?
(The reason I use always_inline is that it can sometimes allow the compiler to skip inline analysis work, potentially slightly speeding up compilation times.)
Good question. I have not tried clang, only various versions of g++.
I don't have any ARM9 hardware to test on, unfortunately. But maybe you can do it. I've created a new branch named (I currently have it hard-coded to the name of the compilers in my test VM, like Once you've set it up, try running:
And it should build and run the test battery for those configs. I've also fixed some of the warnings that your changes fix (in a slightly different fashion, with C++98 compat). Let me know what the results are.
Good news: the test picks up the failure on ARM9. Because of cross compiling, it was quicker to skip using tools to compile and run. I just show the command line instead.
$ arm-buildroot-linux-gnueabi-g++ --version
test branch (e9bc8f2):
test branch (e9bc8f2) + cherry-pick 7c37742:
So, in summary, the only difference is commit 7c37742. I think this pretty clearly isolates the issue to a side effect of using always_inline. Note that I do not see this issue on other ARM cores like Cortex-A5, or even on x86_64.
It's worth noting that if I load up the good case and the bad case in gdb and break at test_layout:430, all local variables have the exact same value! Except, if I step, then only one of the conditions fails. They are operating on the exact same data (or so it appears), yet one of the comparisons fails in LTEST_VEC4EQ().
I looked into this a bit more. Above, in both cases I compiled without any gcc optimization. If I instead turn on seemingly any gcc optimization level (-O1, -O2, -O3) in the failure case, it now magically works just fine. Which takes me to the gcc definition of the attribute (https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html) for a clue.
always_inline
OK, so what happens if I take the inline off.
Maybe always_inline is indeed inlining even when gcc would not normally be choosing to do so safely. I didn't go as far as looking at the generated code difference, but I suspect that would reveal the real problem. Maybe an alternate fix is to just remove always_inline when optimization is off? Using always_inline still seems risky here in any event.
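The "remove always_inline when optimization is off" idea could look roughly like the sketch below. `DEMO_FORCE_INLINE` and `demo_clamp` are hypothetical names, not the library's macros. `__OPTIMIZE__` is the macro GCC predefines in optimizing builds (Clang defines it as well). Note that this only changes where inlining is forced; it would not address any underlying code generation problem.

```c
#include <assert.h>

/* Force inlining only in optimized GCC/Clang builds; otherwise fall back
   to the plain inline keyword and let the compiler decide. */
#if (defined(__GNUC__) || defined(__clang__)) && defined(__OPTIMIZE__)
#define DEMO_FORCE_INLINE static inline __attribute__((always_inline))
#else
#define DEMO_FORCE_INLINE static inline
#endif

/* Hypothetical small helper of the kind that would carry the macro. */
DEMO_FORCE_INLINE int demo_clamp(int value, int lo, int hi)
{
    if (value < lo) return lo;
    if (value > hi) return hi;
    return value;
}

/* Behavior is the same whichever branch of the macro was selected. */
void demo_clamp_check(void)
{
    assert(demo_clamp(-3, 0, 10) == 0);
    assert(demo_clamp(5, 0, 10) == 5);
}
```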
Wow, thanks for the in-depth report! That's pretty interesting. I'm really surprised GCC would continue to inline in an unsafe condition -- it's not supposed to do that. There might be a bug in the Can you try it with undefined behavior and address sanitizer enabled? My
On test branch as is.
And adding both flags at the same time is also a pass with no errors. So, I think this instrumentation is masking the problem.
OK, that's kind of interesting. I'm trying to guess at what may be going on. Does your build environment have stuff like stack protection or fortify enabled by default? I'm wondering if doing stuff like enabling address sanitizer or other options is causing the stack frame for that code to change alignment or have it shuffled around, thus exposing or masking the problem depending on what options are enabled or disabled.
On most Linux build environments you can disable stack protection (which may or may not be on by default) with something like
And you can turn it on explicitly (or make it stronger) with
Another thing to try is to disable PIE/PIC:
I went ahead and root caused it at this point. In an effort to try to isolate the problem, I got it down to a single function. With everything else inlined as usual except this function, the problem goes away.
If I put the inline back the assembly for
is
Now, as a nifty side effect, if I put volatile on the child variable and make no other change, I instead get this assembly (and it masks the problem just like removing always_inline masks the problem and instrumenting the code masks the problem).
Notice that in the first version without volatile (the original code, which fails), the strd at the end with an offset of 2 is suspicious. Moving into a doubleword from offset 2? That offset of 2 comes directly from the aligned value on the vector type.
My take on what's happening here is the same thing that often used to happen with packed on structs. You get unaligned values and gcc does not handle them properly. The use of vector here is doing the same thing. It's 2-byte aligned and gcc isn't handling it. What gcc would know how to handle correctly is what you did for WIN32.
Or, take the alignment off the lay_vec4. At this point removing always_inline or adding in volatile are not fixing the real problem. They are just covering up the alignment issue for the test cases.
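To make the alignment point concrete, here is a small sketch contrasting an under-aligned vector typedef with a plain struct. All `demo_` names are hypothetical, and the vector typedef is an assumed shape for the type under discussion, not a copy of the library's definition. The address helper checks the 8-byte alignment that a doubleword store needs on cores without unaligned access support.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__GNUC__) || defined(__clang__)
/* Assumed shape: 8 bytes of data, but the typedef-level aligned attribute
   lowers the required alignment to that of a single int16_t. */
typedef int16_t demo_vec4_simd __attribute__((__vector_size__(8), aligned(2)));
#define DEMO_HAVE_SIMD 1
#else
#define DEMO_HAVE_SIMD 0
#endif

/* Plain-struct alternative: same 8 bytes, natural alignment of int16_t. */
typedef struct { int16_t xyzw[4]; } demo_vec4_plain;

size_t demo_plain_alignment(void)
{
    return _Alignof(demo_vec4_plain);
}

/* Reports the alignment the compiler assumes for the vector typedef,
   or 0 when the GNU vector extension is unavailable. */
size_t demo_simd_alignment(void)
{
#if DEMO_HAVE_SIMD
    return __alignof__(demo_vec4_simd);
#else
    return 0;
#endif
}

/* Returns 1 when the address is a multiple of 8. */
int demo_is_doubleword_aligned(const void *p)
{
    return ((uintptr_t)p % 8u) == 0;
}

void demo_alignment_check(void)
{
    _Alignas(8) unsigned char buffer[16];
    assert(demo_is_doubleword_aligned(buffer));
    /* An object placed at offset 2 satisfies 2-byte alignment only. */
    assert(!demo_is_doubleword_aligned(buffer + 2));
}
```

An object of the vector type may legally sit at any even address, so any instruction that needs 8-byte alignment is only safe if the compiler accounts for that.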
Great job! That must be it. I think the aligned attribute is not doing what it's supposed to be doing here for gcc:
This seems like a bug in gcc. Either way, But maybe this is something that should be reported? I could also be misunderstanding what this specifier is supposed to do, but it doesn't seem like you should be able to use it to get the compiler to generate bad code in this case. If
Sorry, another question: Does the misaligned access occur if I'm trying to find a way to easily remove the alignment specifier without having the compiler insert extra padding (which shouldn't be necessary for
No it does not fail with float. It also does not fail if:
Thanks. Sorry I haven't finished this yet -- I'm traveling right now and don't have access to my normal testing environment (a bunch of VMs).
I'm back from traveling. I have a way to work around this by reordering two fields and changing the alignment requirement, and it shouldn't add any extra padding. But the memory layout will be different, so hopefully nobody was relying on that not changing.
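As a generic illustration of how field order interacts with alignment and padding, consider the sketch below. The structs and field names are hypothetical and are not the actual change made in the library; the sizes assume the usual 4-byte `uint32_t` and 2-byte `int16_t`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 8-byte vector with an 8-byte alignment requirement. */
typedef struct { _Alignas(8) int16_t xyzw[4]; } demo_vec4;
/* Hypothetical 4-byte vector with the natural alignment of int16_t. */
typedef struct { int16_t xy[2]; } demo_vec2;

/* The 8-aligned member follows three 4-byte fields, so the compiler
   inserts 4 bytes of padding before it and 4 more at the end. */
typedef struct {
    uint32_t flags;
    uint32_t first_child;
    uint32_t next_sibling;
    demo_vec4 margins;
    demo_vec2 size;
} demo_item_padded;

/* Same fields with two of them swapped: the 8-aligned member now lands
   on offset 8 naturally and no padding is needed anywhere. */
typedef struct {
    uint32_t flags;
    uint32_t first_child;
    demo_vec4 margins;
    uint32_t next_sibling;
    demo_vec2 size;
} demo_item_reordered;

void demo_layout_check(void)
{
    assert(offsetof(demo_item_padded, margins) == 16);
    assert(offsetof(demo_item_reordered, margins) == 8);
    assert(sizeof(demo_item_reordered) < sizeof(demo_item_padded));
}
```

Anything that serialized or memcpy'd the old struct would see a different byte layout after such a swap, which is the compatibility concern mentioned above.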
nice
Fix several g++ warnings. Also address the use of the gcc always_inline attribute which misbehaves on at least ARM9.