Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[wasm][wasm-exceptions][c++] Destructors called too early when throwing call is in inline assembly #65116

Open
lambdageek opened this issue Aug 30, 2023 · 8 comments

Comments

@lambdageek
Copy link

lambdageek commented Aug 30, 2023

Repro: https://godbolt.org/z/8W8Eb9nr3

int foo();
int bar();

static
__attribute__((noinline))
int
baz(){
    int x = 0;
#if 1
    __asm(
        "try i32; "
        "call _bar;"
        "catch_all;"
        " rethrow 0;"
        " "
        "end_try;"
        "i32.const 1;"
        "i32.add;"
        "local.set %0;"
    : "=r"(x)
    );
#else
    try {
      x = 1 + bar ();
    } catch (...) {
        throw;
    }
#endif
    return x;
}

class X {
public:
    explicit X() {}
    ~X() {foo();};
};


int square(int num) {
    X x{}; // call to foo() happens here
    return baz();
    // expected the call to foo to happen after baz returns
}

When I compile this with -O -fexceptions -fwasm-exceptions I expect the destructor for x to be called after baz() returns (or when we're unwinding because it throws). Indeed that is the case if you change #if 1 to #if 0 and compile the C++ version of the code. On the other hand when the exception is in the inline asm code, the destructor is called before the call to baz().

Am I doing something wrong? Is there some way to tell clang that the inline wasm may throw?

@llvmbot
Copy link
Collaborator

llvmbot commented Aug 30, 2023

@llvm/issue-subscribers-backend-webassembly

@lambdageek
Copy link
Author

Note that using __asm volatile is only partly correct. In that case, X::~X() gets called after the call to baz, but the call to baz is not wrapped in a try/catch_all

https://godbolt.org/z/Mo17hYf1b

@lambdageek
Copy link
Author

lambdageek commented Aug 31, 2023

Actually I'm not sure this is wasm-specific. I don't see a way with either clang or gcc to write x86-64 inline asm that calls a C++ function that may throw and have that be evident to the C++ compiler.

I would believe "this isn't possible"/"don't do that"/"put the inline asm function in a separate translation unit" as possible resolutions. Feel free to close if that's the case.

@aheejin
Copy link
Member

aheejin commented Sep 2, 2023

Note that using __asm volatile is only partly correct. In that case, X::~X() gets called after the call to baz, but the call to baz is not wrapped in a try/catch_all

https://godbolt.org/z/Mo17hYf1b

Why should the call to baz be wrapped in a try/catch_all? There are two try~catch_alls: One is within baz itself, in the inline assembly, and the other is generated for the destructor of X. The first one cannot contain the call to itself. I also don't understand why the second one should contain the call to baz.

Actually I'm not sure this is wasm-specific. I don't see a way with either clang or gcc to write x86-64 inline asm that calls a C++ function that may throw and have that be evident to the C++ compiler.

I would believe "this isn't possible"/"don't do that"/"put the inline asm function in a separate translation unit" as possible resolutions. Feel free to close if that's the case.

I don't think it's wasm specific either. This happens within the common passes in the middle end. And __asm, without volatile, seems to be assumed not to have side effects. https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Volatile even says "GCC’s optimizers sometimes discard asm statements if they determine there is no need for the output variables".

So yeah I don't think only __asm expected to work. With __asm volatile I guess it should, but not sure why you think it doesn't.

By the way I'm curious, why are you doing this in inline assembly? What's the usecase for this that cannot be done in C++?

@lambdageek
Copy link
Author

Note that using __asm volatile is only partly correct. In that case, X::~X() gets called after the call to baz, but the call to baz is not wrapped in a try/catch_all
https://godbolt.org/z/Mo17hYf1b

Why should the call to baz be wrapped in a try/catch_all? There are two try~catch_alls: One is within baz itself, in the inline assembly, and the other is generated for the destructor of X. The first one cannot contain the call to itself. I also don't understand why the second one should contain the call to baz.

the catch_all in the inlined wasm has a rethrow, so the whole of baz is throwing.

The same example is problematic if you remove all of the try/catch_all/rethrow stuff in baz. It's the call to an unknown function _bar that should be considered potentially throwing.

Actually I'm not sure this is wasm-specific. I don't see a way with either clang or gcc to write x86-64 inline asm that calls a C++ function that may throw and have that be evident to the C++ compiler.
I would believe "this isn't possible"/"don't do that"/"put the inline asm function in a separate translation unit" as possible resolutions. Feel free to close if that's the case.

I don't think it's wasm specific either. This happens within the common passes in the middle end. And __asm, without volatile, seems to be assumed not to have side effects. https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Volatile even says "GCC’s optimizers sometimes discard asm statements if they determine there is no need for the output variables".

So yeah I don't think only __asm expected to work. With __asm volatile I guess it should, but not sure why you think it doesn't.

Everything I'm saying is also true about __asm volatile. the non-volatile version is bad code, I agree.

Here's a full example for linux x64 https://github.com/lambdageek/llvm-65116
build with make then run ./exampleWorks and ./exampleNoDtor

output:

leksey@buran:~/hacking/inline-asm$ ./exampleWorks 
in foo 1 times
in foo 2 times
caught something: throwingFunc called
GOOD: there were 2 calls to countCalls()
aleksey@buran:~/hacking/inline-asm$ ./exampleNoDtor 
in foo 1 times
caught something: throwingFunc called
BAD: there were 1 calls to countCalls(), expected 2

the only difference is whether the call in runMe is coming from outlinedIndirectCaller or indirectCaller (which is a static inline). When the optimizer can see the __asm volatile it deduces that indirectCaller does not throw. When all it sees is a call to outlinedIndirectCaller it is pessimistic and assumes (correctly) that the call is potentially throwing.

I agree this is a hard problem as I don't think gcc or clang really look at the content of asm.

By the way I'm curious, why are you doing this in inline assembly? What's the usecase for this that cannot be done in C++?

Well so my dayjob is to work on a language runtime that runs on WASM. One of the things we're investigating is using the wasm exceptions mechanism for our exceptions. So we would be using a tag other than _cpp_exception. I was playing around on godbolt to see if I could write some wasm assembly to have our C/C++ runtime catch our custom-tagged exceptions. But as a first step I just wanted to see if I could do try/catch/catch_all in inlined asm in clang, so I tried an example with normal C++ exceptions instead of our custom tag.

I think in reality we'd probably just write some .wat directly (so from the point of view of clang it would be like my outlinedIndirectCaller example). But while I was experimenting on godbolt, I happened to use inline assembly and I noticed that the destructor wasn't getting called as many times as I expected.

TL;DR - this isn't a blocking issue for me. and "this is not supported; don't do it like that" is a perfectly reasonable answer.

@aheejin
Copy link
Member

aheejin commented Sep 5, 2023

Note that using __asm volatile is only partly correct. In that case, X::~X() gets called after the call to baz, but the call to baz is not wrapped in a try/catch_all
https://godbolt.org/z/Mo17hYf1b

Why should the call to baz be wrapped in a try/catch_all? There are two try~catch_alls: One is within baz itself, in the inline assembly, and the other is generated for the destructor of X. The first one cannot contain the call to itself. I also don't understand why the second one should contain the call to baz.

the catch_all in the inlined wasm has a rethrow, so the whole of baz is throwing.

The same example is problematic if you remove all of the try/catch_all/rethrow stuff in baz. It's the call to an unknown function _bar that should be considered potentially throwing.

I don't understand. Yes, baz can throw. Why is this the reason that the call to baz should be wrapped in try-catch_all? X::~X() is called after baz and the try-catch_all within X::~X() should NOT wrap the call to baz. The reason there is try-catch_all genererated for X::~X() is nothing to do with baz; it is generated because destructors aren't allowed to throw. Why should call baz() be looped into X::~X()'s body?

I haven't checked your another example yet; I think I need to understand your thoughts on this first.

By the way I'm curious, why are you doing this in inline assembly? What's the usecase for this that cannot be done in C++?

Well so my dayjob is to work on a language runtime that runs on WASM.

Oh, can I ask what runtime is this? Also if you are thinking about implementing EH, please see WebAssembly/exception-handling#280 that we are planning some changes in the spec.

One of the things we're investigating is using the wasm exceptions mechanism for our exceptions. So we would be using a tag other than _cpp_exception. I was playing around on godbolt to see if I could write some wasm assembly to have our C/C++ runtime catch our custom-tagged exceptions. But as a first step I just wanted to see if I could do try/catch/catch_all in inlined asm in clang, so I tried an example with normal C++ exceptions instead of our custom tag.

I think in reality we'd probably just write some .wat directly (so from the point of view of clang it would be like my outlinedIndirectCaller example). But while I was experimenting on godbolt, I happened to use inline assembly and I noticed that the destructor wasn't getting called as many times as I expected.

Custom tags should work. We currently use 0 and 1 for C++ exceptions and C longjmps respectively so other numbers should be fine. But if you really want to do some low level experiments, I recommend using .wat files than LLVM inline assembly. (In .wat files you can define your own tags and use it by something like (tag $tag0 ...) and (throw $tag0 ...))

TL;DR - this isn't a blocking issue for me. and "this is not supported; don't do it like that" is a perfectly reasonable answer.

I generally would not recommend using inline assembly unless it is absolutely unavoidable, more so if you use control flow instructions like block or try. They can mess with block labels and branch immediates because they are not analyzable by the compiler backend.

@lambdageek
Copy link
Author

I don't understand. Yes, baz can throw. Why is this the reason that the call to baz should be wrapped in try-catch_all?

I'm not sure we're talking about the same thing. Suppose baz actually throws an exception. What should happen? The destructor X::~X() should run, right? In the code generated for the OP, the destructor X::~X() will not be called if baz throws. The destructor is only called if baz returns normally. This only happens if baz is in the same translation unit as the caller square. If baz is in a separate translation unit and nothing is known about its body, the code in square is correctly generated so that X::~X() will be called in both cases: if baz returns normally, or if it throws.

X::~X() is called after baz

Yes, if baz returns normally. If baz throws, X::~X() is not called.

and the try-catch_all within X::~X() should NOT wrap the call to baz.

I agree

The reason there is try-catch_all genererated for X::~X() is nothing to do with baz; it is generated because destructors aren't allowed to throw.

Yes I agree with this also.

Why should call baz() be looped into X::~X()'s body?

It should not. The body of X::~X is perfectly fine.

But the body of square should ensure that X::~X() is called even if baz throws.


Here's what you get for square if baz is a C++ function (I also added __attribute__((noinline)) to X::~X() so that we don't get distracted by its contents):

int
baz(){
    int x = 0;
    try {
      x = 1 + bar ();
    } catch (...) {
        throw;
    }
    return x;
}

int square(int num) {
    X x{};
    return baz();
}

We get

square(int):                             # @square(int)
        global.get      __stack_pointer
        i32.const       16
        i32.sub 
        local.tee       1
        global.set      __stack_pointer
        try             
        call    baz()
        local.set       2
        catch_all                               # catch0:
        local.get       1
        global.set      __stack_pointer
        local.get       1
        i32.const       15
        i32.add 
        call    X::~X() [base object destructor]
        drop
        rethrow         0                               # to caller
        end_try                                 # label0:
        local.get       1
        i32.const       15
        i32.add 
        call    X::~X() [base object destructor]
        drop
        local.get       1
        i32.const       16
        i32.add 
        global.set      __stack_pointer
        local.get       2
        end_function

Note there's two places that call X::~X() - the normal control flow calls it at label0: after the end_try. if baz returns normally, the destructor runs and then we return from square.
If baz throws, we end up in catch0 and we call X::~X() before we rethrow.

Now compare with what happens if baz is implemented as inline assembly in the same translation unit as square:

int
baz(){
    int x = 0;
    __asm volatile(
        "try i32; "
        "call _bar;"
        "catch_all;"
        " rethrow 0;"
        " "
        "end_try;"
        "i32.const 1;"
        "i32.add;"
        "local.set %0;"
    : "=r"(x)
    );
    return x;
}

int square(int num) {
    X x{}; // call to foo() happens here
    return baz();
    // expected the call to foo to happen after baz returns
}

In this case we get

square(int):                             # @square(int)
        global.get      __stack_pointer
        i32.const       16
        i32.sub 
        local.tee       1
        global.set      __stack_pointer
        call    baz()
        local.set       2
        local.get       1
        i32.const       15
        i32.add 
        call    X::~X() [base object destructor]
        drop
        local.get       1
        i32.const       16
        i32.add 
        global.set      __stack_pointer
        local.get       2
        end_function

Note that the code for square now calls baz and then calls X::~X(). So if baz doesn't throw this is fine. But baz does potentially throw because it calls _bar about which nothing is known. In that case the call baz() in square does not return, so the destructor X::~X() does not run.

Does that make sense? What matters is the code generated for square based on what clang can see about baz.


Oh, can I ask what runtime is this?

It's Mono's "jiterpreter". Currently we just piggyback on C++ exceptions - the generated code is called from a caller that has a C++ try/catch, and when the generated code needs to throw it invokes a C++ helper that throws a C++ exception. But I was curious if we could use a separate tag. (We would like to avoid the C++ wrapper try/catch and to generate our own try/catchall clauses into the JITed module, and it would be slightly cleaner if we didn't have to ask Emscripten to export the cpp_exception tag and if we just used our own.)

Also if you are thinking about implementing EH, please see WebAssembly/exception-handling#280 that we are planning some changes in the spec.

Yep, we're keeping an eye on the spec changes. Thanks for the heads up.

@aheejin
Copy link
Member

aheejin commented Sep 7, 2023

Oh sorry, yeah now I understand what you're saying. Yes it's weird that baz is assumed to be not throwing, which is not the case. Even if you compile the program with -O0, baz is marked as nounwind, which I think causes the invoke for it to be lowered to a plain call. I think this happens somewhere in clang but am not sure where. But I don't think it's a wasm specific thing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants