Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consolidate throw and rethrow #113

Closed
RossTate opened this issue Jun 8, 2020 · 24 comments
Closed

Consolidate throw and rethrow #113

RossTate opened this issue Jun 8, 2020 · 24 comments

Comments

@RossTate
Copy link
Contributor

RossTate commented Jun 8, 2020

I think the original reasons for keeping these as distinct concepts have sort of dissolved(?). (I have no idea what the right word to use is.)

We should have just new_exn or alloc_exn or something of the like, and throw but taking an exnref.

If it helps prove the point here's how you can implement new_exn $exception now:

try { throw $event } catch {}
@rossberg
Copy link
Member

rossberg commented Jun 8, 2020

No, they haven't. Rethrow operates on an exnpackage, which generally contains more information than just the exception value. In practice, e.g. a stack trace that is copied over on rethrow, but newly generated on throw.

More interestingly, if you were to add resumption, then the exnpackage would carry a continuation, and that is also only captured by throw, not by rethrow. So they are fundamentally different operations.

@RossTate
Copy link
Contributor Author

RossTate commented Jun 8, 2020

In practice, e.g. a stack trace that is copied over on rethrow, but newly generated on throw.

WebAssembly does not have a stack-trace semantics, nor does C++, nor does JavaScript. Java, C#, and Python do have stack-trace semantics, and all of them are different. There is no practice here; it's all ad hoc. Regardless, there's nothing preventing my revised throw from preserving the stack trace.

More interestingly, if you were to add resumption, then the exnpackage would carry a continuation, and that is also only captured by throw, not by rethrow.

As you also noted twice, you can't use this design for resumable exceptions. The C++ destructors on the stack can already have executed by the time the handler is reached.

But now I realize it doesn't matter anyways. My suggestion is equivalent to the current proposal, as in the two can encode the other. But it has a clearer separation of functionality: one instruction for creating the exception, and one exception for raising the exception.

@aheejin
Copy link
Member

aheejin commented Jun 9, 2020

I think this is mostly a choice of flavor. We considered splitting those instructions too, and ended up with the current throw for simplicity (and one less instruction). Not sure if it's worth revisiting this unless there's a serious issue.

@rossberg
Copy link
Member

rossberg commented Jun 9, 2020

There is no practice here; it's all ad hoc.

Agreed it's all ad-hoc, but it still is rather relevant practice. This has been an actual pain point in JS engines, for example, where the language fails to make this distinction, and there is no viable implementation strategy that does not loose stack traces in some cases.

Regardless, there's nothing preventing my revised throw from preserving the stack trace.

I think you could only do this by either mutating the exception value, i.e., expose the same hacky behaviour as JS, or worse, by mutating some global state.

you can't use this design for resumable exceptions

Probably true, but consider these two as examples of why it's better to assume that there is a conceptual difference between the two operations. And hence that it is more conservative to separate them.

@RossTate
Copy link
Contributor Author

@aheejin, your parenthetical makes me think there's a misunderstanding. I'm suggesting replacing two instructions, throw and rethrow, with two instructions, new_exn and throw (where the latter is just rethrow renamed). So neither version has fewer instructions.

I think you could only do this by either mutating the exception value, i.e., expose the same hacky behaviour as JS, or worse, by mutating some global state.

@rossberg, you can collect the trace when you execute new_exn and store it once-and-forall in the new exnref. throw doesn't touch the stack trace. This is the same semantics given by the translation of these instructions into the current proposal.

I do not follow your last argument.

@aheejin
Copy link
Member

aheejin commented Jun 10, 2020

@aheejin, your parenthetical makes me think there's a misunderstanding. I'm suggesting replacing two instructions, throw and rethrow, with two instructions, new_exn and throw (where the latter is just rethrow renamed). So neither version has fewer instructions.

What I meant by fewer instructions is in terms of the number of instructions used in real code. When you throw an exception, now we can do it with a single throw, whereas in your version we need two instructions to achieve the same thing.

This difference may not be very significant in terms of final code size given that throwing exceptions is rare, but it's still simpler. That's the reason why I said this is mostly a choice of flavor.

@aheejin aheejin closed this as completed Jun 10, 2020
@aheejin aheejin reopened this Jun 10, 2020
@aheejin
Copy link
Member

aheejin commented Jun 10, 2020

Sorry I was going to click "Comment" button I accidentally clicked "Close and comment" button. Reopened the issue.

@RossTate
Copy link
Contributor Author

Ah, thanks for the clarification!

@rossberg
Copy link
Member

@RossTate:

I think you could only do this by either mutating the exception value, i.e., expose the same hacky behaviour as JS, or worse, by mutating some global state.

@rossberg, you can collect the trace when you execute new_exn and store it once-and-forall in the new exnref. throw doesn't touch the stack trace.

But the stack trace ought to be captures at the throw point, not the point where you create the value, which might be somewhere else entirely. Likewise, you may want to amend the trace upon rethrow, because again that may happen somewhere else entirely than where the catch was.

I do not follow your last argument.

We can think of two examples of meta data that one might include into exception packages (traces, continuations). For both we already know that the suggested approach cannot generally produce the correct behaviour. That's a strong indication to be careful and not conflate two operations that behave differently.

@RossTate
Copy link
Contributor Author

But the stack trace ought to be captures at the throw point

I don't understand how you can make statements about what stack-trace semantics ought to be. Different languages have different semantics (including collect the stack trace at the point of allocation). WebAssembly should not be prescribing which of these is correct; it should be providing infrastructure to support each of these semantics and let the language implementer choose which to use. (And I doubt either of us have enough knowledge of the debugging literature to know which semantics is best anyways, supposing any best practice has even been established.)

We can think of two examples of meta data that one might include into exception packages (traces, continuations).

As mentioned above, neither of these pan out. If a host chooses to associate a stack trace for some internal purpose, however it does so is guaranteed to be arbitrary. If you were to associate a continuation, you wouldn't want to use it because a number of the destructors on the stack have generally been executed already. And regardless, even if you did do these two things, the translation I gave above is still semantics preserving (assuming program equivalences that I think we both agreed we expected to hold in other conversations).

That's a strong indication to be careful and not conflate two operations that behave differently.

@aheejin gave an observable difference between the suggestion above and the status quo, and an argument based on that difference as to why to prefer the status quo. @rossberg, as I believe you are the originator of this design, can you give a similar observable difference (or a useful concrete extension in which this difference matters) to justify this claim?

@rossberg
Copy link
Member

rossberg commented Jun 15, 2020

@RossTate, one main motivation for the current design was that we wanted to keep the door open for resumption. As mentioned, throw and rethrow are very different primitives in the presence of resumption. Moreover, the return type of a throw depends on the exception constructor. It is more natural to have throw applied to a constructor, symmetrically determining both in and out type (this is standard in effect systems, too). You could still separate exception construction and throw, but that requires more tedious typing machinery to track the resumption type.

In short, if you ever want resumption, then separating rethrow is a must, and not separating exn creation is more natural.

I think we also wanted to avoid introducing an extra type of exception values, which again raises issues about life time management etc.

If you were to associate a continuation, you wouldn't want to use it because a number of the destructors on the stack have generally been executed already.

No, as I pointed out before, that's not how destructors (or any finalizers) can work in the presence of resumable exceptions. They must not run when a resumable exception is thrown, that would be a broken semantics. Instead, a separate abort/finalize primitive is needed (which could be provided by means of injecting an unresumable exception, as in our design).

@RossTate
Copy link
Contributor Author

Okay, it seems it will help clarify things if you explain how you plan to address the following:

event $res : [i32] -> [i32]
event $err : [i32];

try {
    do some constructor;
    try {
         if (flip_some_coin()) {
             throw $res (i32.const 1);
         } else {
             throw $err (i32.const 5);
         }
    } catch {
         do some destructor;
         rethrow caught exnref;
    }
    do some destructor;
} catch {
    if (flip_some_coin()) {
        resume caught exnref with (i32.const 5);
    }
}

Suppose the try/catch destructor code in the middle is really in some other module than the innermost and outermost code. How is it supposed to know that 1) the exnref is resumable 2) that the exnref will be resumed?

aheejin pushed a commit to aheejin/exception-handling that referenced this issue Jun 15, 2020
Refactor segment representation in AST (in both spec and interpreter) by separating out a `segment_mode`.

Other changes in Spec:
- Various fixes to text format grammar of segments.
- Factor out elemkind in binary format.
- Add note about possible future extension of element kinds.
- Add note about interpretation of segment kinds as bitfields.
- Fix some cross references.

Other changes in Interpreter:
- Rename {table,memory}_segment to {elem,data}_segment.
- Some rename elem to elem_expr and global.value to global.ginit for consistency with spec.
- Decode the elem segment kind as vu32 to maintain backwards compat.
- Some code simplifications / beautifications.
@rossberg
Copy link
Member

As I pointed out before, you have to distinguish a try instruction with resumption and try without. Each can only catch exceptions of the corresponding kind. They also have different types, because one provides a continuation, while the other doesn't. So you'll need to be more specific which try is which in your example. A (correct) handler for resumable exceptions ought to be linear in the exnref, i.e., it either resumes it or explicitly aborts it.

@RossTate
Copy link
Contributor Author

RossTate commented Jun 16, 2020

Okay, so it sounds like your plan is to effectively have a new kind of throw that ignores existing catches and instead only uses a new kind of catch, and yet you're saying this new kind of throw and new kind of catch should still use the existing exnref despite having the same kind of behavior. This sounds like creating an entirely new exception-handling design but then cramming its values into the preexisting type in order to call it an extension to the existing design. On top of that, you've added linear typing to patch the problems with using continuations (and not discussed what happens when, say, a call made in the handler throws an exception while the linear reference to the exnref being handled is still live).

This is a lot of complexity and uncertainty just to claim that throw and rethrow are substantially different.

@RossTate
Copy link
Contributor Author

Actually, I'm not sure there's a way to make the abort phase work for my best guess as to what you have in mind. But rather than lay out the concern based on a guess, per your request elsewhere I'll first ask you to lay out in more detail what you have in mind.

@rossberg
Copy link
Member

When throwing a resumable exception, anything on the stack is still live, so invoking destructors would obviously be unsound. There isn't much leeway. Ordinary exceptions must unwind the stack, resumable ones must not.

Our original idea was to introduce the distinction simply by annotating exception definitions, throw and try instructions, and the exnref type with a resumable flag. (There are additional reasons for this, such as a resumable try having to allocate a new stack, which is costly.) Linearity wouldn't be tracked in the type system, of course, but it's the producers' responsibility to use continuations in a linear manner.

The simplest way to "abort" a continuation and unwind the stack is to inject a regular exception at the resumption point. So there would be a resume_throw instruction to do just that.

But as you say, and as I pointed out during my Feb presentation, this ultimately leads to two almost disjoint sets of primitives. Which is why we don't propose this design anymore. Yet I wouldn't disregard the insights gained from this design experiment about the nature of exceptions, esp that initiating an exception and forwarding an exception are different operations (which also shows up in implementations, FWIW; C++ makes the distinction explicit as well).

@RossTate
Copy link
Contributor Author

It sounds like insisting these are different operations accidentally misled you and others to think this design was more extensible than it is. As a consequence, we're now in a position where we need something as heavyweight as continuations just to implement exception filtering and various other features that without this backwards-compatibility constraint would otherwise only need two-phase exception handling.

C++ makes the distinction explicit as well

Actually, the only distinction in C++ between throw; and throw e; (where e is the caught exception) is that the latter uses the copy constructor. Some months ago I even checked to see what a few implementations reported for stack traces, and if they reported anything they only reported the stack trace from throw; onwards, not the stack trace from where the exception was originally thrown. So, a C++ throw; corresponds more accurately to throw in the current proposal than to rethrow. This is another sign that the design of the current proposal was misinformed by misunderstandings of low-level exception handling.

@rossberg
Copy link
Member

It sounds like insisting these are different operations accidentally misled you and others to think this design was more extensible than it is. As a consequence, we're now in a position where we need something as heavyweight as continuations just to implement exception filtering and various other features that without this backwards-compatibility constraint would otherwise only need two-phase exception handling.

I don't see how the throw/rethrow distinction implies anything like that.

@aheejin
Copy link
Member

aheejin commented Aug 17, 2020

I'm wondering how this discussion might change in light of recent developments. The new version of resumable EH proposal (WebAssembly/design#1359) does not have evtref anymore and AFAICT it does not have a concept of something that resembles rethrowing, right?

And also in #123 we discussed removal of exnref. Then rethrow instruction will be like something in its first version of the proposal; it does not take an exnref argument, and it can only occur within a catch body.

In the first proposal, it used to take an immediate argument, which specifies which exception in the current EH stack to rethrow. For example,

try
  ...
catch  ;; (1)
  try
    ...
  catch  ;; (2)
    rethrow 0  ;; rethrows exception caught by (2)
    rethrow 1  ;; rethrows exception caught by (1)
  end
end

Then we have two questions to answer:

  1. Do we still need rethrow instruction (which does not take an exnref and occurs only within a catch body)?

I think the original rationale of rethrow that it can embed more info such as stack traces, is still valid. It may lose its importance in the presence of two-phase unwinding. But some languages may still want to embed their stack traces within an exception object, if they choose a similar way to JS. Also, even if we end up supporting two-phase unwinding, the EH proposal will likely to be split into two proposals, and the MVP EH proposal may not contain the two-phase unwinding capability yet.

  1. If we keep rethrow, should we restore its immediate argument in the first proposal too?

I'm little unsure where we are going to use this functionality, and I'm not sure if this will be compatible with two-phase unwinding. But I might be mistaken.

What do you think?


I also considered a possibility of changing catch instruction more like cont.resume, so that we merge all catch instructions into one. For example,

try
  ...
catch $exn0 $label0 $exn1 $label1, ...

which means, if an exception $exn0 is caught, we branch to $label0, and if $exn1 is caught, we branch to $label1, and so on.

But as @rossberg pointed out, this has two problems:

  1. How to represent catch_all?

I think we can make a rule such as the (optional) last label as catch_alls target label, or something. I'm not very sure how easy this is to encode in binary though.

  1. What we should do with rethrow? Now we don't have exnref and rethrow should occur within a catch body. But with this version catch bodies don't exist.

I don't know an answer to this. If we keep rethrow, I don't think we can make catch look like cont.resume, and I'm fine with that. This is just some bikeshedding after all.

@RossTate
Copy link
Contributor Author

For designing catch_all, it would help to know what its use cases are. As a non-example, the C++ catch (...) construct would probably not compile to catch_all; it's still supposed to support operations like std::current_exception that need a pointer to the exception object, and so it would probably compile to catch $cpp_exception.

@aheejin
Copy link
Member

aheejin commented Aug 17, 2020

For designing catch_all, it would help to know what its use cases are. As a non-example, the C++ catch (...) construct would probably not compile to catch_all; it's still supposed to support operations like std::current_exception that need a pointer to the exception object, and so it would probably compile to catch $cpp_exception.

The spec says catch (...) catches foreign exceptions. So I think catch (...) should be compiled to catch_all after all. (catch (...) in the current implementation is missing that part, which means it does not catch foreign exceptions yet. It was mostly for ease of development in the prototype.) But anyway, catch (...) was not the reason we introduced catch_all in the first version of the proposal.

IIRC catch_all was first introduced to give wasm modules generated from any language a capability to handle foreign exceptions they don't know about. So for example, wasm module A, compiled from languageA, may want to catch foreign exceptions and print some error message, even if module A does not understand internals of the exception.

Also it was used to run destructors too. But if we add unwind this may not be necessary.

Not sure what you mean by std::current_exception. FYI it returns null for foreign exceptions.

@RossTate
Copy link
Contributor Author

IIRC catch_all was first introduced to give wasm modules generated from any language a capability to handle foreign exceptions they don't know about.

So the issue is that the surface language needs some catch-all construct for this to be generated in the first place. That's why I brought up catch(...).

The spec says catch (...) catches foreign exceptions.

Technically, that spec is about a particular ABI, not C++ in general. Googling around, it seems that different compilers/systems give different semantics to catch (...). Some have it ignore foreign exceptions entirely, and some have it even catch system errors. In fact, Visual Studio used to have them catch SEH exceptions, but changed them to not (unless you set a specific flag), seemingly because it was causing problems for applications. So the question is: what is a particular application using catch (...) expecting?

FYI it returns null for foreign exceptions.

Ah, useful information. Thanks! (Though, again, this might be specific to Itanium.)


Just to clarify, I'm not opposed to catch_all. But it does introduce a number of questions about design choices and feature interactions. For example, should it catch traps? Traps get turned into JS exceptions, which presumably would be caught by catch_all. So my inclination is to leave it out of the EH proposal for now, both so that EH doesn't get delayed by these deliberations, and so that we can have time to investigate the use cases and come up with a good design for them.

As a separate thought, I'm wondering if it would be worthwhile to develop a WebAssembly-specific ABI for C++.

@aheejin
Copy link
Member

aheejin commented Aug 17, 2020

Technically, that spec is about a particular ABI, not C++ in general. Googling around, it seems that different compilers/systems give different semantics to catch (...). Some have it ignore foreign exceptions entirely, and some have it even catch system errors. In fact, Visual Studio used to have them catch SEH exceptions, but changed them to not (unless you set a specific flag), seemingly because it was causing problems for applications. So the question is: what is a particular application using catch (...) expecting?

Then it is up to our tool convention whether we should catch foreign exceptions with catch (...). Some particular applications we talked about, IIRC, were something like what I mentioned above; printing or handling unexpected foreign exceptions by printing error messages or something.

Just to clarify, I'm not opposed to catch_all. But it does introduce a number of questions about design choices and feature interactions. For example, should it catch traps? Traps get turned into JS exceptions, which presumably would be caught by catch_all.

Traps should not be caught by catch_all. It was what we decided after long discussions; even if it turns into a class of JS exception, we let them have a specific internal flag so we can exclude them. Please see #93 for details.

So my inclination is to leave it out of the EH proposal for now, both so that EH doesn't get delayed by these deliberations, and so that we can have time to investigate the use cases and come up with a good design for them.

I'm nervous about removing one of the functionalities we had from the beginning. And as I said, I don't think the argument about its uses (handling foreign exceptions appropriately or gracefully) has changed. Also given that we are already likely to divide the EH proposal into two steps if we decide to support two-phase unwinding, I'm not sure if I want to divide the EH proposal further. I wouldn't want something like 4 different steps for the EH proposal. Also, even though we don't consider that as a part of this proposal, we are likely to have a related proposal on resumable exceptions (or stack switching or anything in that vein).

As a separate thought, I'm wondering if it would be worthwhile to develop a WebAssembly-specific ABI for C++.

We have some of that in https://github.com/WebAssembly/tool-conventions. EH scheme doc there is not very up-to-date; it was written like 2-3 years ago and it does not reflect the current status of implementation, so please don't consult it. But this is the repo we should put any ABI stuff specific to wasm.

@RossTate
Copy link
Contributor Author

Oh wait, I just remembered that I had thought through this. You can implement catch_all with try/unwind. Any unwind clause can cut the unwinding process short by simply branching to a label outside the unwind clause.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants