Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Optimization] Don't generate bad_variant_access logic for exhaustive visit using overloaded #74

Open
bhamiltoncx opened this issue Feb 28, 2020 · 8 comments

Comments

@bhamiltoncx
Copy link

bhamiltoncx commented Feb 28, 2020

Using clang from trunk with -std=c++17 -Oz, the following code uses overloaded with an exhaustive set of lambdas to visit all cases of the variant:

https://godbolt.org/z/k-r72L

#include <https://raw.githubusercontent.com/mpark/variant/single-header/master/variant.hpp>
#include <functional>
#include <string>
#include <stdio.h>

namespace {
template <class... Ts>
struct overloaded : Ts... {
using Ts::operator()...;
};
template <class... Ts>
overloaded(Ts...) -> overloaded<Ts...>;
}

enum class Foo { kValue };
enum class Bar { kValue };
using Callback = std::function<std::string(void)>;

using SumType = mpark::variant<Foo, Bar, Callback>;

void DoStuff(SumType s) {
  mpark::visit(overloaded{
      [](Foo f) { printf("Foo!\n");},
      [](Bar b) { printf("Bar!\n");},
      [](const Callback& c) { printf("Callback: %s\n", c().c_str()); },
  },
  s);
} 

Even though the lambdas exhaustively handle all cases, MPark.Variant still generates error handling logic to throw bad_variant_access exceptions (which as far as I can tell should never happen).

I think it'd be a nice optimization to drop the error handling entirely if the visitation is exhaustive.

Here's the current state of the optimized assembly in case this gets fixed:

DoStuff(mpark::variant<Foo, Bar, std::function<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > ()> >): # @DoStuff(mpark::variant<Foo, Bar, std::function<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > ()> >)
        push    r14
        push    rbx
        sub     rsp, 40
        mov     rbx, rdi
        cmp     byte ptr [rdi + 32], -1
        lea     rdi, [rsp + 7]
        sete    byte ptr [rdi]
        push    1
        pop     rsi
        call    mpark::detail::any(std::initializer_list<bool>)
        test    al, al
        jne     .LBB0_8
        movzx   ecx, byte ptr [rbx + 32]
        cmp     rcx, 255
        push    -1
        pop     rax
        cmovne  rax, rcx
        cmp     rax, 2
        je      .LBB0_6
        cmp     rax, 1
        je      .LBB0_5
        mov     edi, offset .Lstr
        jmp     .LBB0_4
.LBB0_5:
        mov     edi, offset .Lstr.4
.LBB0_4:
        call    puts
        jmp     .LBB0_7
.LBB0_6:
        lea     r14, [rsp + 8]
        mov     rdi, r14
        mov     rsi, rbx
        call    std::function<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > ()>::operator()() const
        mov     rsi, qword ptr [r14]
        mov     edi, offset .L.str.3
        xor     eax, eax
        call    printf
        mov     rdi, r14
        call    std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string() [base object destructor]
.LBB0_7:
        add     rsp, 40
        pop     rbx
        pop     r14
        ret
.LBB0_8:
        call    mpark::throw_bad_variant_access()
mpark::detail::any(std::initializer_list<bool>): # @mpark::detail::any(std::initializer_list<bool>)
        xor     eax, eax
.LBB1_1:                                # =>This Inner Loop Header: Depth=1
        cmp     rsi, rax
        je      .LBB1_4
        cmp     byte ptr [rdi + rax], 0
        lea     rax, [rax + 1]
        je      .LBB1_1
        mov     al, 1
        ret
.LBB1_4:
        xor     eax, eax
        ret
mpark::throw_bad_variant_access():  # @mpark::throw_bad_variant_access()
        push    rax
        push    8
        pop     rdi
        call    __cxa_allocate_exception
        mov     qword ptr [rax], offset vtable for mpark::bad_variant_access+16
        mov     esi, offset typeinfo for mpark::bad_variant_access
        mov     edx, offset std::exception::~exception() [base object destructor]
        mov     rdi, rax
        call    __cxa_throw
mpark::bad_variant_access::~bad_variant_access() [deleting destructor]:      # @mpark::bad_variant_access::~bad_variant_access() [deleting destructor]
        push    rbx
        mov     rbx, rdi
        call    std::exception::~exception() [base object destructor]
        mov     rdi, rbx
        pop     rbx
        jmp     operator delete(void*)                  # TAILCALL
mpark::bad_variant_access::what() const:  # @mpark::bad_variant_access::what() const
        mov     eax, offset .L.str
        ret
std::function<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > ()>::operator()() const: # @std::function<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > ()>::operator()() const
        push    rbx
        cmp     qword ptr [rsi + 16], 0
        je      .LBB5_2
        mov     rbx, rdi
        call    qword ptr [rsi + 24]
        mov     rax, rbx
        pop     rbx
        ret
.LBB5_2:
        call    std::__throw_bad_function_call()
typeinfo name for mpark::bad_variant_access:
        .asciz  "N5mpark18bad_variant_accessE"

typeinfo for mpark::bad_variant_access:
        .quad   vtable for __cxxabiv1::__si_class_type_info+16
        .quad   typeinfo name for mpark::bad_variant_access
        .quad   typeinfo for std::exception

vtable for mpark::bad_variant_access:
        .quad   0
        .quad   typeinfo for mpark::bad_variant_access
        .quad   std::exception::~exception() [base object destructor]
        .quad   mpark::bad_variant_access::~bad_variant_access() [deleting destructor]
        .quad   mpark::bad_variant_access::what() const

.L.str:
        .asciz  "bad_variant_access"

.L.str.3:
        .asciz  "Callback: %s\n"

.Lstr:
        .asciz  "Foo!"

.Lstr.4:
        .asciz  "Bar!"
@bhamiltoncx
Copy link
Author

I just learned about valueless_by_exception:

https://en.cppreference.com/w/cpp/utility/variant/valueless_by_exception

so I assume this logic is to implicitly deal with that case — if so, we can close this issue.

@bhamiltoncx
Copy link
Author

bhamiltoncx commented Feb 28, 2020

If it is possible to detect that these two cases from the spec are noexcept for all the types in the variant (maybe using the noexcept() operator — this would only work for C++17, since the noexcept specification is not part of the function type until then), we could probably elide the bad_variant_access logic in this case.

  • (guaranteed) an exception is thrown during the move initialization of the contained value from the temporary in copy assignment
  • (guaranteed) an exception is thrown during the move initialization of the contained value during move assignment

@falemagn
Copy link

falemagn commented Feb 16, 2021

I believe the problem lies in the fact that any type that has a conversion operator to one of the types the variant holds can throw an exception during the conversion, and since that's something that happens independently from the visitation, I can't see how visit would possibly detect that.

One such examples is given by cppreference itself:

struct S {
    operator int() { throw 42; }
};
std::variant<float, int> v{12.f}; // OK
v.emplace<1>(S()); // v may be valueless

@bhamiltoncx
Copy link
Author

bhamiltoncx commented Feb 16, 2021 via email

@falemagn
Copy link

falemagn commented Feb 17, 2021

Could we only do this optimization when exceptions are disabled?

Even though they're disabled in the current compilation unit, they might not haven been disabled in the compilation unit that handed you the variant you want to visit, so I guess that's not a check that can be performed.

However, a recent Twitter discussion brought to my attention an optimization that has been performed in libstdc++: trivially copyable types aren't able to now put the variant in a valueless state. If mpark's variant did the same thing, then mpark's visit could effectively perform the optimization you suggested, but only for those variants that are made of only trivially copyable types.

Here's the discussion: https://twitter.com/BarryRevzin/status/1361826113543110657

@bhamiltoncx
Copy link
Author

bhamiltoncx commented Feb 18, 2021 via email

@bhamiltoncx
Copy link
Author

bhamiltoncx commented Feb 18, 2021 via email

@bhamiltoncx
Copy link
Author

bhamiltoncx commented Feb 18, 2021 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants