Consider dropping exception_hacks.cc #2479

gleb-cloudius · 2024-10-08T12:04:25Z

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71744 is closed as fixed, so theoretically we do not need src/core/exception_hacks.cc for scalability any more. Need to wait until the fix propagate to all supported distributions.

avikivity · 2024-10-21T16:02:17Z

I tested a 2s96c192t machine. Results are in throws/second/thread:

Fedora 40 / gcc 14: 691798

Fedora 39 / gcc 14: 685662

Fedora 37 / gcc 12: 674885

Fedora 35 / gcc 11: 953

gcc 11 was released in 2021 and while we typically support two compilers back, the fix is in libgcc_s which might be older yet. But I think we can flip the default to off, in order to make use of the exception cache.

avikivity · 2024-10-21T16:03:09Z

#include <thread>
#include <algorithm>
#include <vector>
#include <exception>
#include <chrono>
#include <latch>
#include <iostream>

using namespace std::chrono_literals;

int main(int ac, char** av) {
    auto nr_threads = std::max(1u, std::thread::hardware_concurrency());
    std::vector<std::jthread> threads;
    std::latch before(nr_threads+1);
    std::atomic_flag stop = false;
    std::atomic<uint64_t> result = 0;
    auto duration = 10s;
    for (int i = 0; i < nr_threads; ++i) {
        threads.push_back(std::jthread([&] {
            before.arrive_and_wait();
            uint64_t local_result = 0;
            while (!stop.test(std::memory_order_relaxed)) {
                try {
                    throw std::runtime_error("ffF");
                } catch (...) {
                }
                ++local_result;
            }
            result.fetch_add(local_result);
        }));
    }
    before.arrive_and_wait();
    std::this_thread::sleep_for(duration);
    stop.test_and_set();
    threads.clear();
    std::cout <<  result / (nr_threads * (duration / std::chrono::duration<double>(1s)))
              << " (" << nr_threads << ")" << std::endl;
    return 0;
}

In [1], Seastar started to bypass a lock in libgcc's exception throwing mechanism to allow scalability on large machines. The problem is documented in [2] and reported as fixed. In [3], testing results on a 2s96c192t machine are reported. The problem appears indeed fixed with gcc 14's runtime (which we use, even though we build with clang). Given the new results, we can safely drop the exception scalability hack. As [1] states that the hack causes the loss of a translation cache, we may gain some performance this way. With that, we disable the cache by defining some random macros. [1] https://github.com/scylladb/seastar/464f5e3ae43b366b05573018fc46321863bf2fae [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71744 [3] scylladb/seastar#2479 (comment)

tchaikov · 2024-10-22T03:46:28Z

the change in libgcc was gcc-mirror/gcc@6e80a1d#diff-1f5777bd6a36561114c717d34945bbf56fdfb0cc850b7da85782716c81354326R131 .

despite that we can use

$ clang -print-libgcc-file-name
/usr/lib/gcc/x86_64-redhat-linux/14/libgcc.a

and

$ clang --rtlib=compiler-rt -print-libgcc-file-name
/home/kefu/.local/lib/clang/20/lib/linux/libclang_rt.builtins-x86_64.a

to print out the path to libgcc or the runtime used by the compiler

as the static library is but an archive, which fails to encode the interesting symbols about versioning. i am wondering if we can check libgcc_s.so instead if libgcc.a is reported by ${CMAKE_C_COMPILER} using, for instance readelf to check for its symbol versioning, and if it contains versioning higher than 13.0.0, we should consider it safe, and flip the default?

In [1], Seastar started to bypass a lock in libgcc's exception throwing mechanism to allow scalability on large machines. The problem is documented in [2] and reported as fixed. In [3], testing results on a 2s96c192t machine are reported. The problem appears indeed fixed with gcc 14's runtime (which we use, even though we build with clang). Given the new results, we can safely drop the exception scalability hack. As [1] states that the hack causes the loss of a translation cache, we may gain some performance this way. With that, we disable the cache by defining some random macro. [1] https://github.com/scylladb/seastar/464f5e3ae43b366b05573018fc46321863bf2fae [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71744 [3] scylladb/seastar#2479 (comment)

avikivity · 2024-10-22T10:38:50Z

I don't think it's so important. We can keep it like this for a year or so, then remove it completely.

In [1], Seastar started to bypass a lock in libgcc's exception throwing mechanism to allow scalability on large machines. The problem is documented in [2] and reported as fixed. In [3], testing results on a 2s96c192t machine are reported. The problem appears indeed fixed with gcc 14's runtime (which we use, even though we build with clang). Given the new results, we can safely drop the exception scalability hack. As [1] states that the hack causes the loss of a translation cache, we may gain some performance this way. With that, we disable the cache by defining some random macro. [1] https://github.com/scylladb/seastar/464f5e3ae43b366b05573018fc46321863bf2fae [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71744 [3] scylladb/seastar#2479 (comment) Closes #21217

gleb-cloudius assigned gleb-cloudius, nyh and avikivity Oct 8, 2024

avikivity mentioned this issue Oct 21, 2024

build: disable Seastar exception hack scylladb/scylladb#21217

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Consider dropping exception_hacks.cc #2479

Consider dropping exception_hacks.cc #2479

gleb-cloudius commented Oct 8, 2024

avikivity commented Oct 21, 2024

avikivity commented Oct 21, 2024

tchaikov commented Oct 22, 2024 •

edited

Loading

avikivity commented Oct 22, 2024

Consider dropping exception_hacks.cc #2479

Consider dropping exception_hacks.cc #2479

Comments

gleb-cloudius commented Oct 8, 2024

avikivity commented Oct 21, 2024

avikivity commented Oct 21, 2024

tchaikov commented Oct 22, 2024 • edited Loading

avikivity commented Oct 22, 2024

tchaikov commented Oct 22, 2024 •

edited

Loading