Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Posix remote_address() may throw, but it is marked noexcept #2379

Open
absporl opened this issue Jul 30, 2024 · 7 comments
Open

Posix remote_address() may throw, but it is marked noexcept #2379

absporl opened this issue Jul 30, 2024 · 7 comments

Comments

@absporl
Copy link

absporl commented Jul 30, 2024

I ran into the following error:

terminate called after throwing an instance of 'std::system_error'
  what():  getpeername: Transport endpoint is not connected
Aborting on shard 1.

Partial backtrace:

std::terminate() at ??:?
__clang_call_terminate at :?
seastar::net::posix_connected_socket_impl::remote_address() const at ??:?
seastar::tls::tls_connected_socket_impl::remote_address() const at ??:?
seastar::connected_socket::remote_address() const at ??:?

I believe the cause is posix_connected_socket_impl::remote_address() is marked noexcept, but it calls posix_connected_socket_operations::remote_address(), which is not noexcept, and in fact the latter calls file_desc::get_remote_address() which explicitly throws.

See:

https://github.com/scylladb/seastar/blob/master/src/net/posix-stack.cc#L288
https://github.com/scylladb/seastar/blob/master/src/net/posix-stack.cc#L110
https://github.com/scylladb/seastar/blob/master/include/seastar/core/posix.hh#L298

@xemul
Copy link
Contributor

xemul commented Jul 31, 2024

But the class name suggests that the socket is connected, while kernel thinks otherwise. Can you shed more light on this -- how did you end up having unconnected socket wrapped with connected_socket_impl object?

@absporl
Copy link
Author

absporl commented Jul 31, 2024

This happened when I used an external tcpkill process to rapidly terminate connections, for testing purposes. What is supposed to happen to a seastar::connected_socket object when it is no longer connected?

@xemul
Copy link
Contributor

xemul commented Aug 5, 2024

This happened when I used an external tcpkill process to rapidly terminate connections, for testing purposes.

Ah, I see. In that case this can be a nice unit test too -- a local listening posix socket, a local established connection, then server-side closes and client-side tries to call (local_|remote_)address()

What is supposed to happen to a seastar::connected_socket object when it is no longer connected?

It won't transform into some other class, that's for sure. No I understand where the issue could be.

@xemul
Copy link
Contributor

xemul commented Aug 5, 2024

Can you show me your full test? The ENOTCON is returned for SYN_SENT and CLOSE states. The latter doesn't happen until client closes its side, and if it does it's no longer valid to call remote_address() on connected_socket(). So I assume we're in SYN_SENT state, but don't see how it happens until connected_socket is instantiated.

@niekbouman
Copy link
Contributor

niekbouman commented Aug 16, 2024

hi Pavel,
apologies for the delay (I took over the task of making a reproducer from my colleague Mark).

Here is an (ugly) reproducer (no attempts to cleanly shutdown services, just a fairly minimal reproducer to trigger the bug):

I first run, in another terminal, tcpkill, and leave it running:

sudo tcpkill -i lo port 8000

Then I run the following program: (I compiled my test in dev mode)

#include "stop_signal.hh"
#include <seastar/core/abort_source.hh>
#include <seastar/core/app-template.hh>
#include <seastar/core/future.hh>
#include <seastar/core/sleep.hh>
#include <seastar/core/thread.hh>
#include <seastar/core/when_all.hh>
#include <seastar/util/log.hh>

static seastar::logger mylog("main");

using namespace seastar;
using namespace std::chrono_literals;

abort_source as;

future<> client()
{
    ipv4_addr addr(8000);

    connected_socket s = co_await connect(addr);
    auto out = s.output();
    auto in = s.input();

    std::string hello{"hello"};
    co_await out.write(hello);
    co_await out.flush();

    auto payload = co_await in.read();
}

future<> server()
{
    connected_socket srv_sock;
    listen_options opts;
    opts.reuse_address = true;
    opts.set_fixed_cpu(0);
    server_socket socket = listen(make_ipv4_address(8000), opts);

    accept_result ar = co_await socket.accept();

    srv_sock = std::move(ar.connection);

    auto in = srv_sock.input();
    auto out = srv_sock.output();

    while (!as.abort_requested()) {

        if (as.abort_requested()) {
            break;
        }

        uint16_t acc{};
        for (int i = 0; i < 10000; ++i) {
            auto ra = srv_sock.remote_address();
            acc += ra.port();
        }

        mylog.info("port sum: {}", acc);
    }
}

int main(int argc, char** argv)
{
    app_template app;

    return app.run(argc, argv, [] {
        return seastar::async([] {
            seastar_apps_lib::stop_signal s;
            auto server_fut = server();
            auto client_fut = client();
            s.wait().get();
        });
    });
}

When the program is run, tcpkill has output like this:

tcpkill: listening on lo [port 8000]
127.0.0.1:54880 > 127.0.0.1:8000: R 1069955445:1069955445(0) win 0
127.0.0.1:54880 > 127.0.0.1:8000: R 1069955705:1069955705(0) win 0
127.0.0.1:54880 > 127.0.0.1:8000: R 1069956225:1069956225(0) win 0
127.0.0.1:54880 > 127.0.0.1:8000: R 1069955445:1069955445(0) win 0
...

The seastar program then crashes with:

...
INFO  2024-08-16 21:59:17,036 [shard 0:main] main - port sum: 20864
INFO  2024-08-16 21:59:17,039 [shard 0:main] main - port sum: 20864
INFO  2024-08-16 21:59:17,041 [shard 0:main] main - port sum: 20864
INFO  2024-08-16 21:59:17,044 [shard 0:main] main - port sum: 20864
terminate called after throwing an instance of 'std::system_error'
  what():  getpeername: Transport endpoint is not connected
Aborting on shard 0.
Backtrace:
  /home/niek/fedora35/cranmera/dev_build/apps/../../thirdparty/seastar/build/dev/libseastar.so+0x345cb4
  /home/niek/fedora35/cranmera/dev_build/apps/../../thirdparty/seastar/build/dev/libseastar.so+0x30e904
  /home/niek/fedora35/cranmera/dev_build/apps/../../thirdparty/seastar/build/dev/libseastar.so+0x339f36
  /lib64/libc.so.6+0x40cff
  /lib64/libc.so.6+0x994a3
  /lib64/libc.so.6+0x40c4d
  /lib64/libc.so.6+0x28901
  /lib64/libstdc++.so.6+0xa5da8
  /lib64/libstdc++.so.6+0xb7c4b
  /lib64/libstdc++.so.6+0xa5950
  /home/niek/fedora35/cranmera/dev_build/apps/../../thirdparty/seastar/build/dev/libseastar.so+0x26314a
  /home/niek/fedora35/cranmera/dev_build/apps/../../thirdparty/seastar/build/dev/libseastar.so+0x40d106
  /home/niek/fedora35/cranmera/dev_build/apps/../../thirdparty/seastar/build/dev/libseastar.so+0x415c2f
  0x21061a
  0x2113ea
  /home/niek/fedora35/cranmera/dev_build/apps/../../thirdparty/seastar/build/dev/libseastar.so+0x31d49f
  /home/niek/fedora35/cranmera/dev_build/apps/../../thirdparty/seastar/build/dev/libseastar.so+0x31e627
  /home/niek/fedora35/cranmera/dev_build/apps/../../thirdparty/seastar/build/dev/libseastar.so+0x31db08
  /home/niek/fedora35/cranmera/dev_build/apps/../../thirdparty/seastar/build/dev/libseastar.so+0x291c69
  /home/niek/fedora35/cranmera/dev_build/apps/../../thirdparty/seastar/build/dev/libseastar.so+0x291049
  /home/niek/fedora35/cranmera/dev_build/apps/../../thirdparty/seastar/build/dev/libseastar.so+0x291fb9
  0x20ec20
  /lib64/libc.so.6+0x2a087
  /lib64/libc.so.6+0x2a14a
  0x20d8e4
^CKilled

@absporl
Copy link
Author

absporl commented Aug 30, 2024

@xemul Is this helpful?

@niekbouman
Copy link
Contributor

Can you show me your full test? The ENOTCON is returned for SYN_SENT and CLOSE states. The latter doesn't happen until client closes its side, and if it does it's no longer valid to call remote_address() on connected_socket(). So I assume we're in SYN_SENT state, but don't see how it happens until connected_socket is instantiated.

Polite "ping" @xemul . See reproducer above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants