Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

In seastar::http::reply::write_body, using output_stream::write(temporary_buffer) causes crash #1701

Closed
meilofveeningen-rl opened this issue Jun 16, 2023 · 15 comments · Fixed by #2436

Comments

@meilofveeningen-rl
Copy link

When using seastar::http::http_server and when replying to a request in streaming mode using reply::write_body, the following causes a crash:

void fn(seastar::http::reply& rep) {
    rep.write_body("bin", [](seastar::output_stream<char>&& stream_) -> future<> {
        auto stream = std::move(stream_);
        co_await stream.write(seastar::temporary_buffer<char>{3}));
        co_await stream.close();
    }
}

The equivalent code using write(buff.begin, buff.size()) works.

This seems to be because write(temporary_buffer) is implemented in terms of net::packet. In this example, close triggers a flush, which calls seastar::http::internal::http_chunked_data_sink_impl::put(seastar::net::packet), which is not implemented.

There may be a good reason for this, but at least it would be nice if this were documented.

@meilofveeningen-rl
Copy link
Author

@niekbouman
Copy link
Contributor

I can reproduce the problem (see entirely below), and it seems that I can fix the problem by giving an implementation for put(net::packet) (in src/http/common.cc) that extracts the temporary_buffer from the packet again, and then calls the other put overload. But doing it in this way seems a bit silly...

@xemul @nyh Could you perhaps help with this?
The http_chunked_data_sink_impl has been introduced in Pavel's PR 1360, which Nadav reviewed:
#1360 (comment)
Although that PR focused on http-requests, it gives this unintended effect on the reply side.

class http_chunked_data_sink_impl : public data_sink_impl {
    output_stream<char>& _out;

   //...
public:
    http_chunked_data_sink_impl(output_stream<char>& out) : _out(out) {
    }

    virtual future<> put(net::packet data) override { 
        auto v = data.release();
        assert(ssize(v) == 1);
        return this->put(std::move(v[0])); // call the put function defined below
    }

    using data_sink_impl::put;
    virtual future<> put(temporary_buffer<char> buf) override {
        if (buf.size() == 0) {
            // size 0 buffer should be ignored, some server
            // may consider it an end of message
            return make_ready_future<>();
        }
        auto size = buf.size();
        return write_size(size).then([this, buf = std::move(buf)] () mutable {
            return _out.write(buf.get(), buf.size());
        }).then([this] () mutable {
            return _out.write("\r\n", 2);
        });
    }

   //...

This is my reproducer:

#include "stop_signal.hh"
#include <cstdint>
#include <seastar/core/app-template.hh>
#include <seastar/core/coroutine.hh>
#include <seastar/core/future.hh>
#include <seastar/core/seastar.hh>
#include <seastar/core/sleep.hh>
#include <seastar/core/thread.hh>
#include <seastar/http/file_handler.hh>
#include <seastar/http/function_handlers.hh>
#include <seastar/http/handlers.hh>
#include <seastar/http/httpd.hh>
#include <seastar/http/reply.hh>
#include <seastar/json/json_elements.hh>
#include <seastar/net/socket_defs.hh>
#include <seastar/util/log.hh>

seastar::logger logger("main");

class handl : public seastar::httpd::handler_base {
  public:
    virtual seastar::future<std::unique_ptr<seastar::http::reply>> handle(const seastar::sstring& path,
                                                                          std::unique_ptr<seastar::http::request> req,
                                                                          std::unique_ptr<seastar::http::reply> rep)
    {
        rep->_content = "hello";
        rep->done("html");

        rep->write_body("bin", [](seastar::output_stream<char>&& stream_) -> seastar::future<> {
            auto stream = std::move(stream_);
            seastar::temporary_buffer<char> p{3};
            co_await stream.write(std::move(p));
            co_await stream.close();
        });
        return seastar::make_ready_future<std::unique_ptr<seastar::http::reply>>(std::move(rep));
    }
};
int main(int argc, char** argv)
{
    using namespace seastar;
    seastar::app_template app;

    app.run(argc, argv, [] {
        return seastar::async([] {
            seastar_apps_lib::stop_signal s;
            seastar::httpd::http_server server("test");
            server.set_content_streaming(true);

            server._routes.add(seastar::httpd::operation_type::GET, httpd::url("/test"), new handl{});

            server.listen(static_cast<uint16_t>(8000)).get();
            s.wait().get();
            server.stop().get();
        });
    });
}

@xemul
Copy link
Contributor

xemul commented Aug 5, 2024

The same is true for http_content_length_data_sink_impl

@xemul
Copy link
Contributor

xemul commented Aug 5, 2024

And http_content_length_data_sink_impl would just drop the packet :( however, I don't really know under what circumstances this one is used

@xemul
Copy link
Contributor

xemul commented Aug 5, 2024

I think a very good example of how to handle packet in data sink sits in the pip_data_sink_impl:

    future<> put(net::packet data) override {
        return do_with(data.release(), [this] (std::vector<temporary_buffer<char>>& bufs) {
            return do_for_each(bufs, [this] (temporary_buffer<char>& buf) {
                return put(buf.share());
            });
        });
    }

and currently I think that the way to go is to provide default implementation for data_sink_impl::put(net::packet) thing like above, so that inheriting classes can forget about net::packet.

On the other hand I remember some ancient discussion with @avikivity about net::packet vs temporary_buffer usage in iostreams but I forgot what the outcome was :(

@niekbouman
Copy link
Contributor

ah I see, so that goes along the same lines as my temp fix, but then properly passing on all temp_bufs in the packet (instead of only the first)

@niekbouman
Copy link
Contributor

On the other hand I remember some ancient discussion with @avikivity about net::packet vs temporary_buffer usage in iostreams but I forgot what the outcome was :(

Do you still remember that @avikivity , or shall I make a PR that applies Pavel's solution to data_sink_impl::put(net::packet) ?

@xemul
Copy link
Contributor

xemul commented Aug 15, 2024

On the other hand I remember some ancient discussion with @avikivity about net::packet vs temporary_buffer usage in iostreams but I forgot what the outcome was :(

@avikivity , can you please remind us what the idea/plan was?

@avikivity
Copy link
Member

It was a mistake to let net::packet become part of iostream. It's a low-level concept of a scatter/gather packet with headspace for headers, which works well for tcp and network devices, but doesn't fit streams.

However, removing net::packet from iostream is a complicated exercise (involving a year of deprecation) and shouldn't stand in the way of a simple fix.

@avikivity
Copy link
Member

    future<> put(net::packet data) override {
        return do_with(data.release(), [this] (std::vector<temporary_buffer<char>>& bufs) {
            return do_for_each(bufs, [this] (temporary_buffer<char>& buf) {
                return put(buf.share());
            });
        });
    }

Coroutines exist.

@avikivity avikivity reopened this Sep 1, 2024
@avikivity
Copy link
Member

(closed by mistake)

@niekbouman
Copy link
Contributor

I think that the way to go is to provide default implementation for data_sink_impl::put(net::packet) thing like above, so that inheriting classes can forget about net::packet.

Hmm, I now see that data_sink_impl also implements the temporary_buffer overload of put using the packet-overload of put:

    virtual future<> put(temporary_buffer<char> buf) {
        return put(net::packet(net::fragment{buf.get_write(), buf.size()}, buf.release()));
    }

so if we implement the put overload for packet as you suggest, then we will get infinite recursion between the packet and temporary_buffer overloads of put, no?

@xemul
Copy link
Contributor

xemul commented Sep 6, 2024

However, removing net::packet from iostream is a complicated exercise (involving a year of deprecation)

Why not just bumping the API level? It should make this exercise much sorter

and shouldn't stand in the way of a simple fix.

Absolutely yes

@xemul
Copy link
Contributor

xemul commented Sep 6, 2024

so if we implement the put overload for packet as you suggest, then we will get infinite recursion between the packet and temporary_buffer overloads of put, no?

If inheriting class doesn't implement any of those -- yes :( Then maybe

class data_sink_impl {
    ...
protected:
    future<> fallback_put(net::packet data) {
        return do_with(data.release(), [this] (std::vector<temporary_buffer<char>>& bufs) {
            return do_for_each(bufs, [this] (temporary_buffer<char>& buf) {
                return put(buf.share());
            });
        });
    }
};
...
class http_chunked_data_sink_impl {
    virtual future<> put(net::packet data)  override { return data_sink_impl::fallback_put(std::move(data)); }
};

?

niekbouman pushed a commit to niekbouman/seastar that referenced this issue Sep 15, 2024
Previously, calling the 'write(temporary_buffer<char>)' overload on an
output_stream<char> obtained from http::reply::write_body
would cause an abort.

This commit fixes this by providing a helper function (in the data_sink_impl
base class) named 'fallback_put', which classes that inherit from this
base class can use to implement the 'put(net::packet)' overload.
Also, this commit lets http_chunked_data_sink_impl make use of this helper.

fallback_put(net::packet) works simply by extracting the temporary_buffers
contained in the packet, and calling the 'put(temporary_buffer<char>)'
overload for each temporary_buffer.

Note that we create this 'fallback_put' function instead of defining the
virtual member 'put(net::packet)' function in the data_sink_impl class,
to avoid 'put(net::packet)' and 'put(temporary_buffer<char>)' calling
each other leading to infinite recursion.

This commit also adds a unit test to tests/unit/httpd_test.cc

Thanks to Pavel (xemul) for his help.

Fixes scylladb#1701

fix header

add test
niekbouman pushed a commit to niekbouman/seastar that referenced this issue Sep 15, 2024
Previously, calling the 'write(temporary_buffer<char>)' overload on an
output_stream<char> obtained from http::reply::write_body
would cause an abort.

This commit fixes this by providing a helper function (in the data_sink_impl
base class) named 'fallback_put', which classes that inherit from this
base class can use to implement the 'put(net::packet)' overload.
Also, this commit lets http_chunked_data_sink_impl make use of this helper.

fallback_put(net::packet) works simply by extracting the temporary_buffers
contained in the packet, and calling the 'put(temporary_buffer<char>)'
overload for each temporary_buffer.

Note that we create this 'fallback_put' function instead of defining the
virtual member 'put(net::packet)' function in the data_sink_impl class,
to avoid 'put(net::packet)' and 'put(temporary_buffer<char>)' calling
each other leading to infinite recursion.

This commit also adds a unit test to tests/unit/httpd_test.cc

Thanks to Pavel (xemul) for his help.

Fixes scylladb#1701
@niekbouman
Copy link
Contributor

    future<> put(net::packet data) override {
        return do_with(data.release(), [this] (std::vector<temporary_buffer<char>>& bufs) {
            return do_for_each(bufs, [this] (temporary_buffer<char>& buf) {
                return put(buf.share());
            });
        });
    }

Coroutines exist.

See #2436 (in coroutine style, as Avi Kivity seems to prefer that style)

niekbouman pushed a commit to niekbouman/seastar that referenced this issue Sep 16, 2024
Previously, calling the 'write(temporary_buffer<char>)' overload on an
output_stream<char> obtained from http::reply::write_body
would cause an abort.

This commit fixes this by providing a helper function (in the data_sink_impl
base class) named 'fallback_put', which classes that inherit from this
base class can use to implement the 'put(net::packet)' overload.
Also, this commit lets http_chunked_data_sink_impl and
http_content_length_data_sink_impl make use of this helper.

fallback_put(net::packet) works simply by extracting the temporary_buffers
contained in the packet, and calling the 'put(temporary_buffer<char>)'
overload for each temporary_buffer.

Note that we create this 'fallback_put' function instead of defining the
virtual member 'put(net::packet)' function in the data_sink_impl class,
to avoid 'put(net::packet)' and 'put(temporary_buffer<char>)' calling
each other leading to infinite recursion.

This commit also adds a unit test to tests/unit/httpd_test.cc

Thanks to Pavel (xemul) for his help.

Fixes scylladb#1701
@xemul xemul closed this as completed in 52eb058 Oct 7, 2024
xemul added a commit that referenced this issue Oct 7, 2024
…load' from null

Previously, calling the `write(temporary_buffer<char>)` overload on an `output_stream<char>` obtained from `http::reply::write_body` would cause an abort.

This commit fixes this by providing a helper function (in the `data_sink_impl` base class) named `fallback_put`, which classes that inherit from this base class can use to implement the `put(net::packet)` overload. Also, this commit lets `http_chunked_data_sink_impl` and `http_content_length_data_sink_impl` make use of this helper.

`fallback_put(net::packet)` works simply by extracting the temporary_buffers contained in the packet, and calling the `put(temporary_buffer<char>)` overload for each `temporary_buffer`.

Note that we create this `fallback_put` function instead of defining the virtual member `put(net::packet)` function in the `data_sink_impl` class, to avoid `put(net::packet)` and `put(temporary_buffer<char>)` calling each other leading to infinite recursion.

This commit also adds unit tests to `tests/unit/httpd_test.cc` for testing the `write` overload with `http_chunked_data_sink_impl` and `http_content_length_data_sink_impl`.

Thanks to Pavel (xemul) for his help.

Fixes #1701

Closes #2436

* https://github.com/scylladb/seastar:
  Added unit test for "http_content_length_data_sink_impl"
  iostream/http: Fix output_stream::write(temporary_buffer) overload
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants