Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changed headers handling #799

Merged
merged 1 commit into from
Jan 17, 2025
Merged

Changed headers handling #799

merged 1 commit into from
Jan 17, 2025

Conversation

TiARETiK
Copy link

TLDR: While using Crow v1.1 our team have encountered a problem, where using a big amount of heders in OPTION request response caused those to be parsed incorrectly by a number of user-end systems, including Curl. After spending quite some time on debugging, we still couldn't fully understand the reasons for such a behavior, yet we managed to produce a fix, which we humbly ask to accept to the main Crow branch.

So. We are using Crow for backend of a web app. We've migrated to version 1.1 and encountered an unexpected and quite problematic behavior, where GnomeWeb web browser mishandled login form. After some debugging, we established the following facts :

  1. GnomeWeb and Curl couldn't correctly parse responses
  2. Curl request:
curl 'http://localhost:8080/auth' -X 'OPTIONS' -H 'Accept: application/json, text/plain, */*' -H 'Content-Type: application/json' -H 'Origin: http://some.site' -H 'Referer: http://some.site/' -H 'Sec-Fetch-Dest: empty' -H 'Sec-Fetch-Mode: cors' -H 'Sec-Fetch-Site: cross-site' -H 'User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 Safari/605.1.15' --data-binary '{"login":"admin","password":"1234"}' -v

Curl output

*   Trying 127.0.0.1:8080...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> OPTIONS /auth HTTP/1.1
> Host: localhost:8080
> Accept: application/json, text/plain, */*
> Content-Type: application/json
> Origin: http://some.site
> Referer: http://some.site/
> Sec-Fetch-Dest: empty
> Sec-Fetch-Mode: cors
> Sec-Fetch-Site: cross-site
> User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 Safari/605.1.15
> Content-Length: 35
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 204 No Content
< Access-Control-Allow-Headers: *
< Access-Control-Allow-Methods: *
< Access-Control-Allow-Origin: *
* Connection #0 to host localhost left intact

Expected Curl output

*   Trying 127.0.0.1:8080...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> OPTIONS /auth HTTP/1.1
> Host: localhost:8080
> Accept: application/json, text/plain, */*
> Content-Type: application/json
> Origin: http://ukazka.softcom.su
> Referer: http://ukazka.softcom.su/
> Sec-Fetch-Dest: empty
> Sec-Fetch-Mode: cors
> Sec-Fetch-Site: cross-site
> User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 Safari/605.1.15
> Content-Length: 35
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 204 No Content
< Access-Control-Allow-Headers: *
< Access-Control-Allow-Methods: *
< Access-Control-Allow-Origin: *
< Access-Control-Expose-Headers: X-Total-Pages, X-Total-Entries
< Allow: OPTIONS, HEAD, POST
< Content-Length: 0
< Server: Crow/master
< Date: Fri, 19 Apr 2024 09:25:41 GMT
< 
* Connection #0 to host localhost left intact
  1. Some apps (specifically, Mozilla Firefox web browser) still handled given responses correctly.

After some work we figured out, that not all of the headers were parsed. Also, exchange of headers positions in response resulted in the first ones being parsed correctly, while later ones were ignored. Thus we concluded, that problem was not connected to some specific header being sent incorrectly, but to the structure of response in general. We couldn't quite figure out the underlying reason for such behavior, yet the proposed change seams to fix our problem, and we believe that it can be useful to others for as long as the proper solution is not implemented.

@bugdea1er
Copy link
Contributor

@gittiver could you look into this please? I have the same problem and this solves it. Although this doesn't seem like the right solution.

@luca-schlecker
Copy link
Collaborator

I agree with @bugdea1er that this doesn't seem like the correct solution.

I am quite short on time but I'd be willing to invest a little bit of it.

Could you please provide a simple example program producing this error so I can try and investigate?

@paulharris
Copy link
Contributor

paulharris commented Oct 16, 2024

I looked into this, I believe your PR highlights the weirdness that needs to be fixed.
Your patch "fixes" the problem,
so I added some code to break it again, to make the problem VERY reliable - good for testing.

Build the headers_ string as per your patch,
but don't add that complete header blob into buffers_
Instead, add that header blob one byte at a time.
Each of these asio::const_buffers is just a pointer+size to some data.
The asio::write functions should simply write all the buffers to the socket.
But, when you break up the memory blob into many buffers, there is a window of time for things to break.

   // buffers_.emplace_back(headers_.data(), headers_.size());
   for (int i = 0; i < headers_.size(); ++i)
            buffers_.emplace_back(headers_.data()+i, 1);

The code then continues on to do_write(), which calls asio::async_write().
The async write function never finishes writing all its buffers before it appears to be interrupted by further reads which set an error state.

The problem is when there is no route rule.
It ends up at

parser_.done();

This is reentering into the http parser(!) which I guess is probably not intended.

Backtrace (note lines are a little different due to my hacks and because I'm using 1.2.0)

  * frame #0: 0x00005555556d69ad server_wfsync`crow::http_parser_execute(parser=0x00007fffe40035b0, settings=0x0000555555b0d000, data=0x0000000000000000, len=0) at http_parser_merged.h:730:9
    frame #1: 0x0000555555724ae9 server_wfsync`crow::HTTPParser<crow::Connection<crow::SocketAdaptor, crow::Crow<crow::CORSHandler>, crow::CORSHandler>>::feed(this=0x00007fffe40035b0, buffer=0x0000000000000000, length=0) at parser.h:126:27
    frame #2: 0x0000555555724b4b server_wfsync`crow::HTTPParser<crow::Connection<crow::SocketAdaptor, crow::Crow<crow::CORSHandler>, crow::CORSHandler>>::done(this=0x00007fffe40035b0) at parser.h:136:20
    frame #3: 0x0000555555725136 server_wfsync`crow::Connection<crow::SocketAdaptor, crow::Crow<crow::CORSHandler>, crow::CORSHandler>::handle_url(this=0x00007fffe4002540) at http_connection.h:118:25
    frame #4: 0x00005555557250ac server_wfsync`crow::HTTPParser<crow::Connection<crow::SocketAdaptor, crow::Crow<crow::CORSHandler>, crow::CORSHandler>>::process_url(this=0x00007fffe40035b0) at parser.h:152:23
    frame #5: 0x0000555555724d49 server_wfsync`crow::HTTPParser<crow::Connection<crow::SocketAdaptor, crow::Crow<crow::CORSHandler>, crow::CORSHandler>>::on_url(self_=0x00007fffe40035b0, at="/auth2 HTTP/1.1\r\nHost: localhost:8080\r\nAccept: application/json, text/plain, */*\r\nContent-Type: application/json\r\nOrigin: http://some.site\r\nReferer: http://some.site/\r\nSec-Fetch-Dest: empty\r\nSec-Fetch-Mode: cors\r\nSec-Fetch-Site: cross-site\r\nUser-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 Safari/605.1.15\r\nContent-Length: 35\r\n\r\n{\"login\":\"admin\",\"password\":\"1234\"}", length=6) at parser.h:37:19
    frame #6: 0x00005555556d78fb server_wfsync`crow::http_parser_execute(parser=0x00007fffe40035b0, settings=0x0000555555b0d000, data="GET /auth2 HTTP/1.1\r\nHost: localhost:8080\r\nAccept: application/json, text/plain, */*\r\nContent-Type: application/json\r\nOrigin: http://some.site\r\nReferer: http://some.site/\r\nSec-Fetch-Dest: empty\r\nSec-Fetch-Mode: cors\r\nSec-Fetch-Site: cross-site\r\nUser-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 Safari/605.1.15\r\nContent-Length: 35\r\n\r\n{\"login\":\"admin\",\"password\":\"1234\"}", len=425) at http_parser_merged.h:923:13
    frame #7: 0x0000555555724ae9 server_wfsync`crow::HTTPParser<crow::Connection<crow::SocketAdaptor, crow::Crow<crow::CORSHandler>, crow::CORSHandler>>::feed(this=0x00007fffe40035b0, buffer="GET /auth2 HTTP/1.1\r\nHost: localhost:8080\r\nAccept: application/json, text/plain, */*\r\nContent-Type: application/json\r\nOrigin: http://some.site\r\nReferer: http://some.site/\r\nSec-Fetch-Dest: empty\r\nSec-Fetch-Mode: cors\r\nSec-Fetch-Site: cross-site\r\nUser-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 Safari/605.1.15\r\nContent-Length: 35\r\n\r\n{\"login\":\"admin\",\"password\":\"1234\"}", length=425) at parser.h:126:27
    frame #8: 0x000055555572483a server_wfsync`crow::Connection<crow::SocketAdaptor, crow::Crow<crow::CORSHandler>, crow::CORSHandler>::do_read(this=0x00007ffff33fefa0, ec=0x00007ffff33fefb0, bytes_transferred=425)::'lambda'(std::error_code const&, unsigned long)::operator()(std::error_code const&, unsigned long) const at http_connection.h:552:48
    frame #9: 0x0000555555724780 server_wfsync`asio::detail::binder2<crow::Connection<crow::SocketAdaptor, crow::Crow<crow::CORSHandler>, crow::CORSHandler>::do_read()::'lambda'(std::error_code const&, unsigned long), std::error_code, unsigned long>::operator()(this=0x00007ffff33fefa0) at bind_handler.hpp:180:5

Summary, in order of call depth ... its a little hard for me to lay out nicely:

  • do_read()
  • calls feed() THE FIRST CALL
  • calls http_parser_execute() ... parser->state = s_req_http_start
  • calls on_url()
  • calls process_url()
  • calls handle_url() ... no rule_index, so "return response without parsing further"
    ** This first calls parser_.done(), which calls feed() REENTRY !!
    *** feed() then calls http_parser_execute( empty buffer, zero length ), which is a surprise to the parser and it sets ERRNO = CHPE_INVALID_EOF_STATE, at:
    CROW_SET_ERRNO(CHPE_INVALID_EOF_STATE);

    ** Back to the body of handle_url(), which calls complete_request()
    *** This eventually builds our mysterious headers and calls asio::async_write(), which schedules the send on another thread... but it will never finish the transmission in time ...
  • Now we finally unwind back to the FIRST feed() function body, where the FIRST http_parser_execute() has returned a non-ok http_errno, so feed() returns false
  • back into the do_read() function body, ret==false, so error_while_reading==true. We come to
    if (error_while_reading)
    and it cancels all of the transmission before the response can be sent.

So the problem appears to stem from the path taken when we want to send a response BEFORE finishing the read. It isn't an error during reading, we really just want to stop reading any further, but wait for the async_write to complete. I guess? I'm not sure on the best fix.

In the meantime, I "fixed" the problem for myself by DISABLING the async write inside do_write_general(), ie the line:

if (res.body.length() < res_stream_threshold_)

becomes: if (false && res.body.length() < res_stream_threshold_)

Then the write is able to complete before it tried to process the read any further.

You still get errors in the DEBUG printouts, such as:
(2024-10-16 16:10:08) [DEBUG ] 0x7fffe4000fa0 from read(1) with description: "stream ended at an unexpected time"

But at least the client receives the full transmission.

My client code was simple:

#include <crow.h>
#include <crow/middlewares/cors.h>
int main() {
   crow::App< crow::CORSHandler > app;
   CROW_ROUTE(app, "/auth") ([]() { return crow::response("Auth"); });
   app.loglevel(crow::LogLevel::DEBUG).port(8080).run();
   return 0;
}

Note that the CORSHandler is NOT the problem, it is simply a way to add some headers that you can see will not be sent back to the client.

My hack to http_connection.h, as per the commit in this PR,
PLUS comment out the last line in PR and do this:

// buffers_.emplace_back(headers_.data(), headers_.size());
for (int i = 0; i < headers_.size(); ++i)
   buffers_.emplace_back(headers_.data()+i, 1);

Command line (failure - not enough headers returned):

$ curl 'http://127.0.0.1:8080/auth' -X 'OPTIONS' -v
*   Trying 127.0.0.1:8080...
* Connected to 127.0.0.1 (127.0.0.1) port 8080
> OPTIONS /auth HTTP/1.1
> Host: 127.0.0.1:8080
> User-Agent: curl/8.5.0
> Accept: */*
> 
< HTTP/1.1 204 No Content
* Connection #0 to host 127.0.0.1 left intact

Then, change do_write_general() ... the first line becomes:
if (false && res.body.length() < res_stream_threshold_)

Command line (success):

$ curl 'http://127.0.0.1:8080/auth' -X 'OPTIONS' -v
*   Trying 127.0.0.1:8080...
* Connected to 127.0.0.1 (127.0.0.1) port 8080
> OPTIONS /auth HTTP/1.1
> Host: 127.0.0.1:8080
> User-Agent: curl/8.5.0
> Accept: */*
> 
< HTTP/1.1 204 No Content
< Access-Control-Allow-Headers: *
< Access-Control-Allow-Methods: *
< Access-Control-Allow-Origin: *
< Allow: OPTIONS, HEAD, GET
< Content-Length: 0
< Server: Crow/master
< Date: Wed, 16 Oct 2024 16:19:55 GMT
< 
* Connection #0 to host 127.0.0.1 left intact

@The-EDev
Copy link
Member

The-EDev commented Oct 25, 2024

based on @paulharris's research, it seems that the problem primarily stems from using asio::async_write, whereas the alternative route the code takes when adding false to the if statement ultimately leads to asio::write.

IIRC I created do_write_sync() as a way to work around the do_write() method without modifying it. if not using async at all solves this problem, then maybe the best solution would be to remove do_write() and only use do_write_sync() for everything.

@The-EDev The-EDev added the discussion The viability / implementation of the issue is up for debate label Oct 25, 2024
@paulharris
Copy link
Contributor

paulharris commented Oct 26, 2024 via email

@The-EDev
Copy link
Member

thinking about it, the most obvious outcome is that until a response is finished writing, crow won't be able to handle a new request... except this is already the case (if I remember the tests I ran 3 years ago correctly).
So Crow doesn't respond asynchronously, and threads already get tied up... so I don't think there will be much of a difference.

on a side note, we seriously need to get #258 done.. we need some way to have the server shelf a handler and execute another connection, then go back to that handler once it's ready. Unfortunately I don't believe my experience in C++ or async programming in enough for that.. Maybe there's a library that already does this kind of think and we can plug it into crow's connection handling... I would really appreciate any input here..

@bugdea1er
Copy link
Contributor

@gittiver @The-EDev so can we merge this? This really fixes the problem, we can ask @TiARETiK to write a test or two and then fix it properly in master

@gittiver
Copy link
Member

gittiver commented Jan 9, 2025

@gittiver @The-EDev so can we merge this? This really fixes the problem, we can ask @TiARETiK to write a test or two and then fix it properly in master

It seems to be a problem with writing the buffers asynchronously, so it would only "hide" the problem.

I agree that we should write some tests where Crow shows this behavior.

@TiARETiK TiARETiK marked this pull request as draft January 9, 2025 12:15
@TiARETiK
Copy link
Author

TiARETiK commented Jan 9, 2025

@The-EDev , the header-writhing has been changed to synchronous, and a test has been created that fails on the current master and passes on this branch. I would greatly appreciate if this pull request would be merged.

@TiARETiK TiARETiK marked this pull request as ready for review January 9, 2025 15:46
@TiARETiK
Copy link
Author

TiARETiK commented Jan 9, 2025

@gittiver , could you look into this request, please?

@gittiver
Copy link
Member

A synchronous write has an impact on performance, so I will not simply takeover that.
Thanks for the test, that will helps to identify solutions.

@TiARETiK
Copy link
Author

@gittiver , I mean... There is in fact a provable bug that causes incorrect behavior. And it's kinda better for something to work, albeit slow, than not to work, even if it does so extremely fast. Might you reconsider? Or could you at least suggest some other direction for an attempted solution?

@gittiver
Copy link
Member

There is a workaround using synchronous write, which you can use.

I am looking into the issue in parallel if there are other solutions.
The behavior and the calls to http-parser are at least weird, I want to have a deeper look before I do something which eventually hides wrong behavior.

@gittiver
Copy link
Member

The http connection code is a mess. :(
the http_parser is unsupported yet code for reading and reading and writing is distributed between the classes and so on.

I will for now merge this pull request to fix the bug.

But this needs further work.

@gittiver gittiver merged commit feb71ef into CrowCpp:master Jan 17, 2025
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discussion The viability / implementation of the issue is up for debate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants