Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Redirection URI isn't always correctly escaped #466

Open
k0ral opened this issue Jun 5, 2021 · 2 comments
Open

Redirection URI isn't always correctly escaped #466

k0ral opened this issue Jun 5, 2021 · 2 comments

Comments

@k0ral
Copy link

k0ral commented Jun 5, 2021

tl;dr

This is an edge case, probably not worth fixing, but worth raising awareness about.

When this happens

When the following conditions hold:

Example

HTTP/1.1 302 Found
Content-Type: text/html; charset=utf-8
Location: http://example.com/path/to/file[3].html

The Location URI has unescaped square brackets outside of their reserved semantic (delimiters for an IP literal within host.)

What should happen vs what actually happens

Redirection should be followed by http-client, but it is actually not: the 3** response is returned as-is.

Why

I believe the issue is in Network.HTTP.Client.Response.getRedirectedRequest function, specifically at the following line:

let l = escapeURIString isAllowedInURI (S8.unpack l')

This will never escape reserved characters, whereas in some cases such characters should be escaped.
Indeed, reserved characters are allowed in URIs, but only in very specific situations.

For example, square brackets should be escaped if, and only if, they are used as delimiters for an IP literal within host:

  • http://[1080::8:800:200C:417A]/foo is valid
  • http://example.com/path/to/file[3].html is invalid and should instead be escaped as http://example.com/path/to/file%5B3%5D.html

When this happens, l is an invalid URI, and the subsequent parseURIReference l (at this line) returns Nothing, effectively breaking the redirection.

What to do about it (my 2 cents)

This is clearly an edge case, I don't expect many users are / will be impacted by this issue (I am entertaining the idea that I'm the first observer, but I should stop kidding myself :) ).
Fixing the issue seems tricky to me: we need a "smart" escaping function that is aware of all the intricacies of URI reserved characters -- and even then, I expect some cases would end up in a guessing game with no correct answer.
Thus, unless you can think of a simple solution, the benefits/cost ratio seems very low to me, and I would consider it perfectly fine to keep the current code as-is -- but then it might be worth mentioning this limitation somewhere in the documentation.

@snoyberg
Copy link
Owner

snoyberg commented Jun 6, 2021

I looked at the relevant parts of the spec, e.g. https://datatracker.ietf.org/doc/html/rfc7231#section-7.1.2, and didn't find any comments about this. Do you have any spec references or demonstrations of what other libraries do here?

@k0ral
Copy link
Author

k0ral commented Jun 6, 2021

I haven't checked what other libraries in other languages do -- I'll update this thread if I find the time to investigate further.

On second thought, I am starting to believe that a clean solution should involve replacing the "escape, then parse" part into "escape while parsing", since proper escaping depends on context that only a URI parser is aware of.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants