Redirection URI isn't always correctly escaped #466

k0ral · 2021-06-05T21:39:55Z

tl;dr

This is an edge case, probably not worth fixing, but worth raising awareness about.

When this happens

When the following conditions hold:

an HTTP request is made using http-client with Network.HTTP.Client.redirectCount > 0
the HTTP response is a 3** redirection
the HTTP response has a Location header filled with a URI with unescaped reserved characters inside a URI component

Example

HTTP/1.1 302 Found
Content-Type: text/html; charset=utf-8
Location: http://example.com/path/to/file[3].html

The Location URI has unescaped square brackets outside of their reserved semantic (delimiters for an IP literal within host.)

What should happen vs what actually happens

Redirection should be followed by http-client, but it is actually not: the 3** response is returned as-is.

Why

I believe the issue is in Network.HTTP.Client.Response.getRedirectedRequest function, specifically at the following line:

let l = escapeURIString isAllowedInURI (S8.unpack l')

This will never escape reserved characters, whereas in some cases such characters should be escaped.
Indeed, reserved characters are allowed in URIs, but only in very specific situations.

For example, square brackets should be escaped if, and only if, they are used as delimiters for an IP literal within host:

http://[1080::8:800:200C:417A]/foo is valid
http://example.com/path/to/file[3].html is invalid and should instead be escaped as http://example.com/path/to/file%5B3%5D.html

When this happens, l is an invalid URI, and the subsequent parseURIReference l (at this line) returns Nothing, effectively breaking the redirection.

What to do about it (my 2 cents)

This is clearly an edge case, I don't expect many users are / will be impacted by this issue (I am entertaining the idea that I'm the first observer, but I should stop kidding myself :) ).
Fixing the issue seems tricky to me: we need a "smart" escaping function that is aware of all the intricacies of URI reserved characters -- and even then, I expect some cases would end up in a guessing game with no correct answer.
Thus, unless you can think of a simple solution, the benefits/cost ratio seems very low to me, and I would consider it perfectly fine to keep the current code as-is -- but then it might be worth mentioning this limitation somewhere in the documentation.

The text was updated successfully, but these errors were encountered:

snoyberg · 2021-06-06T02:44:42Z

I looked at the relevant parts of the spec, e.g. https://datatracker.ietf.org/doc/html/rfc7231#section-7.1.2, and didn't find any comments about this. Do you have any spec references or demonstrations of what other libraries do here?

k0ral · 2021-06-06T10:03:37Z

I haven't checked what other libraries in other languages do -- I'll update this thread if I find the time to investigate further.

On second thought, I am starting to believe that a clean solution should involve replacing the "escape, then parse" part into "escape while parsing", since proper escaping depends on context that only a URI parser is aware of.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Redirection URI isn't always correctly escaped #466

Redirection URI isn't always correctly escaped #466

k0ral commented Jun 5, 2021

snoyberg commented Jun 6, 2021

k0ral commented Jun 6, 2021

Redirection URI isn't always correctly escaped #466

Redirection URI isn't always correctly escaped #466

Comments

k0ral commented Jun 5, 2021

tl;dr

When this happens

Example

What should happen vs what actually happens

Why

What to do about it (my 2 cents)

snoyberg commented Jun 6, 2021

k0ral commented Jun 6, 2021