Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added try/catch into unescapeuri() #17

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

GregorMatheis
Copy link

In the current version unescapeuri() gives an error if the %... character is an invalid base 16. However some url parameters can have %... as part of their value.
For those cases the function now gives a warning but returns the string without changes and doesn't throw an error.

In the current version unescapeuri() gives an error if the %... character is an invalid base 16. However some url parameters can have %... as part of their value. 
For those cases the function now gives a warning but returns the string without changes and doesn't throw an error.
@c42f
Copy link
Collaborator

c42f commented Mar 10, 2021

Hi, thanks for the PR!

Can you include one or several test cases to show how this is supposed to work?

It's very useful to tie all URI parsing and escaping back to one of the RFC standards, or to a library with similar functionality so that we know we're doing something sensible. Do you have a reference for the desired behavior? https://tools.ietf.org/html/rfc3986 is the primary RFC for URI syntax. It can be a little hard to interpret though, so referring to how other software libraries implement these things can also be useful.

I don't think these low level parsers should throw warnings - that should be up to higher level code in the user's application. From the point of view of URIs.jl I think we should try to be definitive about whether URIs are malformed or not.

  • If malformed, we should throw an error by default, and perhaps have a "permissive" flag to allow more permissive parsing of malformed URIs.
  • If not malformed, then this is a bug in our code which we should just fix!

@GregorMatheis
Copy link
Author

Hi @c42f

first, I'm super sorry for the late response, somehow I missed the notification that you answered.
My idea originated in a real life issue with data coming from Google Analytics that includes the full URL. For some reasons there are '%' in the URI that aren't valid base 16 character (and additionally last character '%'). Not sure if that is an issue that more people are facing.

Anyways I'm happy to not throw a warning and have something like a flag. I'll edit that in the code so you can have a look.

As mentioned in discussion
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants