Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid collapsing slashes that follow http or https #9

Open
robolson opened this issue Jun 10, 2013 · 3 comments
Open

Avoid collapsing slashes that follow http or https #9

robolson opened this issue Jun 10, 2013 · 3 comments

Comments

@robolson
Copy link

urlnorm is breaking a legitimate use of two successive slashes in our URLs.

Here is an example of what is happening:

http://example.com/login?redirect-back-to=http://example.com/homepage

urlnorm'd is http://example.com/login?redirect-back-to=http:/example.com/homepage

@jehiah would you be open to adding a special case to not collapse two successive slashes that follow http: or https:? I am willing to put in a pull request for this but I wanted to check if it would be accepted first.

Ideally urlnorm would support all of the official URI schemas, not just http and https but since the list is large I think just checking http and https would satisfy 90% of people.

@jehiah
Copy link
Owner

jehiah commented Jun 11, 2013

Hi @robolson Thanks for writing up these details.

I would expect double slashes to be collapsed if they are in the path section of a URL, but not in the query string. I did a quick test (below) and it seems ok. Can you verify if you were seeing this happen in the path or query string or elsewhere?

>>> urlnorm.norm('http://example.com/login?redirect-back-to=http://example.com/homepage')
u'http://example.com/login?redirect-back-to=http://example.com/homepage'
>>> urlnorm.__version__
'1.1.2'

@robolson
Copy link
Author

Ah! I'm sorry @jehiah. I brainstormed a simple example without testing it and and your right, urlnorm, does the right thing when the double slashes are in the query string.

In my real life case, the slashes are part of the path section of the URL. The URLs are intended for consumption by the Thumbor project which has architected the URLs to be created in this fashion.

Here is an example URL for Thumbor that is being sent through urlnorm:

>>> urlnorm.norm('http://thumbor-server.com/K97LekICOXT9MbO3X1u8BBkrjbu5/300x200/http://example.com/images/white-sunglasses.jpg')
u'http://thumbor-server.com/K97LekICOXT9MbO3X1u8BBkrjbu5/300x200/http:/example.com/images/white-sunglasses.jpg'

@jehiah
Copy link
Owner

jehiah commented Jun 11, 2013

gotcha. We are both on the same page now.

I'm totally not opposed to special casing this one exclusion to collapsing a double '//' in the path. Patch away.

My only implementation request is to include a test that has a comment w/ a decent one line explanation of why that's a common situation which is special cased. Then ping me when you have something to review/merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants