Unicode data in response #29

Open
naro opened this issue Jun 11, 2014 · 3 comments

naro commented Jun 11, 2014

I'm trying to write a test against an HTML file containing some non-ASCII characters. A very simple example is this file, which has a non-ASCII dash between the words "Hello" and "world":

<!DOCTYPE html>
<html>
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
  Hello — world
</body>
</html>

Running this test case fails with the error message UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 140: ordinal not in range(128):

*** Settings ***

Library    HttpLibrary.HTTP

*** Test Cases ***

Test unicode
    Create Http Context  supl.cz  http
    GET  /page2.html
    Response Body Should Contain  Hello

Testing a page without a dash works fine (note page1.html instead of page2.html):

*** Settings ***

Library    HttpLibrary.HTTP

*** Test Cases ***

Test unicode
    Create Http Context  supl.cz  http
    GET  /page1.html
    Response Body Should Contain  Hello

Using a dash character in the test script itself seems to be fine (the following test is expected to fail, because page1.html contains "Hello world", not "Hello — world"):

*** Settings ***

Library    HttpLibrary.HTTP

*** Test Cases ***

Test unicode
    Create Http Context  supl.cz  http
    GET  /page1.html
    Response Body Should Contain  Hello — world

It looks like we need support for decoding the response body to Unicode so that it can be compared with Unicode strings.
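
For reference, the byte 0xe2 named in the error is consistent with the first byte of the UTF-8 encoding of the dash, assuming it is U+2014 as it appears in the page. Below is a minimal Python 3 sketch, independent of HttpLibrary, that reproduces the same kind of error and shows that decoding with the charset the page declares avoids it. The helper name `try_ascii_decode` and the sample body are illustrative, not part of the library.

```python
# Sample body as it would arrive over the wire: UTF-8 encoded bytes.
# (Assumption: the dash in page2.html is U+2014.)
body = "Hello \u2014 world".encode("utf-8")


def try_ascii_decode(data):
    """Return the decoded text, or the error message if ASCII decoding fails."""
    try:
        return data.decode("ascii")
    except UnicodeDecodeError as exc:
        return str(exc)


# Decoding as ASCII fails on the first byte of the dash.
ascii_result = try_ascii_decode(body)

# Decoding with the declared charset gives text that compares cleanly.
text = body.decode("utf-8")

print(ascii_result)
print("Hello" in text)
```

The point of the sketch is only that the comparison has to happen on decoded text, or on bytes on both sides, rather than on a mix of the two.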

peritus self-assigned this Jun 11, 2014

naro commented Jun 11, 2014

It seems that adding a new method, response_text_should_contain, would solve this issue. It would use exactly the same code as response_body_should_contain, except that it would check self.response.text instead of self.response.body.
'body' is a byte string and 'text' is a unicode string, which is what I'm looking for.


naro commented Jun 11, 2014

and also get_response_text would be helpful in that case :)
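
A rough sketch of what the two proposed keywords might look like, written against stand-in classes rather than the real library. `FakeResponse` and `HTTPSketch` are hypothetical names invented for this example; the only things taken from the thread are the method names and the idea that `body` holds bytes while `text` holds the decoded string.

```python
class FakeResponse:
    """Hypothetical stand-in for the library's response object."""

    def __init__(self, body, charset="utf-8"):
        self.body = body        # raw bytes
        self.charset = charset  # assumed to come from the Content-Type header

    @property
    def text(self):
        # Decoded counterpart of body.
        return self.body.decode(self.charset)


class HTTPSketch:
    """Hypothetical keyword holder; not the real HttpLibrary.HTTP class."""

    def __init__(self, response):
        self.response = response

    def get_response_text(self):
        """Return the response body decoded to a unicode string."""
        return self.response.text

    def response_text_should_contain(self, should_contain):
        """Fail unless the decoded response body contains should_contain."""
        text = self.response.text
        assert should_contain in text, (
            '"%s" should have contained "%s", but did not.' % (text, should_contain)
        )


http = HTTPSketch(FakeResponse("Hello \u2014 world".encode("utf-8")))
http.response_text_should_contain("Hello")
http.response_text_should_contain("Hello \u2014 world")
```

If keywords like these existed, the failing test above would presumably call Response Text Should Contain in place of Response Body Should Contain.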

blunttester commented

If there is a Unicode character in the URL (e.g. é), the HttpLibrary GET cannot be completed either. So the error does not occur only when handling response.body.
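
One possible workaround on the test side, not confirmed to be what HttpLibrary needs, is to percent-encode the path before passing it to GET. Here is a Python 3 sketch using the standard library's `urllib.parse.quote`; the path `/café.html` is a made-up example.

```python
from urllib.parse import quote, unquote

# Hypothetical path containing a non-ASCII character.
path = "/caf\u00e9.html"

# quote() encodes the character as UTF-8 and percent-escapes each byte;
# "/" is left untouched by default.
encoded = quote(path)

print(encoded)           # pure-ASCII form, safe to put in a request line
print(unquote(encoded))  # round-trips back to the original path
```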
