utf-8 values are double-coded (at least from postgresql) #55

Open
hantusk opened this issue Jan 9, 2015 · 1 comment

hantusk commented Jan 9, 2015

In a dataframe resulting from e.g. db.tables.table.all(), UTF-8 values from PostgreSQL were double-encoded (encoded as UTF-8 twice).

When I later had to save my dataframe to an Excel sheet or a .csv file, I found after some troubleshooting that I had to call .decode('utf-8') on every value in the dataframe before it would export.
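For what it's worth, here is a minimal Python 3 sketch of what "double-encoded" means and how a decode undoes it. The original report is Python 2 era (where str.decode('utf-8') played this role), and the sample value is hypothetical, not taken from any actual database:

```python
# Minimal sketch (Python 3) of the "double-encoded" failure mode described
# above; the sample value is hypothetical.
text = "håndbog"

once = text.encode("utf-8")                     # correct UTF-8 bytes
# Double encoding: the UTF-8 bytes are misread as Latin-1 characters and
# encoded to UTF-8 a second time.
twice = once.decode("latin-1").encode("utf-8")
assert twice != once

# Undoing it reverses both steps: decode UTF-8, reinterpret the resulting
# characters as Latin-1 bytes, then decode UTF-8 again.
fixed = twice.decode("utf-8").encode("latin-1").decode("utf-8")
assert fixed == text
```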


fnielsen commented Mar 2, 2015

I have another UTF-8 problem, with MySQL, Python 2.7, and db.py 0.4.0.

As far as I can tell, the database I connect to is UTF-8:

mysql> SELECT default_character_set_name FROM information_schema.SCHEMATA S
    -> WHERE schema_name = "schema_name";
+----------------------------+
| default_character_set_name |
+----------------------------+
| utf8                       |
+----------------------------+

df = database.tables.table.all() gets me data as str rather than unicode. I then do unicode(cell, 'iso8859'), which so far seems to work. cell.decode('utf-8'), as @hantusk used, does not work for me.

Update: I suppose that cell.decode('unicode_escape') is, in my case, better than unicode(cell, 'iso8859').
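A short Python 3 sketch of the two work-arounds above, since Python 2's unicode(cell, 'iso8859') corresponds to bytes.decode('iso8859-1') in Python 3. The cell values are made-up examples, not real query results:

```python
# Hypothetical cell values illustrating the two decodes discussed above.

# unicode(cell, 'iso8859') in Python 2 is roughly bytes.decode('iso8859-1'):
cell = b"h\xe5ndbog"                  # raw Latin-1 / ISO-8859-1 bytes
assert cell.decode("iso8859-1") == "håndbog"

# 'unicode_escape' instead interprets literal backslash escapes, which helps
# when the driver hands back escaped text rather than the characters
# themselves.
cell = b"h\\xe5ndbog"                 # contains a literal backslash escape
assert cell.decode("unicode_escape") == "håndbog"
```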
