utf-8 values are double-coded (at least from postgresql) #55

Open
hantusk opened this issue Jan 9, 2015 · 1 comment

hantusk commented Jan 9, 2015

In a dataframe resulting from e.g. db.tables.table.all(), UTF-8 values from PostgreSQL were double-encoded (encoded as UTF-8 twice).

When I later had to save my dataframe to an Excel sheet or a .csv file, I found after some troubleshooting that I had to call .decode('utf-8') on every value in the dataframe before it would export.
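For what it's worth, here is a minimal Python 3 sketch of what "double-encoded" means and how a decode undoes it. The original report is Python 2 era (where str.decode('utf-8') played this role), and the sample value is hypothetical, not taken from any actual database:

```python
# Minimal sketch (Python 3) of the "double-encoded" failure mode described
# above; the sample value is hypothetical.
text = "håndbog"

once = text.encode("utf-8")                     # correct UTF-8 bytes
# Double encoding: the UTF-8 bytes are misread as Latin-1 characters and
# encoded to UTF-8 a second time.
twice = once.decode("latin-1").encode("utf-8")
assert twice != once

# Undoing it reverses both steps: decode UTF-8, reinterpret the resulting
# characters as Latin-1 bytes, then decode UTF-8 again.
fixed = twice.decode("utf-8").encode("latin-1").decode("utf-8")
assert fixed == text
```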


fnielsen commented Mar 2, 2015

I have another UTF-8 problem, with MySQL, Python 2.7, and db.py 0.4.0.

As far as I can tell, the database I connect to is UTF-8:

mysql> SELECT default_character_set_name FROM information_schema.SCHEMATA S
    -> WHERE schema_name = "schema_name";
+----------------------------+
| default_character_set_name |
+----------------------------+
| utf8                       |
+----------------------------+

df = database.tables.table.all() gets me data as str rather than unicode. I then do unicode(cell, 'iso8859'), which so far seems to work. cell.decode('utf-8'), as @hantusk used, does not work for me.

Update: I suppose that cell.decode('unicode_escape') is, in my case, better than unicode(cell, 'iso8859').
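A short Python 3 sketch of the two work-arounds above, since Python 2's unicode(cell, 'iso8859') corresponds to bytes.decode('iso8859-1') in Python 3. The cell values are made-up examples, not real query results:

```python
# Hypothetical cell values illustrating the two decodes discussed above.

# unicode(cell, 'iso8859') in Python 2 is roughly bytes.decode('iso8859-1'):
cell = b"h\xe5ndbog"                  # raw Latin-1 / ISO-8859-1 bytes
assert cell.decode("iso8859-1") == "håndbog"

# 'unicode_escape' instead interprets literal backslash escapes, which helps
# when the driver hands back escaped text rather than the characters
# themselves.
cell = b"h\\xe5ndbog"                 # contains a literal backslash escape
assert cell.decode("unicode_escape") == "håndbog"
```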
