UnicodeDecodeError: 'utf-8' codec can't decode byte 0xab in position 289: invalid start byte #8

Open
MahadevSV opened this issue Apr 26, 2019 · 1 comment

Hi, I am getting an error while loading the files into a dataframe.
Please advise.
Here are the code and the error:

# Read and add data from file to a list
i = 0
for f in labelled_files:
    data_list.append((f, label_names[label_index[i]], Path(f).read_text()))
    i += 1


UnicodeDecodeError Traceback (most recent call last)
in ()
      2 i=0
      3 for f in labelled_files:
----> 4     data_list.append((f,label_names[label_index[i]],Path(f).read_text()))
      5     i += 1

~/anaconda3/lib/python3.6/pathlib.py in read_text(self, encoding, errors)
1195 """
1196 with self.open(mode='r', encoding=encoding, errors=errors) as f:
-> 1197 return f.read()
1198
1199 def write_bytes(self, data):

~/anaconda3/lib/python3.6/codecs.py in decode(self, input, final)
319 # decode input (taking the buffer into account)
320 data = self.buffer + input
--> 321 (result, consumed) = self._buffer_decode(data, self.errors, final)
322 # keep undecoded input until the next call
323 self.buffer = data[consumed:]

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xab in position 289: invalid start byte

gocen commented Jan 3, 2020

This worked for me, passing an explicit encoding to read_text:
data_list.append((f, label_names[label_index[i]], Path(f).read_text(encoding='cp1252')))
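If the files come from mixed sources, hard-coding one encoding may not fit all of them. Below is a minimal sketch of a fallback reader, assuming the files are either UTF-8 or cp1252. That is a guess based on the 0xab byte in the error, which is the « character in cp1252. The helper name read_text_with_fallback and the encoding list are hypothetical, not part of this project.

```python
import tempfile
from pathlib import Path


def read_text_with_fallback(path, encodings=("utf-8", "cp1252")):
    """Return the text of `path`, trying each encoding in order.

    Hypothetical helper: the encoding list is an assumption, adjust it
    to whatever your files actually use.
    """
    raw = Path(path).read_bytes()
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue  # this encoding does not fit, try the next one
    # Nothing decoded cleanly: keep going, but mark the undecodable bytes.
    return raw.decode(encodings[0], errors="replace")


# Small demonstration with a temporary file that is NOT valid UTF-8.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_bytes(b"caf\xe9 \xabquote\xbb")  # cp1252-encoded bytes
    sample_text = read_text_with_fallback(sample)

print(sample_text)
```

In the loop above, Path(f).read_text() would then become read_text_with_fallback(f). Note that cp1252 accepts almost any byte, so a file in some other single-byte encoding would decode without an error but with the wrong characters. It is worth spot-checking a few results.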
