
Why does this error still occur when the txt file is encoded as UTF-8? #98

Open

PhilrainV opened this issue Feb 18, 2020 · 2 comments

PhilrainV commented Feb 18, 2020

UnicodeDecodeError: 'gbk' codec can't decode byte 0xa8 in position 0: incomplete multibyte sequence

kathy98443 commented Apr 7, 2020

Did this happen while processing a file? What does your full code look like?

fanrongqitiancai commented Feb 5, 2022

I am running into the same problem.
My code is as follows:
import thulac
import codecs

thu1 = thulac.thulac()
thu1.cut_f("input.txt", "output.txt")
print('end')
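
A possible explanation, offered as a sketch rather than a confirmed diagnosis: in Python 3, `open()` without an explicit `encoding` argument uses the platform's locale encoding, which is GBK (cp936) on Chinese-language Windows. If `cut_f` opens `input.txt` that way (an assumption, not verified against the thulac source), a UTF-8 file would be decoded as GBK and could fail exactly like the traceback above. The standard-library-only example below reproduces that kind of failure and shows reading the file with an explicit encoding instead. The helper names `decodes_as` and `read_lines_utf8` are hypothetical and exist only for illustration.

```python
import os
import tempfile
from typing import List


def decodes_as(data: bytes, encoding: str) -> bool:
    """Return True if `data` can be decoded with `encoding`, else False."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


def read_lines_utf8(path: str) -> List[str]:
    """Read a text file as UTF-8, ignoring the platform default encoding.

    "utf-8-sig" behaves like "utf-8" but also drops a leading BOM,
    which some Windows editors add when saving as UTF-8.
    """
    with open(path, "r", encoding="utf-8-sig") as f:
        return [line.rstrip("\n") for line in f]


if __name__ == "__main__":
    # The UTF-8 bytes of a Chinese character are not valid GBK here.
    utf8_bytes = "中".encode("utf-8")
    print("valid as utf-8:", decodes_as(utf8_bytes, "utf-8"))
    print("valid as gbk:  ", decodes_as(utf8_bytes, "gbk"))

    # Write a small UTF-8 file and read it back with an explicit encoding.
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        with open(path, "w", encoding="utf-8") as f:
            f.write("我爱北京天安门\n")
        print(read_lines_utf8(path))
    finally:
        os.remove(path)
```

If this is indeed the cause, one workaround would be to read the lines this way and segment each one with `thu1.cut(line, text=True)`, then write the results out with `encoding="utf-8"`, rather than relying on `cut_f`. Running Python in UTF-8 mode (`python -X utf8`, available since Python 3.7) changes the default encoding used by `open()` and may also help.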
