Allow multiple encoding formats apart from UTF-8 #8

Open
timonoortman-aas opened this issue Feb 9, 2021 · 0 comments · May be fixed by #10


timonoortman-aas commented Feb 9, 2021

Currently read() only seems to accept SSIM files encoded as UTF-8. For a UTF-16 file, the following error is raised:
[image: screenshot of the error]

With a small change, more encodings could be accepted, e.g. by replacing

    with open(file, "r") as f:
        text = f.read()

in read(), by

    import os
    import chardet

    # Sniff the encoding from the first bytes of the file.
    n_bytes = min(32, os.path.getsize(file))
    with open(file, "rb") as f:
        raw = f.read(n_bytes)
    encoding = chardet.detect(raw)["encoding"]

    with open(file, "r", encoding=encoding) as f:
        text = f.read()
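
If chardet cannot settle on an encoding from such a short sample, or an extra dependency is not wanted, checking for a byte order mark (BOM) might be enough for the UTF-16 case. Below is a rough sketch of that idea using only the standard library. The names `sniff_encoding` and `read_text` are made up for illustration and are not part of the project, and a file without a BOM is simply assumed to be UTF-8.

```python
import codecs

# BOM signatures, UTF-32 first so that its little-endian BOM
# (which starts with the UTF-16 LE BOM) is not mistaken for UTF-16.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_encoding(path, default="utf-8"):
    """Return a codec name based on the file's BOM, or `default` if none is found."""
    with open(path, "rb") as f:
        head = f.read(4)  # the longest BOM is 4 bytes
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return default


def read_text(path):
    """Read the whole file as text using the sniffed encoding (hypothetical helper)."""
    with open(path, "r", encoding=sniff_encoding(path)) as f:
        return f.read()
```

The "utf-16", "utf-32" and "utf-8-sig" codecs consume the BOM themselves, so the returned text does not start with a stray `\ufeff`. Legacy single-byte encodings would still need something like chardet.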

timonoortman-aas linked a pull request Feb 25, 2021 that will close this issue