
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd7 in position 27673: invalid continuation byte #3129

Open
sentry-io bot opened this issue Jul 25, 2022 · 0 comments
Labels
bug Issue type priority-low Priority hint

Comments


sentry-io bot commented Jul 25, 2022

Sentry Issue: PERMA-B

This has been caught 92 times already, including 36 times in the last 24 hours. Evidently, it's not uncommon for robots.txt files not to be UTF-8 encoded. Let's see if we can be smarter about decoding them, even if this error doesn't cause entire captures to fail.

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd7 in position 27673: invalid continuation byte
  File "threading.py", line 892, in run
    self._target(*self._args, **self._kwargs)
  File "perma/tasks.py", line 613, in robots_txt_thread
    content = str(robots_txt_response.content, 'utf-8')
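
One possible direction, as a minimal sketch rather than a tested fix for Perma: try strict UTF-8 first, then any encoding the server declared, and finally fall back to a lossy decode so the thread never raises. The helper name `decode_robots_txt` and its `declared_encoding` parameter are hypothetical and do not exist in `perma/tasks.py`.

```python
from typing import Optional


def decode_robots_txt(content: bytes, declared_encoding: Optional[str] = None) -> str:
    """Decode robots.txt bytes without raising UnicodeDecodeError.

    Hypothetical helper: the name and signature are assumptions for
    illustration, not part of the existing code base.
    """
    # Try strict UTF-8 first; "utf-8-sig" also strips a leading BOM.
    candidates = ["utf-8-sig"]
    if declared_encoding:
        candidates.append(declared_encoding)

    for encoding in candidates:
        try:
            return content.decode(encoding)
        except (UnicodeDecodeError, LookupError):
            # LookupError covers an unknown encoding name.
            continue

    # Last resort: keep the ASCII directives readable and replace each
    # undecodable byte with U+FFFD instead of failing.
    return content.decode("utf-8", errors="replace")
```

In the failing line, this would be called as `content = decode_robots_txt(robots_txt_response.content)`. The lossy fallback seems acceptable here because robots.txt directives such as `User-agent` and `Disallow` are ASCII, so a stray non-UTF-8 byte in a comment or path should not stop the rest of the file from being parsed.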

matteocargnelutti added the bug and priority-low labels on Aug 9, 2022