Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UnicodeDecodeError #81

Open
gforcada opened this issue Apr 12, 2021 · 6 comments
Open

UnicodeDecodeError #81

gforcada opened this issue Apr 12, 2021 · 6 comments

Comments

@gforcada
Copy link
Member

BUG/PROBLEM REPORT (OR OTHER COMMON ISSUE)

I see, by the hundreds, on our production servers tracebacks like the following:

Traceback (most recent call last):
  File "/nix/store/4jgk3fpv1z54lv5dbw2g3747j8ncvq9r-python-2.7.14/lib/python2.7/logging/handlers.py", line 76, in emit
    if self.shouldRollover(record):
  File "/nix/store/4jgk3fpv1z54lv5dbw2g3747j8ncvq9r-python-2.7.14/lib/python2.7/logging/handlers.py", line 156, in shouldRollover
    msg = "%s\n" % self.format(record)
  File "/nix/store/4jgk3fpv1z54lv5dbw2g3747j8ncvq9r-python-2.7.14/lib/python2.7/logging/__init__.py", line 734, in format
    return fmt.format(record)
  File "/srv/s-derfreitag/.batou-shared-eggs/ZConfig-3.5.0-py2.7.egg/ZConfig/components/logger/formatter.py", line 342, in format
    s = self.formatMessage(record)
  File "/srv/s-derfreitag/.batou-shared-eggs/ZConfig-3.5.0-py2.7.egg/ZConfig/components/logger/formatter.py", line 323, in formatMessage
    raise e
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 160: ordinal not in range(128)
Logged from file AccessLogger.py, line 34

As seen, we are using python 2.7 still 😨 and ZConfig 3.5.0.

As that is on formatting logging, it is difficult to log anything around it to see what's the problem 🤷

I will make an internal patch, deploy it on a canary instance and in a few days report back my findings 👍

@dataflake
Copy link
Member

As far as I can see the error is not in ZConfig. Something is sent to your log that is not ASCII and the logger blows up on that.

@gforcada
Copy link
Member Author

Yes, indeed, ZConfig could try a bit harder to handle the encoding/decoding problem, so that at least one can see what actually gets there... I'm looking at it, no success so far :)

@jamadden
Copy link
Member

Based on the line numbers, ZConfig has already tried decoding twice:

def formatMessage(self, record):
try:
s = self._style.format(record)
except UnicodeDecodeError as e:
# Issue 25664. The logger name may be Unicode. Try again ...
try:
record.name = record.name.decode('utf-8')
s = self._style.format(record)
except UnicodeDecodeError:
raise e
return s

@dataflake
Copy link
Member

I'd start with finding out about the message that causes the issue. You could add some simple print statements before the raise e to see the record.

@gforcada
Copy link
Member Author

gforcada commented Apr 13, 2021

Finally I could get something out of it (and yes, sorry for the short signed me not seeing that ZConfig tires twice to handle UnicodeDecodeError's 😅 ):

Turns out that the problem is on ZServer/medusa/http_server while trying to log a request to the access log file. So random/broken URLs? 🤷

With this: derFreitag@0d4ef60 is enough to fix it. Would it be worth a pull request? 🤔

@jamadden
Copy link
Member

I'm not personally a fan of trying to decode the message. That seems to me like a downstream bug, where the caller of the library is confused about what is text and what is bytes (and what is a native string).

But I can concede that it might be a pragmatic thing to do, and it only affects the Python 2 formatter, where such things are admittedly easier to confuse. (However, watch out for issues trying to use that code on Python 3.)

So I guess my vote would be -0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants