-
-
Notifications
You must be signed in to change notification settings - Fork 31.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-118761: email.quoprimime
removing re
import
#132046
base: main
Are you sure you want to change the base?
Conversation
re
importemail.quoprimime
removing re
import
The PR: will probably drastically improve the speed as well as once |
I did not notice that the warmup needed for regex: 153.9974 ± 35.97 (103 to 1778; n=10000)
non_regex: 148.4565 ± 25.48 (125 to 991; n=10000) |
( the new _HEX_TO_CHAR cache could also be used for the # Decode if in form =AB
elif i+2 < n and line[i+1] in hexdigits and line[i+2] in hexdigits:
decoded += unquote(line[i:i+3]) |
Slightly faster |
Adding the '=' check now speeds things up:
|
As a comparison (if you compile the regex for the function + add early exit) c = re.compile("=[a-fA-F0-9]{2}", flags=re.ASCII)
def header_decode_re(s):
"""Decode a string using regex."""
s = s.replace('_', ' ') # Replace underscores with spaces
if '=' in s:
return c.sub(_unquote_match, s)
return s
|
Co-authored-by: Adam Turner <[email protected]>
@AA-Turner, what's your opinion on replacing this regex expression (even though it sometimes makes the algorithm slower)? |
Very slight improvement (mainly on the
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is slower and harder to maintain, so I'm -1 on this PR
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
Actually I don't understand this PR. It looks like we still import |
I'm a bit lost on the current benchmarks, but the most recent comment (with
Through A |
I agree this is odd. I've been using the below (rough) script to benchmark import times, for more data points than just a single run.
|
This pull request removes the
re
module from theemail.quoprimime
, thus increasing the import speed from5676
us to3669
us (a 60% import speed increase );From
To
however, the new implementation does increase the compute time
So it is very possible that this is not worth it.
Issues:
re
uses #130167