speedup POE::Filter::Line by reducing usage of backtracking regex#19
Open
apocalypse wants to merge 4 commits intorcaputo:masterfrom
Open
speedup POE::Filter::Line by reducing usage of backtracking regex#19apocalypse wants to merge 4 commits intorcaputo:masterfrom
apocalypse wants to merge 4 commits intorcaputo:masterfrom
Conversation
Owner
|
I'll have to look at this in more depth, probably next week. Something clever might be possible with substr() and pos() after the match you've added, to eliminate the substitution entirely. |
Collaborator
Author
|
I thought about this more and tweaked it a bit :) The issue here is that I can't entirely avoid the regex because it might be needed when using InputLiteral or InputRegexp another than the default. I think this collection of patches is the best I can do, please let me know if you have more feedback! |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Hello,
I was playing with the SSLify testsuite in an attempt to improve it and stumbled on something totally
unexpected! The t/simple_parallel_superbig.t test sends a ~3MB packet via Server/Client::TCP and Filter::Line. I rushed out SSLify with that test but I did notice that it took a while to run (~180s on my beefy box!) and it made me feel that something was fishy.
So I brought out the big guns and ran NYTProf on the test.
apoc@box:~/eclipse_ws/perl-poe-sslify$ PERL_MM_USE_DEFAULT=1 AUTOMATED_TESTING=1 NYTPROF=blocks=1:calls=2 perl -d:NYTProf t/simple_superbig.t
Here's the NYTProf output:
http://htmlpreview.github.io/?https://github.com/apocalypse/perl-poe-sslify/blob/speedup_sslify/nytprof_LINE/index.html
Thanks to NYTProf, the slowdown was somewhere in POE::Filter::Line. I immediately jumped to the
conclusion that POE::Filter::Line was not really suited for my purposes and decided to re-run the test with
POE::Filter::Block utilizing BlockSize => length($str). The results was surprising!
http://htmlpreview.github.io/?https://github.com/apocalypse/perl-poe-sslify/blob/speedup_sslify/nytprof_BLOCK/index.html
For starters, the runtime went from 183s to 3s in my crude testing! The difference was just too great
for me to let Filter::Line lose to Filter::Block! I dug deeper into the source for Filter::Line and based
on the NYTProf diagnostics it was THIS line taking up 180s out of the total runtime of 183s which was crazy. Line 155 is:
unless $self->[FRAMING_BUFFER] =~ s/^(.*?)$self->[INPUT_REGEXP]//s;
You'll note that it's a super-backtracking regex that when confronted with a 3MB string will take
exponential time trying to match the INPUT_REGEXP! I remember hitting weird regex bugs in the past which looked like this and then the answer dawned on me. The solution is to simply avoid doing the costly backtracking unless necessary. Here's the oneliner patch :)
Insert before lines 154/155
last LINE unless $self->[FRAMING_BUFFER] =~ $self->[INPUT_REGEXP];
The runtime went from 183s to 11.5s which is a speed-up of almost 16x! It's not reaching the 3s that
Filter::Block was able to achieve, but I doubt I can speed it up more without calling in the regex wizards :)
http://htmlpreview.github.io/?https://github.com/apocalypse/perl-poe-sslify/blob/speedup_sslify/nytprof_LINE_FIXED/index.html
With grep.cpan.me I can see ~130 dists using POE::Filter::Line so I hope this will have a meaningful
impact for some of them. Thanks again for looking into this :)
http://grep.cpan.me/?q=POE%3A%3AFilter%3A%3ALine