
Consider adding high-precision microbenchmarks for small sequences #148

Closed
ldionne opened this issue Aug 24, 2016 · 7 comments

@ldionne (Owner) commented Aug 24, 2016

Forking #124 to discuss high-precision microbenchmarks for small inputs in more detail. The idea is that we'd like a precise view of the behavior of algorithms on small sequences, because this is how they are mostly (but not only) used in the wild. I see two different approaches:

  1. For each sequence length N, generate K different sequences of length N and call the algorithm once on each sequence (see the sketch below). With a sufficiently large K, the total compilation time increases and the relative measurement error shrinks.
  2. Same as (1), but then divide the result by K to get an approximation of the absolute time taken by a single call of the algorithm. I have reservations about this approach, because compilation time is not necessarily linear in K (the number of small sequences), so dividing by K need not reflect the true per-call cost.
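
To make (1) concrete, here is a minimal sketch of what a single data point could look like, with a toy `length` metafunction standing in for the algorithm under measurement. None of the names here (`seq`, `x`, `make_seq`, `run_all`) come from this repository; the point is only that each of the K sequences uses distinct element types, so the compiler cannot memoize instantiations across repetitions:

```cpp
#include <cstddef>
#include <initializer_list>
#include <utility>

// Stand-in for the algorithm under measurement; a real benchmark would
// instantiate a metaprogramming library's algorithm here instead.
template <typename ...T> struct seq {};
template <typename Seq> struct length;
template <typename ...T> struct length<seq<T...>> {
    static constexpr std::size_t value = sizeof...(T);
};

// A distinct element type per (k, i) pair, so none of the K sequences
// share template instantiations the compiler could reuse.
template <std::size_t K, std::size_t I> struct x {};

// Builds the k-th sequence of length N: seq<x<k, 0>, ..., x<k, N-1>>.
template <std::size_t K, typename Is> struct make_seq;
template <std::size_t K, std::size_t ...Is>
struct make_seq<K, std::index_sequence<Is...>> {
    using type = seq<x<K, Is>...>;
};

constexpr std::size_t sum(std::initializer_list<std::size_t> xs) {
    std::size_t total = 0;
    for (auto v : xs) total += v; // C++14 relaxed constexpr
    return total;
}

// Instantiates the algorithm once on each of the K sequences of length N.
template <std::size_t N, typename Ks> struct run_all;
template <std::size_t N, std::size_t ...Ks>
struct run_all<N, std::index_sequence<Ks...>> {
    static constexpr std::size_t total = sum({
        length<typename make_seq<Ks, std::make_index_sequence<N>>::type>::value...
    });
};

// Compiling this translation unit exercises the algorithm K = 50 times on
// sequences of length N = 10; the benchmark records the compiler's wall time.
static_assert(run_all<10, std::make_index_sequence<50>>::total == 500, "");
int main() {}
```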
@brunocodutra (Collaborator) commented Aug 24, 2016

Would K be constant for all N for a given algorithm? If so, I'd be OK with the first option. If not, I think we'd have to stick with the second one, because otherwise the charts would be unintuitive to the viewer.

Edit: Just to clarify: since the relative overhead already decreases for larger N, a smaller K suffices there, so having K decrease as N increases would keep build times reasonably short. This is quite apparent in benchmarks that take over a second for N > 100, where the noise basically disappears even on Travis.

@ldionne (Owner, Author) commented Aug 24, 2016

Would K be constant for all N for a given algorithm?

We can choose to go either way. I agree that decreasing K as N increases would help keep the build times short, and that having a large K for large Ns would not be useful. However, I'm unclear on how we'd specify which K to associate with which N.

@brunocodutra (Collaborator) commented

I'm unclear on how we would specify which K to associate to which N

I believe choosing K so that K*N = <some constant> should be good enough.
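
For illustration, a tiny sketch of that rule, assuming a made-up instantiation budget of 1000 (the `k_for` helper and the budget value are hypothetical, not part of the benchmark suite): N = 10 gives K = 100, N = 100 gives K = 10, clamped so K is at least 1 for very large N.

```cpp
#include <cstddef>

// Hypothetical helper: pick K so that K * N stays near a fixed budget.
// Requires n >= 1; the budget of 1000 is made up for illustration.
constexpr std::size_t k_for(std::size_t n, std::size_t budget = 1000) {
    return budget / n < 1 ? 1 : budget / n; // clamp so K >= 1 for large N
}

static_assert(k_for(10)   == 100, ""); // small N: many repetitions
static_assert(k_for(100)  == 10,  ""); // larger N: fewer repetitions
static_assert(k_for(5000) == 1,   ""); // very large N: just one call
int main() {}
```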

We can choose to go either way.

So let's take one step at a time. Since the first option is basically a special case of the second one, why don't we implement (1) and then, if necessary, refine it into (2)?

@ldionne (Owner, Author) commented Aug 25, 2016

So let's take one step at a time. Since the first option is basically a special case of the second one, why don't we implement (1) and then, if necessary, refine it into (2)?

Sounds good. That raises (at least) one question: do we want to tweak the existing benchmarks so that they have higher precision for smaller sequences, or do we want to create a new set of benchmarks for "small sequences"? The first option would obviously be better if it can be done, but it might end up taking too much Travis time, depending on how smart we can be.

@brunocodutra (Collaborator) commented Aug 25, 2016

do we want to tweak the existing benchmarks so that they have higher precision for smaller sequences, or do we want to create a new set of benchmarks for "small sequences"?

I'd say a bit of both: reduce the maximum size of sequences, so that we don't abuse Travis too much, and make sure all measurements are precise enough that results don't vary much across runs. As for a number, @ericniebler suggested we focus on < 100 elements. I have no strong opinion on this, but I think it's a reasonable suggestion.

I'm just not very comfortable with the idea of heuristically fine-tuning K for each algorithm until the results "look good", but I'm afraid we have no better way?

@ldionne (Owner, Author) commented Aug 26, 2016

I don't see a better way either.

@ldionne (Owner, Author) commented Oct 23, 2016

This has been resolved by #150.

@ldionne closed this as completed Oct 23, 2016