Don't forget: you don't have to have a wrapper script! We're doing that for yk-benchmarks because it means we don't have to alter harness.lua. But we could have altered harness.lua and avoided a wrapper script. |
|
One thing I think we might have got wrong with haste is that it has a notion of "which thing am I running" (e.g. lua or yklua). That means we have to pass that awkward second argument ( |
|
FWIW, overall this looks fine to me, and I think we should merge it (my only minor comment is that, perhaps, I wouldn't use a wrapper in the example, as I think it might suggest to readers that you have to use a wrapper, which isn't the case). As users of haste, we won't notice the complexity with |
|
Yes. This is indeed annoying. I realised early on that I would need to juggle the arguments unless we only want to allow wrappers that are run by the same interpreter as you are benchmarking, which seems limiting.
I can give this a shot. So under that design, is |
|
If we're going to go down the route of more haste work, I would suggest getting rid of the concept of "executor". But, right now, I'm not sure that's worth the effort. IMHO this PR improves our benchmarking, even if it's a bit awkward. I'm inclined to merge it, and let you get back to other matters. Sound like a plan? |
|
We raced. So I should carry on with the design we first drafted? |
Force-pushed: b925d9e to eb7c61d
|
Force pushed changes required to get it through CI. |
|
@vext01 Dumb question. In this:
    if [ $? -ne 0 ]; then
        s=$?
is |
In summary: because haste currently times a whole process execution, starting with spawning the interpreter, up until the process exits, we are measuring startup time, including the loader etc. This change makes it so we only measure the actual benchmarking work. This has been somewhat (pun intended) hastily hacked in for now. We hope to be able to make the use of a wrapper script optional in the future.
Force-pushed: eb7c61d to a572003
|
Force pushed something to make that more obviously correct. The joys of shell scripting, eh? |
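For anyone skimming the thread, a minimal sketch of the general pitfall with the quoted pattern: in POSIX shell, $? is overwritten by every command, including the [ test itself, so reading it again inside the if body gives the status of the test rather than of the original command. The names failing_cmd, rc, s_pitfall and s_fixed are made up for illustration; this is not the code from the PR.

```shell
#!/bin/sh
# Hypothetical stand-in for a benchmark run that fails with exit status 3.
failing_cmd() { return 3; }

# Pitfall: by the time the body runs, $? holds the status of `[`,
# which succeeded, so the original status (3) is lost.
failing_cmd
if [ $? -ne 0 ]; then
    s_pitfall=$?
fi
echo "pitfall: s=$s_pitfall"

# More obviously correct: save the status first, then test the saved copy.
failing_cmd
rc=$?
if [ "$rc" -ne 0 ]; then
    s_fixed=$rc
fi
echo "fixed: s=$s_fixed"
```

Saving the status into a variable immediately after the command keeps the intent visible and survives later edits that add commands in between.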
|
This is ready for merging now I think? If so, please undraft. |
|
Ready. |
This is a draft of the haste design discussed in person yesterday.
In summary: because haste currently times a whole process execution, starting with spawning the interpreter, up until the process exits, we are measuring startup time, including the loader etc.
The design we outlined yesterday is: haste should not run the language-specific harness (e.g. harness.lua) directly, but instead run a wrapper script (what I'm calling the "outer harness"), which scrapes a more precise process execution time (minus startup) and writes it to a temporary file whose path is provided by haste. Haste then reads the result out of this path after the benchmark is complete.
So the call chain (for yk-benchmarks) would look like:
    haste b, which calls:
    outer_harness.sh /tmp/.tmpxsc2cT lua bigloop 10 1000000, which calls:
    lua harness.lua bigloop 10 1000000
(Note that the outer harness can be written in any language, not necessarily sh.)
This draft implements this design and shows how it would work on the in-tree example/, but having implemented this, I don't find the design very ergonomic. As a user, I'd wonder why the authors made it so convoluted.
I wonder if we can come up with a better design.
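To make the wrapper design concrete, here is a rough sketch of what an outer harness could look like. It is not the script from this PR. It assumes, purely for illustration, that the inner harness prints a line of the form "elapsed_ms: <number>" on stdout; that format, the outer_harness function and the fake_inner_harness stand-in are all hypothetical.

```shell
#!/bin/sh
# Hypothetical outer harness: outer_harness <result-file> <command...>
# Assumption (not confirmed by the PR): the inner harness prints a line
# "elapsed_ms: <number>" on stdout.
outer_harness() {
    result_file=$1
    shift
    # Run the inner harness and capture everything it prints.
    output=$("$@") || return $?
    # Scrape the self-reported time, which excludes interpreter startup.
    elapsed=$(printf '%s\n' "$output" | sed -n 's/^elapsed_ms: *//p' | tail -n 1)
    [ -n "$elapsed" ] || return 1
    # Store it where the caller (haste, in the design above) will read it.
    printf '%s\n' "$elapsed" > "$result_file"
}

# Stand-in for `lua harness.lua bigloop 10 1000000`, so the sketch runs
# without Lua installed.
fake_inner_harness() {
    echo "running $1"
    echo "elapsed_ms: 1234"
}

tmp=$(mktemp)
outer_harness "$tmp" fake_inner_harness bigloop 10 1000000
result=$(cat "$tmp")
rm -f "$tmp"
echo "$result"   # prints 1234
```

The argument juggling is visible even in this small sketch: the wrapper has to peel off the result path before it can forward the rest of the command line.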
I'd be tempted to have just one harness (e.g. harness.lua) and this harness accepts the path to the output file and is responsible for writing the result to it. That way we don't have this wrapper script indirection and the scraping.
The only spanner in the works is that we can't break rebench. We'd either have to find a way to make harness.lua work for both rebench and haste, or have two copies of harness.lua, one for each system. The former might be possible with an optional last argument for the output file, which only haste passes.
(Don't review the code in detail -- I haven't tidied it up much.)
Any thoughts? Go ahead with this design, or try something else?
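For comparison, the optional-last-argument idea could be sketched roughly as follows. It is written in shell rather than Lua only to keep the example self-contained; the harness function, its argument order, the stdout format and the fixed elapsed_ms value are all placeholders, not the real harness.lua interface.

```shell
#!/bin/sh
# Hypothetical single harness:
#   harness <benchmark> <iterations> <param> [result-file]
# The optional last argument is only passed by haste; rebench omits it.
harness() {
    benchmark=$1
    iters=$2
    param=$3
    result_file=${4:-}
    # Placeholder for the timed work; a real harness would time only the
    # benchmark loop here.
    elapsed_ms=42
    # Always report on stdout, as a stdout-parsing runner would expect.
    echo "$benchmark: iterations=$iters param=$param elapsed_ms=$elapsed_ms"
    # Additionally write the bare number to the file when a path is given.
    if [ -n "$result_file" ]; then
        printf '%s\n' "$elapsed_ms" > "$result_file"
    fi
}

# rebench-style call: no result file.
rebench_out=$(harness bigloop 10 1000000)

# haste-style call: pass a temporary path and read the result back.
tmp=$(mktemp)
harness bigloop 10 1000000 "$tmp" > /dev/null
haste_result=$(cat "$tmp")
rm -f "$tmp"
echo "$haste_result"
```

With this shape there is no wrapper and no scraping, at the cost of the harness needing to know about the extra argument.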