Don't record startup time. by vext01 · Pull Request #17 · ykjit/haste

vext01 · 2026-02-11T15:09:21Z

This is a draft of the haste design discussed in-person yesterday.

In summary: because haste currently times a whole process execution, starting with spawning the interpreter, up until the process exits, we are measuring startup time, including the loader etc.

The design we outlined yesterday is: haste should not run the language-specific harness (e.g. harness.lua) directly, but instead run a wrapper script (what I'm calling the "outer harness), which scrapes a more precise process execution time (minus startup) and stores it into a temporary file path provided by haste. Haste then reads the result out of this path after the benchmark is complete.

So the call chain (for yk-benchmarks) would look like:

haste b, which calls:
- outer_harness.sh /tmp/.tmpxsc2cT lua bigloop 10 1000000, which calls:
  - lua harness.lua bigloop 10 1000000

(Note that the outer harness can be written in any language, not necessarily sh)

This draft implements this design and shows how it would work on the in-tree example/, but having implemented this, I don't find the design very ergonomic. As a user, I'd wonder why the authors made it so convoluted.

I wonder if we can come up with a better design.

I'd be tempted to have just one harness (e.g. harness.lua) and this harness accepts the path to the output file and is responsible for writing the result to it. That way we don't have this wrapper script indirection and the scraping.

The only spanner in the works is that we can't break rebench. We'd either have to find a way to make harness.lua work for both rebench and haste, or have two copies of harness.lua one for each system. The former might be possible with an optional last argument for the output file: and only haste passes it.

(Don't review the code in detail -- i haven't tidied it up much)

Any thoughts? Go ahead with this design, or try something else?

ltratt · 2026-02-11T15:10:52Z

Don't forget: you don't have to have a wrapper script! We're doing that for yk-benchmarks because it means we don't have to alter harness.lua. But we could have altered harness.lua and avoided a wrapper script.

ltratt · 2026-02-11T15:15:26Z

One thing I think we might have got wrong with haste is that it has a notion of "which thing am I running" (e.g. lua or yklua). That means we have to pass that awkward second argument (lua) in the example above. In retrospect the bench command should probably take the command you want to run (e.g.haste b yklua_wrapper.sh). In that sense, the ideal is that haste becomes dumber: it's almost just a "run things, record them in a database, show them, and understand as little as possible".

ltratt · 2026-02-11T15:17:14Z

FWIW, overall this looks fine to me, and I think we should merge it (my only minor comment is that, perhaps, I wouldn't use a wrapper in the example, as I think it might suggest to readers that you have to use a wrapper, which isn't the case).

As users of haste, we won't notice the complexity with executors and whatnot. We'll just notice that we've got less noisy data!

vext01 · 2026-02-11T15:21:24Z

Yes. This is indeed annoying. I realised early on that I would need to juggle the arguments unless we only want to allow wrappers that are run by the same interpreter as you are benchmarking, which seems limiting.

In retrospect the bench command should probably take the command you want to run (e.g.haste b yklua_wrapper.sh). In that sense, the ideal is that haste becomes dumber: it's almost just a "run things, record them in a database, show them, and understand as little as possible".

I can give this a shot.

So under that design, is yklua_wrapper.sh called once for each benchmark and each executor?

ltratt · 2026-02-11T15:22:23Z

If we're going to go down the route of more haste work, I would suggest getting rid of the concept of "executor". But, right now, I'm not sure that's worth the effort. IMHO this PR improves our benchmarking, even if it's a bit awkward. I'm inclined to merge it, and let you get back to other matters. Sound like a plan?

vext01 · 2026-02-11T15:22:32Z

We raced. So I should carry on with the design we first drafted?

vext01 · 2026-02-12T09:58:56Z

Force pushed changes required to get it through CI.

ltratt · 2026-02-12T10:00:02Z

@vext01 Dumb question. In this:

if [ $? -ne 0 ]; then
    s=$?

is s picking up the exit code of [...] or the command before [...]?

In summary: because haste currently times a whole process execution, starting with spawning the interpreter, up until the process exits, we are measuring startup time, including the loader etc. This change makes it so we only measure the actual benchmarking work. This has been somewhat (pun intended) hastily hacked in for now. We hope to be able to make the use of a wrapper script optional in the future.

vext01 · 2026-02-12T10:02:54Z

Force pushed something to make that more obviously correct.

The joys of shell scripting, eh?

ltratt · 2026-02-12T10:03:26Z

This is ready for merging now I think? If so, please undraft.

vext01 · 2026-02-12T10:03:44Z

Ready.

vext01 assigned ltratt Feb 11, 2026

vext01 force-pushed the dont-record-startup branch from b925d9e to eb7c61d Compare February 12, 2026 09:58

vext01 changed the title ~~[DRAFT] Don't record startup time.~~ Don't record startup time. Feb 12, 2026

vext01 force-pushed the dont-record-startup branch from eb7c61d to a572003 Compare February 12, 2026 10:02

vext01 marked this pull request as ready for review February 12, 2026 10:03

ltratt added this pull request to the merge queue Feb 12, 2026

Merged via the queue into ykjit:main with commit d5cec21 Feb 12, 2026
2 checks passed

vext01 deleted the dont-record-startup branch February 12, 2026 10:09

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Comments

Don't record startup time.#17

Don't record startup time.#17
ltratt merged 1 commit intoykjit:mainfrom
vext01:dont-record-startup

vext01 commented Feb 11, 2026 •

edited

Loading

Uh oh!

ltratt commented Feb 11, 2026

Uh oh!

ltratt commented Feb 11, 2026

Uh oh!

ltratt commented Feb 11, 2026

Uh oh!

vext01 commented Feb 11, 2026

Uh oh!

ltratt commented Feb 11, 2026

Uh oh!

vext01 commented Feb 11, 2026

Uh oh!

vext01 commented Feb 12, 2026

Uh oh!

ltratt commented Feb 12, 2026

Uh oh!

vext01 commented Feb 12, 2026

Uh oh!

ltratt commented Feb 12, 2026

Uh oh!

vext01 commented Feb 12, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Comments

Conversation

vext01 commented Feb 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ltratt commented Feb 11, 2026

Uh oh!

ltratt commented Feb 11, 2026

Uh oh!

ltratt commented Feb 11, 2026

Uh oh!

vext01 commented Feb 11, 2026

Uh oh!

ltratt commented Feb 11, 2026

Uh oh!

vext01 commented Feb 11, 2026

Uh oh!

vext01 commented Feb 12, 2026

Uh oh!

ltratt commented Feb 12, 2026

Uh oh!

vext01 commented Feb 12, 2026

Uh oh!

ltratt commented Feb 12, 2026

Uh oh!

vext01 commented Feb 12, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

vext01 commented Feb 11, 2026 •

edited

Loading