Add report linearization by Liam-DeVoe · Pull Request #75 · Zac-HD/hypofuzz

Liam-DeVoe · 2025-04-16T20:56:38Z

Liam-DeVoe · 2025-04-21T16:33:10Z

The linearized-history view is currently the only dashboard view; an option to select runs-in-last-n-days is coming later.
Big PR; the important bits are `def linearize_reports` and the dataclasses in database.py.

Zac-HD

looking good!

Zac-HD · 2025-04-21T18:17:27Z

+    # all of this min_size=len(uuids) etc is going to lead to terrible shrinking.
+    # But the alternative of while draw(st.booleans()) will generate too-small
+    # collections. Use `more` from hypothesis internals?


Let's decompose this: generate a worker_and_run() or something, and then have reports() draw a list of those. We can pass in a list of (db_key, nodeid) pairs for each report to sample from; and have the multi-worker reports strategy add whatever offset we need after the fact if overlap=False (noting that we probably want a smaller upper bound on timestamps).

ok if I come back to this one? I definitely want a strong strategy here, but also want to write the overlapping case first before making it smarter
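
For reference, the "add whatever offset we need after the fact" step could look roughly like the sketch below. It is plain Python rather than a Hypothesis strategy, and it only covers the post-processing for `overlap=False`. The function name `make_non_overlapping`, the `gap` parameter, and the representation of a run as a bare list of timestamps are all assumptions for illustration, not code from this PR.

```python
def make_non_overlapping(
    runs: list[list[float]], gap: float = 1.0
) -> list[list[float]]:
    """Shift each run so that it starts after the previous run has ended.

    `runs` holds one list of report timestamps per worker run (hypothetical
    representation). Relative spacing inside a run is preserved.
    """
    shifted: list[list[float]] = []
    next_start = 0.0
    for timestamps in runs:
        if not timestamps:
            shifted.append([])
            continue
        # Move the earliest timestamp of this run to `next_start`.
        offset = next_start - min(timestamps)
        moved = [t + offset for t in timestamps]
        shifted.append(moved)
        # The following run may begin only after this one ends.
        next_start = max(moved) + gap
    return shifted
```

Because the offsets accumulate across runs, the generated per-run timestamps should stay small, which matches the note above about wanting a smaller upper bound on timestamps.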

Zac-HD

lgtm

Zac-HD · 2025-04-22T08:27:34Z

My bad for the hasty review, seriously, but this implementation is wrong once you have a restart - sorting by elapsed time will interleave all the runs, and then crash with assertion errors.

Having played around a bit in a branch (https://github.com/Zac-HD/hypofuzz/compare/zac/linearize), I think the solution is to change REPORTS to be a nested dictionary: nodeid -> worker_uuid -> list[Report] sorted by timestamp

With that structure, it's easy to derive the diffs just by iterating over the elements of each list
- then linearize by concatenating all the per-worker-uuid lists and sorting by timestamp, and dropping not-at-the-start replay entries.
- I actually think that computing diffs in the dashboard server kinda sucks; either we should take the space hit and put that in the database (ie denormalize a bit; our worker-identity mapping is already heavy-ish), or do it in the frontend.
- we need all those individual lists anyway, since we want an option to plot them separately on the per-test pages
- also, in every place we construct a Report or a Metadata loaded from the database, we need to handle 'parsing errors' due to invalid json, or missing/extra keys due to writes from an older or newer fuzzer version.
- gosh that's going to be annoying, ugh, at least it's not too many places...
- I think we just skip over those entries; I'm cautious about deleting stuff (imagine you update the fuzzer, have to roll back, and in the meantime we dropped all your metadata...) - we can do something sensible later.
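
A minimal sketch of the proposal above, under stated assumptions: the `Report` fields here (`timestamp`, `elapsed_time`, `phase`) are a simplified, hypothetical subset, `parse_report` and `linearize` are illustrative names, and each worker's `elapsed_time` is assumed to start from zero. "Not-at-the-start replay entries" is read as "replay reports that appear after any non-replay report in the merged history"; dropped entries contribute no time. This is not the implementation in the linked branch.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Report:
    # Hypothetical, simplified fields; the real dataclass stores more.
    timestamp: float     # wall-clock time at which the report was written
    elapsed_time: float  # cumulative fuzzing time within one worker
    phase: str           # e.g. "replay" or "generate"


def parse_report(raw: str) -> Optional[Report]:
    """Return a Report, or None for entries we cannot read.

    Invalid JSON and missing/extra keys are skipped, never deleted.
    """
    try:
        return Report(**json.loads(raw))
    except (json.JSONDecodeError, TypeError):
        return None


def linearize(by_worker: dict[str, list[Report]]) -> list[tuple[Report, float]]:
    """Merge worker_uuid -> reports (each sorted by timestamp) for one nodeid.

    Returns (report, cumulative elapsed time) pairs in timestamp order.
    """
    entries: list[tuple[Report, float]] = []
    for reports in by_worker.values():
        previous = 0.0
        for report in reports:
            # Diff against the previous report from the same worker.
            entries.append((report, report.elapsed_time - previous))
            previous = report.elapsed_time
    entries.sort(key=lambda entry: entry[0].timestamp)

    linearized: list[tuple[Report, float]] = []
    total = 0.0
    seen_non_replay = False
    for report, diff in entries:
        if report.phase != "replay":
            seen_non_replay = True
        elif seen_non_replay:
            continue  # replay entry that is not at the start: drop it
        total += diff
        linearized.append((report, total))
    return linearized
```

Keeping the per-worker lists as the source of truth means the same data can feed both the linearized view and separate per-worker plots; only `linearize` needs to know how runs are stitched together.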

proper classes for database, add more keys

7e28992

Liam-DeVoe commented Apr 16, 2025

Comment thread src/hypofuzz/hypofuzz.py Outdated

Liam-DeVoe added 5 commits April 17, 2025 20:09

drop python_version_full

acc1757

add git_hash to WorkerIdentity

509f52f

store phase instead of computing it dynamically

8d5aa30

rename worker_uuid to uuid

3f771e9

add report linearization in backend and frontend

bc1dce3

Liam-DeVoe changed the title ~~Add proper database classes and store more data~~ Add report linearization Apr 21, 2025

Liam-DeVoe added 2 commits April 21, 2025 11:41

fix tests

a8f019f

remove "note" compat, reword comments

13319bf

Zac-HD reviewed Apr 21, 2025

Liam-DeVoe added 2 commits April 21, 2025 18:05

git_hash is relative to worker test

675810d

add release notes

909c6d9

Zac-HD approved these changes Apr 21, 2025

Liam-DeVoe merged commit 48bbccd into Zac-HD:master Apr 21, 2025
13 checks passed

Liam-DeVoe deleted the db-dataclasses branch April 21, 2025 23:00

Zac-HD mentioned this pull request Apr 23, 2025

Database-centric architecture for communication, persistence, and autoscaling #3

Closed

Liam-DeVoe mentioned this pull request May 14, 2025

Improve reports strategy for tests #105

Open

Liam-DeVoe merged 10 commits into Zac-HD:master from Liam-DeVoe:db-dataclasses
