Add `midenc-fuzza` differential fuzzing harness by greenhat · Pull Request #1087 · 0xMiden/compiler

greenhat · 2026-04-24T11:03:53Z

It's a PoC of a test suite in which an agent generates Rust programs for differential testing, using the compiler's code coverage as feedback.

bitwalker · 2026-04-24T19:07:10Z

+//! plus any helpers it needs. The harness prepends a fixed header
+//! (`#![no_std]` + `#[panic_handler]`) before writing the case as `src/lib.rs`
+//! of a generated cargo project, builds it twice — natively as a host `cdylib`
+//! and via `cargo-miden` to a MASM package — and compares outputs across


Isn't this redundant with our proptest-based tests (which have the added benefit of shrinking to find minimal repros)? Or is it meant more as a proof-of-concept, and the intended use would be for programs where proptest isn't as well suited vs a traditional fuzzer?

greenhat · 2026-04-27

It's a PoC of using an agent to generate Rust programs, with the compiler's code coverage as feedback. I forgot to add a PR description.

bitwalker · 2026-04-24T19:12:35Z

+    let entry: libloading::Symbol<EntryFn> = unsafe { lib.get(b"entrypoint\0") }
+        .unwrap_or_else(|e| panic!("missing `entrypoint` in {}: {e}", dylib_path.display()));
+
+    // Proptest: 16 cases, shrinking disabled — the whole case file IS the


I'm not sure that's actually true? The primary purpose of shrinking is to find the simplest input which triggers an issue, not to find the smallest program that reproduces the issue. As far as I can tell here, the case file is just the program, not the minimal inputs.

Granted, if your only inputs to a program are just plain u32 integer values, shrinking provides less value - but it can still reduce the failing input to something like (u32::MAX, 0, 0), instead of a random set of values like (u32::MAX, 123439, 8631234), which can often highlight what input is the problem (or at least remove some randomness from the inputs that obscure things unnecessarily).
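To illustrate the shrinking point above, here is a minimal sketch of a greedy shrinker. It is not proptest's actual algorithm, and both `first_is_max` and `shrink_to_zero` are hypothetical names introduced for this example. The idea is only that components irrelevant to the failure collapse to zero while the triggering component survives.

```rust
/// Hypothetical failure predicate: the "bug" fires only when the first
/// component is `u32::MAX`, regardless of the other two.
fn first_is_max(input: [u32; 3]) -> bool {
    input[0] == u32::MAX
}

/// Greedy shrink: try to zero each component in turn and keep the change
/// only if the input still fails. Real shrinkers are more elaborate; this
/// only illustrates the idea.
fn shrink_to_zero(mut input: [u32; 3], fails: impl Fn([u32; 3]) -> bool) -> [u32; 3] {
    for i in 0..input.len() {
        let mut candidate = input;
        candidate[i] = 0;
        if fails(candidate) {
            // The simpler input still reproduces the failure, so keep it.
            input = candidate;
        }
    }
    input
}
```

With this predicate, `shrink_to_zero([u32::MAX, 123439, 8631234], first_is_max)` reduces the input to `[u32::MAX, 0, 0]`, which makes it obvious that only the first value matters.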

greenhat · 2026-04-27

The comment is a bit off. My reasons are twofold:

- Shrinking generates a lot of noise that muddies the feedback for the agent.
- I want to capture the exact inputs that triggered the miscompilation. Shrunk inputs might take a different code path (and possibly trigger a different miscompilation).

bitwalker · 2026-04-24T19:14:11Z

+panic = "abort"
+
+[profile.dev]
+panic = "abort"


If you compile a cdylib with panic = "abort" and then dynamically-load it into the current process, and execution hits a panic, you'll crash the whole process without having a way to catch the panic (because it will abort, rather than unwind as expected).
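As a rough illustration of what is lost with `panic = "abort"`, the sketch below shows how a caller can turn a panic into an error value when unwinding is available. The helper name `call_guarded` is hypothetical, and the example stays in-process: it does not cover unwinding across the dlopen/FFI boundary, which has additional ABI requirements.

```rust
use std::panic;

/// Runs `f` and converts a panic into `Err` instead of letting it propagate.
/// This only works when the panicking code is built with `panic = "unwind"`;
/// with `panic = "abort"` the process terminates and this function never
/// gets a chance to return.
fn call_guarded<F>(f: F) -> Result<u32, String>
where
    F: FnOnce() -> u32 + panic::UnwindSafe,
{
    panic::catch_unwind(f).map_err(|payload| {
        // Panic payloads are usually `&str` or `String`.
        if let Some(s) = payload.downcast_ref::<&str>() {
            s.to_string()
        } else if let Some(s) = payload.downcast_ref::<String>() {
            s.clone()
        } else {
            "non-string panic payload".to_string()
        }
    })
}
```

A harness that wants to report a panicking case as a test failure, rather than lose the whole test process, needs some mechanism along these lines.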

greenhat · 2026-04-27

Thank you! I hadn't thought about that.

greenhat · 2026-04-30T11:02:08Z

I'm pretty happy with how it turned out. During the tuning runs, it discovered #1093, #1094 and #1095.
Instructions for launching an agent are in `tests/fuzza/README.md`.

bitwalker

Looks good! My only request is that we move this under tools (I want tests to contain only actual test sources, not tooling used by the test suite). We still have a pending task to clean up the organization of tests in general, but I mostly want to avoid adding new stuff under tests unless that is the only place that makes sense for it.

I'm marking this approved contingent on that move, so I don't hold things up while I'm offline

bitwalker

As an aside - can we have the test cases emitted to the tests directory under some appropriate test crate? Those don't feel like they belong under the midenc-fuzza tool itself.

Ideally, midenc-fuzza can be used to generate the cases, and then they would be executed by our standard test suite, without the need to use midenc-fuzza for that. In other words, midenc-fuzza solves the problem of identifying interesting test cases, generating them, and validating whether they capture a real regression or improve test coverage; those generated test cases then get merged into our standard test suite.

AIUI, that totally fits within the intended usage of this new tool, but let me know if I'm missing anything

greenhat · 2026-05-01T10:50:52Z

Good point! I'll split it.

Adds a new `tests/fuzza` crate that compiles each case's Rust source twice, natively as a host cdylib (dlopened via `libloading`) and via `cargo-miden` to a MASM package, and compares `entrypoint(u32, u32) -> u32` outputs over 16 random input pairs per case. Seed cases: `add`, `sub`, `xor` (green); `muladd` is `#[ignore]`d because the harness surfaced a real native/MASM divergence on `wrapping_mul` that needs a separate investigation.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
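The comparison step described in this commit can be sketched roughly as follows. This is a simplified stand-in, not the harness code: the real harness loads the native build via `libloading`, runs the MASM package, and uses proptest for inputs, whereas this sketch takes two plain closures and a small hand-rolled xorshift generator. `XorShift32` and `first_divergence` are hypothetical names.

```rust
/// Tiny xorshift32 PRNG so the sketch needs no external crates.
struct XorShift32(u32);

impl XorShift32 {
    fn next(&mut self) -> u32 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 17;
        x ^= x << 5;
        self.0 = x;
        x
    }
}

/// Compares two implementations of `entrypoint(u32, u32) -> u32` on `cases`
/// pseudo-random input pairs and returns the first diverging pair, if any.
fn first_divergence(
    reference: impl Fn(u32, u32) -> u32,
    candidate: impl Fn(u32, u32) -> u32,
    seed: u32,
    cases: usize,
) -> Option<(u32, u32)> {
    let mut rng = XorShift32(seed.max(1)); // xorshift state must be non-zero
    (0..cases)
        .map(|_| (rng.next(), rng.next()))
        .find(|&(a, b)| reference(a, b) != candidate(a, b))
}
```

Returning the diverging pair rather than a boolean keeps the exact triggering inputs available for the failure report.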

Each case used to repeat `#![no_std]` plus a `#[panic_handler]` and `#[alloc_error_handler]`. The harness now prepends a fixed header with `#![no_std]` + panic handler before writing the case as `src/lib.rs` of the generated cargo project, so a case contains only the `entrypoint` function (and any helpers).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
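A minimal sketch of the header-prepending step, assuming a header shaped like the one described. The exact header text used by the harness is not shown in this thread, so `CASE_HEADER` and `render_lib_rs` are illustrative only.

```rust
/// Hypothetical fixed header; the real harness's header text may differ.
const CASE_HEADER: &str = "\
#![no_std]

#[panic_handler]
fn panic(_info: &core::panic::PanicInfo) -> ! {
    loop {}
}
";

/// Builds the contents of the generated project's `src/lib.rs` by prepending
/// the fixed header to the case body (the `entrypoint` function and helpers).
fn render_lib_rs(case_body: &str) -> String {
    let mut out = String::with_capacity(CASE_HEADER.len() + case_body.len() + 1);
    out.push_str(CASE_HEADER);
    out.push('\n');
    out.push_str(case_body);
    out
}
```

Keeping the header in one place means a generated case cannot drift from the boilerplate the harness expects.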

Adds `cargo make fuzza-cov`, which runs the midenc-fuzza tests under cargo-llvm-cov, writes a raw JSON report and an HTML browser view under `target/fuzza-coverage/`, and reduces the JSON with `tests/fuzza/cov.py` into a fuzza-oriented Markdown summary that highlights the compiler functions most worth growing coverage on (filtered to compiler crates, names demangled via `rustfilt`, boring trait impls dropped). Also adds `fuzza-cov-clean` to reset the accumulated profile data.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

- `fuzza-cov-step` task: re-runs the fuzza tests under cargo-llvm-cov with `--no-clean` so the profile data accumulates across iterations, rotates the previous report JSON into `report.prev.json`, and tolerates failing cases so a divergence or compile error still produces a report.
- `cov.py --prev <json>`: adds a "Delta since previous run" section to the Markdown summary, listing newly-exercised functions and functions that gained regions. This is the feedback signal the case-generation agent reads each iteration.
- `fuzza-cov-clean`: wipes both profile data and `target/fuzza-coverage/`.
- `tests/fuzza/AGENT-PROMPT.md`: copy-paste prompt template for launching a coverage-guided case-generation agent (targets a specific compiler area, stops on a plateau instead of an arbitrary % target).
- Adds a `branchy` seed case exercising if/else and u32 div/rem, which on its own lifts compiler coverage from ~20% to ~27% of regions.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
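The "Delta since previous run" computation can be sketched as below. The real logic lives in the Python script `cov.py` and operates on the cargo-llvm-cov JSON report; this version is an assumption-laden simplification that reduces each report to a map from function name to covered-region count. `Coverage`, `Delta`, and `coverage_delta` are hypothetical names.

```rust
use std::collections::BTreeMap;

/// Covered-region count per function name (a hypothetical simplification of
/// a coverage report).
type Coverage = BTreeMap<String, u64>;

#[derive(Debug, PartialEq)]
struct Delta {
    /// Functions with no covered regions before and at least one now.
    newly_exercised: Vec<String>,
    /// Already-exercised functions, with the number of regions gained.
    gained_regions: Vec<(String, u64)>,
}

fn coverage_delta(prev: &Coverage, curr: &Coverage) -> Delta {
    let mut delta = Delta { newly_exercised: Vec::new(), gained_regions: Vec::new() };
    for (name, &covered) in curr {
        // A function missing from the previous report counts as uncovered.
        let before = prev.get(name).copied().unwrap_or(0);
        if before == 0 && covered > 0 {
            delta.newly_exercised.push(name.clone());
        } else if covered > before {
            delta.gained_regions.push((name.clone(), covered - before));
        }
    }
    delta
}
```

An empty delta over several iterations is one possible way to detect the plateau on which the agent prompt tells the agent to stop.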

…g note to agent prompt

A coverage smoke test surfaced two issues:

- `build_host_cdylib` spawned `cargo build` for the case project without clearing `CARGO_TARGET_DIR`. Under `cargo llvm-cov` the parent process has it set to `target/llvm-cov-target/`, so the host artifact landed there instead of the per-project `target/release/` we look in. Now the spawn does `.env_remove("CARGO_TARGET_DIR")`.
- AGENT-PROMPT.md told the agent to "skim source at File:line" without warning about the wasm-frontend → HIR-op → emitter routing layer. The agent picked `OpEmitter::cast` for its size; Rust `as` casts route via HIR `trunc`/`zext`/`sext`, never reaching `cast`. Step 2 of the loop now spells this out and points the agent at `frontend/wasm/src/code_translator/` for chain sanity checks.

Adds the `widening` case the smoke test produced; it now passes under `fuzza-cov-step` (was `#[ignore]`d due to the harness bug above).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
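The `CARGO_TARGET_DIR` fix can be sketched with `std::process::Command`. The function names and the `--release` flag are assumptions made for this example (the commit only mentions `target/release/`); the command is constructed but not executed here.

```rust
use std::ffi::OsStr;
use std::path::Path;
use std::process::Command;

/// Builds (but does not run) the `cargo build` command for a generated case
/// project, with `CARGO_TARGET_DIR` removed from the child's environment so
/// artifacts land in the project's own `target/` directory.
fn host_build_command(project_dir: &Path) -> Command {
    let mut cmd = Command::new("cargo");
    cmd.arg("build")
        .arg("--release")
        .current_dir(project_dir)
        .env_remove("CARGO_TARGET_DIR");
    cmd
}

/// Returns true if the command explicitly removes `key` from the child's
/// environment.
fn removes_env(cmd: &Command, key: &str) -> bool {
    cmd.get_envs().any(|(k, v)| k == OsStr::new(key) && v.is_none())
}
```

`Command::get_envs` reports an explicitly removed variable as a `None` value, which makes the removal checkable in a unit test without spawning cargo.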

A second smoke run of the fuzza agent prompt produced `case_bitops.rs`, exercising u32 bitwise / shift / rotate / comparison emitter arms in `codegen/masm/src/emit/binary.rs`. It surfaces another likely native/MASM divergence (inputs (4146962468, 1369714330) trigger a MASM `eqz` assertion at cycle 92), so the test is `#[ignore]`d pending root-cause investigation; compile-side coverage from the case still counts and covered ~9 previously-untouched emitter functions in `binary.rs`.

Two minor refinements to AGENT-PROMPT.md from the run:

- Note that a failing case still contributes compile-side coverage so the agent doesn't think a divergence wastes the iteration.
- Note that integer literals in Rust source do not generally reach `_imm` emitter variants; those require HIR-level canonicalization, not raw user code.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The previous host-build artifact lookup walked `<project>/target/release/lib<name>.{so,dylib,dll}`, which broke under some CI conditions where the artifact cargo emitted didn't end up at that exact path (e.g. CARGO_TARGET_DIR redirection from `cargo make` that the env_remove didn't fully cover, or platform/cargo path quirks).

Switch `build_host_cdylib` to spawn `cargo build` with `--message-format=json-render-diagnostics`, parse the stream with `cargo_metadata::Message`, and pick up the cdylib artifact's filename directly from the build output. This is the canonical way and is robust to platform naming, target-dir overrides, and cargo internals.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
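Only the final selection step is sketched here, under the assumption that the artifact's reported filenames have already been extracted from cargo's JSON messages (the parsing with `cargo_metadata::Message` is left out). Selecting by file extension is an assumption of this example; the harness may instead select by target kind. `pick_dylib` is a hypothetical name.

```rust
use std::path::{Path, PathBuf};

/// Picks the dynamic library out of the filenames cargo reported for an
/// artifact, instead of guessing a path under `target/release/`.
fn pick_dylib(filenames: &[PathBuf]) -> Option<&Path> {
    filenames.iter().map(PathBuf::as_path).find(|p| {
        matches!(
            p.extension().and_then(|e| e.to_str()),
            Some("so" | "dylib" | "dll")
        )
    })
}
```

Because the path comes from cargo's own output, this keeps working when the target directory is redirected.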

…self

The README documents the harness, how to run it, how to add cases, and the coverage-guided workflow, including how to fill in and use AGENT-PROMPT.md. The prompt file now contains only the prompt, with `[area]` defined in one place at the top.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The fuzza harness was missing Rust cases that exercise structured loop, switch, and nested branch paths in the control-flow pipeline. Add five focused `no_std` cases and wire them into the fuzza test list. Four are enabled coverage cases; the unreachable-edge case is kept ignored because it exposes a native/MASM divergence for inputs (363814857, 995348134) while still documenting the compiler path it reaches.

Verified with `cargo make fuzza-cov-step` during iteration, then `cargo make test`, `cargo make clippy`, and `cargo make format-rust`.

Move the differential fuzzing harness and its 12 cases from the standalone `midenc-fuzza` crate into `tests/integration/src/end_to_end/differential/`, where they sit alongside the other Rust→MASM end-to-end tests. The coverage-driven agent tooling (`AGENT-PROMPT.md`, `README.md`, `cov.py`) moves to `tools/fuzza-agent/` so the test code and the agent workflow no longer share a crate. The `cargo make fuzza-cov*` tasks keep their names but now invoke the integration suite filtered by `differential`.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

greenhat changed the title ~~Add midenc-fuzza differential fuzzing harness.~~ Add midenc-fuzza differential fuzzing harness Apr 24, 2026

bitwalker reviewed Apr 24, 2026

This was referenced Apr 30, 2026

native/MASM divergence in the wrapping mul and add #1093 (Open)

native/MASM divergence in unreachable_guard test #1094 (Open)

native/MASM divergence in bitops test #1095 (Open)

greenhat marked this pull request as ready for review April 30, 2026 10:59

greenhat requested review from bitwalker and mooori April 30, 2026 11:02

bitwalker approved these changes May 1, 2026

bitwalker reviewed May 1, 2026

greenhat marked this pull request as draft May 1, 2026 10:51

greenhat and others added 11 commits May 5, 2026 10:33

chore: clarify the reasons for disabled shrinking

87cc92b

fix: build after rebase

0ae06cc

greenhat force-pushed the mcfa-cc branch from 501a3b5 to 0ae06cc May 5, 2026 07:39

greenhat marked this pull request as ready for review May 5, 2026 09:03

greenhat merged commit 59630d5 into next May 5, 2026
15 checks passed

greenhat deleted the mcfa-cc branch May 5, 2026 10:40

greenhat merged 12 commits into `next` from `mcfa-cc`
