fix OOMs in Merkle/SMT deserialization#820

Merged
bobbinth merged 6 commits into next from huitseeker/fix/fuzz-deser-alloc
Feb 14, 2026

@huitseeker huitseeker commented Feb 7, 2026

This fixes OOMs caused by untrusted length prefixes, found through fuzzing.

The scheduled fuzz jobs miden-crypto (merkle) and miden-crypto (smt_serde) fail with allocation-size-too-big / OOM when random inputs encode absurd length prefixes. Both deserializers preallocate a Vec based on the prefix (Vec::with_capacity), which can request terabytes and abort. After those fixes, fuzzing exposes a pre-existing panic in PartialMerkleTree::with_leaves on empty input (unwrap() on None).
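To make the failure mode concrete, here is a minimal sketch of the two patterns using only the standard library. The wire format (a little-endian u64 count followed by 32-byte words) and the function names are assumptions made for illustration; this is not the miden-crypto code, and `read_many_iter` itself is not reproduced here.

```rust
use std::io::{self, Read};

/// Reads a little-endian u64 length prefix (hypothetical wire format).
fn read_len(r: &mut impl Read) -> io::Result<usize> {
    let mut buf = [0u8; 8];
    r.read_exact(&mut buf)?;
    let len = u64::from_le_bytes(buf);
    if len > usize::MAX as u64 {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "length does not fit in usize"));
    }
    Ok(len as usize)
}

/// Vulnerable pattern: trusts the prefix and preallocates before reading any element.
/// A handful of input bytes can request an enormous allocation and abort the process.
fn read_words_prealloc(r: &mut impl Read) -> io::Result<Vec<[u8; 32]>> {
    let len = read_len(r)?;
    let mut out = Vec::with_capacity(len); // sized by attacker-controlled data
    for _ in 0..len {
        let mut word = [0u8; 32];
        r.read_exact(&mut word)?;
        out.push(word);
    }
    Ok(out)
}

/// Safer pattern: the vector grows only as elements are actually read,
/// so a bogus prefix fails with an EOF error instead of a huge allocation.
fn read_words_lazy(r: &mut impl Read) -> io::Result<Vec<[u8; 32]>> {
    let len = read_len(r)?;
    let mut out = Vec::new();
    for _ in 0..len {
        let mut word = [0u8; 32];
        r.read_exact(&mut word)?;
        out.push(word);
    }
    Ok(out)
}
```

With the lazy variant, an input consisting only of a maximal length prefix is rejected on the first element read, because the reader runs out of bytes.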

This is technically breaking since we now return an empty tree on depth 0 (instead of panicking).

The fix
  • prevents allocation attacks in Merkle/SMT deserializers by using read_many_iter and accurate
    min_serialized_size() bounds
  • fixes PartialMerkleTree::with_leaves panic on empty input (returns an empty tree instead)
  • adds budgeted-deserialization coverage for empty SMT/PMT/leaf and oversized PMT length
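
As an illustration of the empty-input panic, one common way an `unwrap()` on `None` arises is taking the maximum over an empty collection. The sketch below is hypothetical: the `(depth, value)` entry shape and both function names are assumptions, and it does not reproduce the actual `with_leaves` body.

```rust
/// Hypothetical stand-in for deriving a tree's depth from its leaves.
/// Entries are (depth, value) pairs; this is not the real `with_leaves` code.
fn max_depth_panicking(leaves: &[(u8, u64)]) -> u8 {
    // `max()` returns `None` for an empty iterator, so this unwrap panics on empty input.
    leaves.iter().map(|&(depth, _)| depth).max().unwrap()
}

/// Treats "no leaves" as a valid empty tree of depth 0 instead of panicking.
fn max_depth_total(leaves: &[(u8, u64)]) -> u8 {
    leaves.iter().map(|&(depth, _)| depth).max().unwrap_or(0)
}
```

Returning depth 0 for empty input matches the behavior change described above (an empty tree instead of a panic).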

How to verify

  • Run the fuzz targets that failed in CI:
    • cargo +nightly fuzz run merkle -- -max_total_time=60 -runs=10000
    • cargo +nightly fuzz run smt_serde -- -max_total_time=60 -runs=10000
  • Run targeted tests:
    • cargo test -p miden-crypto merkle::partial_mt
    • cargo test -p miden-crypto merkle::smt
    • cargo test -p miden-crypto merkle::smt::full::tests::test_empty_smt_deserialization_with_budget

@huitseeker huitseeker marked this pull request as ready for review February 7, 2026 13:39

@krushimir krushimir left a comment

There's an issue regarding deserialization of PartialMerkleTree due to NodeIndex alignment padding.

The (NodeIndex, Word) tuple doesn't override min_serialized_size, so it falls back to the in-memory size, which includes alignment padding.

For (NodeIndex, Word):

  • in-memory: 48 bytes (NodeIndex has 7 bytes of padding between its u8 and u64 fields)
  • serialized: 41 bytes (9 + 32, no padding)

The budget check divides remaining bytes by 48 instead of 41, rejecting valid input. This test fails on this branch:

#[test]
fn deserialize_nonempty_with_budget() {
    let mt = MerkleTree::new(VALUES8).unwrap();
    let ms = MerkleStore::from(&mt);
    let path33 = ms.get_path(mt.root(), NODE33).unwrap();
    let pmt = PartialMerkleTree::with_paths([(3, path33.value, path33.path)]).unwrap();
    let bytes = pmt.to_bytes();
    let parsed = PartialMerkleTree::read_from_bytes_with_budget(&bytes, bytes.len()).unwrap();
    assert_eq!(pmt, parsed);
}

InvalidValue("requested 4 elements but reader can provide at most 3")
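
The arithmetic behind that error can be reproduced with stand-in types. This is a sketch under assumptions: `Index` and `Entry` are hypothetical `#[repr(C)]` stand-ins for `NodeIndex` and `(NodeIndex, Word)`, the sizes hold on typical 64-bit targets where `u64` is 8-byte aligned, and the figure of four 41-byte entries is inferred from the error message rather than stated in the thread.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for `NodeIndex`: a u8 depth followed by a u64 value.
#[repr(C)]
struct Index {
    depth: u8,
    value: u64,
}

/// Hypothetical stand-in for a `(NodeIndex, Word)` entry with a 32-byte word.
#[repr(C)]
struct Entry {
    index: Index,
    word: [u64; 4],
}

/// Serialized size of one entry: 1 (depth) + 8 (value) + 32 (word), no padding.
const SERIALIZED_ENTRY_SIZE: usize = 1 + 8 + 32;

/// In-memory size of one entry, including the 7 padding bytes after `depth`.
fn in_memory_entry_size() -> usize {
    size_of::<Entry>()
}

/// Upper bound on how many elements of `element_size` bytes fit in `remaining` bytes.
fn max_elements(remaining: usize, element_size: usize) -> usize {
    remaining / element_size
}
```

Assuming four serialized entries (164 bytes) remain, dividing by the 48-byte in-memory size yields 3, while dividing by the 41-byte serialized size yields 4, which is consistent with "requested 4 elements but reader can provide at most 3".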

@huitseeker

@krushimir Fixed in #827

@huitseeker huitseeker requested a review from krushimir February 12, 2026 09:15

@krushimir krushimir left a comment

Verified the fix locally (cherry-picked from next) – works as expected. LGTM!

Replace Vec::with_capacity preallocations with read_many_iter in custom
Deserializable implementations to prevent OOM/capacity overflow attacks:

- PartialMerkleTree::read_from: use read_many_iter for (NodeIndex, Word)
- Smt::read_from: use read_many_iter for (Word, Word) entries
- SmtLeaf::read_from: use read_many_iter for (Word, Word) entries

Add min_serialized_size() overrides for accurate budget checking:
- NodeIndex: 9 bytes (u8 + u64)
- Word: 32 bytes (SERIALIZED_SIZE)
- PartialMerkleTree: 49 bytes (8 + 9 + 32)
- Smt: 65 bytes (1 + 32 + 32)
- SmtLeaf: 73 bytes (1 + 8 + 32 + 32)

This enables BudgetedReader to enforce tight bounds on allocation sizes
before any memory is allocated, preventing malicious inputs from claiming
billions of elements while providing only a few bytes of data.

Fixes fuzz failures:
- smt_serde: no longer OOMs on 6-byte malicious input
- merkle: allocation attacks prevented (separate panic in with_leaves
  on empty input is a pre-existing bug now exposed)
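
The bound composition described in the commit message above can be sketched as follows. The `MinSize` trait, the unit structs, and `check_claimed_len` are hypothetical names introduced for illustration; the real trait and `BudgetedReader` signatures in miden-crypto may differ. The error text is modeled on the message quoted earlier in this thread.

```rust
/// Hypothetical trait: a lower bound on the number of bytes one encoded value occupies.
trait MinSize {
    fn min_serialized_size() -> usize;
}

/// Hypothetical stand-ins for `NodeIndex` and `Word`.
struct Index;
struct Word;

impl MinSize for Index {
    fn min_serialized_size() -> usize {
        1 + 8 // u8 depth + u64 value
    }
}

impl MinSize for Word {
    fn min_serialized_size() -> usize {
        32
    }
}

/// A tuple's bound is the sum of its parts, with no alignment padding.
impl<A: MinSize, B: MinSize> MinSize for (A, B) {
    fn min_serialized_size() -> usize {
        A::min_serialized_size() + B::min_serialized_size()
    }
}

/// Rejects a claimed element count that the remaining input cannot possibly hold,
/// before any allocation takes place.
fn check_claimed_len<T: MinSize>(claimed: usize, remaining: usize) -> Result<usize, String> {
    // Guard against a zero-sized bound to avoid dividing by zero.
    let max = remaining / T::min_serialized_size().max(1);
    if claimed > max {
        Err(format!(
            "requested {} elements but reader can provide at most {}",
            claimed, max
        ))
    } else {
        Ok(claimed)
    }
}
```

Under these assumptions, a 6-byte input that claims a billion `(Word, Word)` entries is rejected immediately, since 6 bytes cannot hold even one 64-byte entry.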
@huitseeker huitseeker force-pushed the huitseeker/fix/fuzz-deser-alloc branch from 30ef405 to 5869c1b Compare February 12, 2026 12:55

@bobbinth bobbinth left a comment

Looks good! Thank you!

@bobbinth bobbinth merged commit 080b206 into next Feb 14, 2026
24 checks passed
@bobbinth bobbinth deleted the huitseeker/fix/fuzz-deser-alloc branch February 14, 2026 22:55