Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

u32-table row count grew a lot with 0.42-alpha 6 #312

Open
Sword-Smith opened this issue Aug 1, 2024 · 1 comment
Open

u32-table row count grew a lot with 0.42-alpha 6 #312

Sword-Smith opened this issue Aug 1, 2024 · 1 comment
Labels
🕵 investigation This design change might improve the VM 🟢 prio: low Not at all urgent ⏪ slowdown Makes stuff go slower.

Comments

@Sword-Smith
Copy link
Collaborator

Sword-Smith commented Aug 1, 2024

The u32 table row count has grown a lot with the latest release, v0.42-alpha6. The new u32 check in merkle_step seems to be the culprit.

Here's the diff for the benchmark result of the tasm-lib snippet tasmlib_hashing_merkle_verify. Notice that the u32-table row count is increased by a factor of 10 in the "worst-case" benchmark where a leaf in a Merkle tree of height 20 is verified. It seems to me that we're paying full cost for each u32 check in merkle_step, every time this instruction is executed. I though, because of the look-up nature of the u32 table, that only the 1st iteration of a merkle_step recurse_or_return loop would pay a price in terms of u32-table rows.

Also in tasm-lib's

     "benchmark_result": {
       "clock_cycle_count": 43,
       "hash_table_height": 72,
-      "u32_table_height": 12,
+      "u32_table_height": 45,
       "op_stack_table_height": 26,
       "ram_table_height": 0
     },
    "case": "WorstCase"
@@ -15,7 +15,7 @@
     "benchmark_result": {
       "clock_cycle_count": 71,
       "hash_table_height": 156,
-      "u32_table_height": 28,
+      "u32_table_height": 278,
       "op_stack_table_height": 26,
       "ram_table_height": 0
     },
     "case": "WorstCase"

The u32-table row count is now so high that it's not too far off being the bottleneck in the TASM implementation of the verifier. We have:

Processor Opstack RAM Hash u32
EF4, alpha-5 299.477 246.104 282.946 236.275 76.396
EF4, alpha-6 285.185 235.126 281.599 234.025 211.091

So the u32-table row count almost tripled with alpha-6, because of the u32 check in each iteration of merkle_step. Still not the bottleneck, but I don't think this performance was what we expected when adding this check.

Sword-Smith added a commit to TritonVM/tasm-lib that referenced this issue Aug 1, 2024
This upstream upgrade results in a huge (3x) increase in the u32 table
row count in the STARK-verifier. See
TritonVM/triton-vm#312.
@jan-ferdinand
Copy link
Member

The underlying issue is an oversight of mine when adding a range check for the node index to instruction merkle_step. Each execution of merkle_step adds an (or updates the existing) entry to the U32 Table. Crucially, walking two different paths even for the same Merkle tree generates a new U32 Table entry for every distinct node in the union of both paths.

A potential way to address this is to allow U32 Table lookups of intermediate results. Crucially, not all instructions can get this new power, as they might have invalid intermediate results (like lt).

@jan-ferdinand jan-ferdinand added ⏪ slowdown Makes stuff go slower. 🕵 investigation This design change might improve the VM 🟢 prio: low Not at all urgent labels Aug 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🕵 investigation This design change might improve the VM 🟢 prio: low Not at all urgent ⏪ slowdown Makes stuff go slower.
Projects
None yet
Development

No branches or pull requests

2 participants