Zq/fix transpose #180

chonpsk · 2025-07-29T06:31:18Z

No description provided.

gemini-code-assist

Summary of Changes

Hello @chonpsk, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces significant improvements to the ZK-CUDA compiler's core functionality, particularly in how it handles tensor shape transformations and memory management. It also adds a new proving system optimized for resource allocation and refactors the testing infrastructure for better integration and validation. The changes aim to enhance the robustness, efficiency, and usability of the ZK-CUDA proving pipeline.

Highlights

Shape Manipulation Improvements: The core logic for handling and transposing tensor shapes within the ZK-CUDA compiler has been refined. This includes updates to Entry::transpose_shape, Entry::minimize, and prefix_products_to_shape functions, which are crucial for correct data transformations in cryptographic circuits.
API Signature Change for Device Memories: The prove method across various proving systems (e.g., dummy, expander, parallelized, pcs_defered) now takes ownership of device_memories (Vec<Vec<SIMDField<C>>>) instead of a reference (&[Vec<SIMDField<C>>]). This is a significant change impacting memory management and ownership patterns within the proving pipeline.
New Proving System: ExpanderNoOverSubscribe: A new proving system, ExpanderNoOverSubscribe, has been introduced. This system is designed to manage parallelization without oversubscribing CPU resources, potentially offering optimized performance or stability for specific use cases. It comes with its own API, profiling capabilities, and server-side implementation.
Dependency Updates and Additions: Numerous dependencies have been updated to newer versions, and new crates like stacker and psm have been added. These additions likely support the new memory management and parallelization strategies introduced with the ExpanderNoOverSubscribe system.
Enhanced ZK-CUDA Integration Testing: A comprehensive suite of new binaries (zkcuda_setup, zkcuda_prove, zkcuda_verify, zkcuda_cleanup) and an accompanying run.sh script have been added. These provide a more structured and automated workflow for setting up, proving, and verifying ZK-CUDA circuits, improving the overall testing and development experience.
Unified KZG Configuration: The KZG Polynomial Commitment Scheme configuration has been standardized to BN254ConfigSha2UniKZG across various parts of the codebase, replacing older BN254ConfigSha2KZG references.
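The ownership change described in the highlights can be sketched as follows. This is a minimal illustration, not the repository's actual trait: `Field`, `prove_borrowed`, and `prove_owned` are hypothetical names, and `u64` stands in for `SIMDField<C>`.

```rust
/// Hypothetical stand-in for `SIMDField<C>`.
type Field = u64;

/// Old style: the prover only borrows the device memories,
/// so the caller keeps them alive for the whole call.
fn prove_borrowed(device_memories: &[Vec<Field>]) -> usize {
    device_memories.iter().map(Vec::len).sum()
}

/// New style: the prover takes ownership, so it may release
/// the buffers as soon as it no longer needs them.
fn prove_owned(device_memories: Vec<Vec<Field>>) -> usize {
    let total: usize = device_memories.iter().map(Vec::len).sum();
    // Owned data can be dropped early, before any later proving steps run.
    drop(device_memories);
    total
}
```

With the owned signature, a caller that still needs the memories after proving has to pass a clone, for example `prove_owned(mems.clone())`.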

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e. a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	`@gemini-code-assist`	Responds in comments when explicitly tagged, both in issue comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments or fill out our survey to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request introduces a new "no oversubscribe" mode and fixes an issue with transpose operations. The changes are extensive, touching many parts of the codebase to simplify generics, improve API design, and add new functionality. I've identified a few areas for improvement, including a potential correctness issue in a utility function, some opportunities to improve performance, and minor code style suggestions.

gemini-code-assist · 2025-07-29T06:35:30Z

expander_compiler/src/zkcuda/shape.rs

+    let mut shape = products.to_vec();
    for i in 1..products.len() {
-        shape.push(products[i] / products[i - 1]);
+        shape[i] /= products[i - 1];
    }
    shape
 }


The implementation of prefix_products_to_shape has changed. The new version returns a shape with a leading 1, which is different from the previous implementation and likely incorrect. For example, for products = [1, 2, 6, 24], it will now return [1, 2, 3, 4] instead of the expected [2, 3, 4]. This could lead to incorrect shape calculations. If the intention is to revert to the previous correct behavior, you can use the old implementation:

    let mut shape = Vec::with_capacity(products.len() - 1);
    for i in 1..products.len() {
        shape.push(products[i] / products[i - 1]);
    }
    shape

Suggested change

-    let mut shape = products.to_vec();
-    for i in 1..products.len() {
-        shape[i] /= products[i - 1];
-    }
-    shape
- }
+    let mut shape = Vec::with_capacity(products.len() - 1);
+    for i in 1..products.len() {
+        shape.push(products[i] / products[i - 1]);
+    }
+    shape
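To make the difference between the two versions concrete, here is a small self-contained sketch of both. The function names are hypothetical (the repository has a single `prefix_products_to_shape`), and `usize` is assumed as the element type. Which result is correct depends on whether callers pass prefix products with or without a leading 1, which this excerpt does not show.

```rust
/// Earlier behaviour: one output entry per adjacent pair, so the
/// result is one element shorter than `products`.
fn shape_by_push(products: &[usize]) -> Vec<usize> {
    // saturating_sub avoids an underflow panic on an empty slice.
    let mut shape = Vec::with_capacity(products.len().saturating_sub(1));
    for i in 1..products.len() {
        shape.push(products[i] / products[i - 1]);
    }
    shape
}

/// Behaviour in this diff: copy `products`, then divide each entry
/// (except the first) by its predecessor, keeping the same length.
fn shape_in_place(products: &[usize]) -> Vec<usize> {
    let mut shape = products.to_vec();
    for i in 1..products.len() {
        shape[i] /= products[i - 1];
    }
    shape
}
```

For `[1, 2, 6, 24]` the first version yields `[2, 3, 4]` and the second `[1, 2, 3, 4]`. For `[2, 6, 24]` (no leading 1) the second version yields `[2, 3, 4]`, so the in-place form is consistent if the input convention changed along with it.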

gemini-code-assist · 2025-07-29T06:35:30Z

expander_compiler/src/zkcuda/proving_system/expander_pcs_defered/prove_impl.rs

+    let (mut commitment, state) =
+        local_commit_impl::<C, ECCConfig>(prover_setup.p_keys.get(&len_to_commit).unwrap(), vals);

    commitment.vals_len = actual_len; // Store the actual length in the commitment
    (commitment, state)
 }


The new max_len_setup_commit_impl passes vals directly to local_commit_impl without padding. If the underlying PCS commit function requires the input polynomial to have a length that matches the setup key's length, this could lead to incorrect commitments or panics. Please verify that the PCS implementation correctly handles committing to a polynomial that is shorter than what the proving key was generated for. The previous implementation with padding seemed safer in this regard.
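For reference, padding before commit usually means something like the sketch below. The previous implementation is not shown in this excerpt, so `pad_to_len` is a hypothetical helper and the zero value is an assumption. Note that the commit list for this pull request includes "no need to padding for unikzg", which suggests the removal may be deliberate.

```rust
/// Hypothetical helper: copy `vals` and extend the copy with `zero`
/// until it has exactly `target_len` elements.
fn pad_to_len<T: Clone>(vals: &[T], target_len: usize, zero: T) -> Vec<T> {
    assert!(
        vals.len() <= target_len,
        "input is longer than the setup length"
    );
    let mut padded = Vec::with_capacity(target_len);
    padded.extend_from_slice(vals);
    // resize appends clones of `zero` up to the requested length.
    padded.resize(target_len, zero);
    padded
}
```

Whether `local_commit_impl` needs such padding depends on the PCS backend, which is the question the comment raises.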

gemini-code-assist · 2025-07-29T06:35:31Z

Cargo.toml

+arith = { git = "https://github.com/PolyhedraZK/Expander", branch = "zf/optimize_pcs_claim_merging" }
+mpi_config = { git = "https://github.com/PolyhedraZK/Expander", branch = "zf/optimize_pcs_claim_merging" }
+gkr_field_config = { git = "https://github.com/PolyhedraZK/Expander", branch = "zf/optimize_pcs_claim_merging" }
+babybear = { git = "https://github.com/PolyhedraZK/Expander", branch = "zf/optimize_pcs_claim_merging" }
+crosslayer_prototype = { git = "https://github.com/PolyhedraZK/Expander", branch = "zf/optimize_pcs_claim_merging" }
+expander_circuit = { git = "https://github.com/PolyhedraZK/Expander", branch = "zf/optimize_pcs_claim_merging", package = "circuit" }
+expander_transcript = { git = "https://github.com/PolyhedraZK/Expander", branch = "zf/optimize_pcs_claim_merging", package = "transcript" }
+expander_binary = { git = "https://github.com/PolyhedraZK/Expander", branch = "zf/optimize_pcs_claim_merging", package = "bin" }
+gkr = { git = "https://github.com/PolyhedraZK/Expander", branch = "zf/optimize_pcs_claim_merging" }
+gf2 = { git = "https://github.com/PolyhedraZK/Expander", branch = "zf/optimize_pcs_claim_merging" }
+mersenne31 = { git = "https://github.com/PolyhedraZK/Expander", branch = "zf/optimize_pcs_claim_merging" }
+goldilocks = { git = "https://github.com/PolyhedraZK/Expander", branch = "zf/optimize_pcs_claim_merging" }
+poly_commit = { git = "https://github.com/PolyhedraZK/Expander", branch = "zf/optimize_pcs_claim_merging", package = "poly_commit" }
+polynomials = { git = "https://github.com/PolyhedraZK/Expander", branch = "zf/optimize_pcs_claim_merging" }
+sumcheck = { git = "https://github.com/PolyhedraZK/Expander", branch = "zf/optimize_pcs_claim_merging" }
+serdes = { git = "https://github.com/PolyhedraZK/Expander", branch = "zf/optimize_pcs_claim_merging", package = "serdes" }
+gkr_engine = { git = "https://github.com/PolyhedraZK/Expander", branch = "zf/optimize_pcs_claim_merging" }
+gkr_hashers = { git = "https://github.com/PolyhedraZK/Expander", branch = "zf/optimize_pcs_claim_merging" }
+expander_utils = { git = "https://github.com/PolyhedraZK/Expander", branch = "zf/optimize_pcs_claim_merging", package = "utils" }


The dependencies from PolyhedraZK/Expander are pointing to the zf/optimize_pcs_claim_merging branch. While this is fine for development, it's generally better to merge the dependency changes into the main branch and reference that, or use a specific commit hash for reproducibility. Relying on a feature branch can make the build fragile if the branch is deleted or rebased. Consider changing the dependency to point to a specific commit hash or merging the feature branch first. For example:

arith = { git = "https://github.com/PolyhedraZK/Expander", rev = "300aa3d" }

gemini-code-assist · 2025-07-29T06:35:31Z

expander_compiler/src/zkcuda/proving_system/expander_parallelized/cmd_utils.rs

    let cmd_str = format!(
-        "mpiexec -n {max_parallel_count} {overscribe} {binary} --field-type {field_name} --poly-commit {pcs_name} --port-number {port_number}"
+        "mpiexec -n {max_parallel_count} {overscribe} {binary} --field-type {field_name} --poly-commit {pcs_name} --port-number {port_number} {batch_pcs_option}"
    );


The current method of constructing the command string can result in an extra space at the end if batch_pcs is false. While most shells handle this, it's safer to build the command string more robustly to avoid potential parsing issues.

    let cmd_str = format!(
        "mpiexec -n {max_parallel_count} {overscribe} {binary} --field-type {field_name} --poly-commit {pcs_name} --port-number {port_number}{}",
        if batch_pcs { " --batch-pcs" } else { "" }
    );
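An alternative that sidesteps whitespace handling entirely is to build the invocation as a list of arguments rather than one string. The sketch below is an assumption about how that could look: `build_mpiexec_args` is a hypothetical helper, and the parameter types are guesses based on the names in the format string.

```rust
/// Hypothetical helper: assemble the mpiexec arguments one by one,
/// so optional flags never leave empty tokens or stray spaces behind.
fn build_mpiexec_args(
    max_parallel_count: usize,
    oversubscribe: Option<&str>,
    binary: &str,
    field_name: &str,
    pcs_name: &str,
    port_number: u16,
    batch_pcs: bool,
) -> Vec<String> {
    let mut args = vec!["-n".to_string(), max_parallel_count.to_string()];
    // Only push the oversubscribe flag when the caller supplies one.
    if let Some(flag) = oversubscribe {
        args.push(flag.to_string());
    }
    args.push(binary.to_string());
    let fixed = ["--field-type", field_name, "--poly-commit", pcs_name, "--port-number"];
    args.extend(fixed.iter().map(|s| s.to_string()));
    args.push(port_number.to_string());
    if batch_pcs {
        args.push("--batch-pcs".to_string());
    }
    args
}
```

The resulting vector can be passed to `std::process::Command::new("mpiexec").args(&args)`, which hands each argument to the process directly without shell parsing.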

gemini-code-assist · 2025-07-29T06:35:31Z

expander_compiler/src/zkcuda/proving_system/expander_parallelized/api_parallel.rs

+            .iter()
+            .zip(computation_graph.proof_templates().iter())


The verification loop has been changed from using par_iter() to iter(), which makes the verification of proofs for different kernels sequential. If there are many kernels and their verification is computationally intensive, this could be a significant performance regression. Was this change intentional? If not, please consider reverting to par_iter() to leverage parallel execution.

Suggested change

-            .iter()
-            .zip(computation_graph.proof_templates().iter())
+            .par_iter()
+            .zip(computation_graph.proof_templates().par_iter())
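The trade-off between the two loops can be illustrated with the standard library alone. This sketch uses `std::thread::scope` as a stand-in for rayon's `par_iter()` (rayon uses a work-stealing pool rather than one thread per item), and `verify_one` with `u64` inputs is a hypothetical placeholder for the real per-kernel verification. The commit list contains "remove par verifier", so the sequential loop may well be intentional.

```rust
use std::thread;

/// Hypothetical placeholder for verifying one (proof, template) pair.
fn verify_one(proof: &u64, template: &u64) -> bool {
    proof == template
}

/// Sequential verification, as in the new code: pairs are checked one by one.
fn verify_sequential(proofs: &[u64], templates: &[u64]) -> bool {
    proofs
        .iter()
        .zip(templates.iter())
        .all(|(p, t)| verify_one(p, t))
}

/// Parallel verification: each pair is checked on its own scoped thread.
fn verify_parallel(proofs: &[u64], templates: &[u64]) -> bool {
    thread::scope(|s| {
        let handles: Vec<_> = proofs
            .iter()
            .zip(templates.iter())
            .map(|(p, t)| s.spawn(move || verify_one(p, t)))
            .collect();
        // Join every thread and combine the individual results.
        handles
            .into_iter()
            .map(|h| h.join().unwrap())
            .fold(true, |acc, ok| acc && ok)
    })
}
```

Both functions return the same result; the parallel form only pays off when each verification is expensive enough to outweigh the scheduling overhead.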

gemini-code-assist · 2025-07-29T06:35:31Z

expander_compiler/src/zkcuda/proving_system/expander_parallelized/api_parallel.rs

            .proofs
-            .par_iter()
-            .zip(computation_graph.proof_templates().par_iter())
+            .iter()
+            .zip(computation_graph.proof_templates().iter())


The verification loop was changed from using par_iter() to iter(). This makes the verification of proofs for different kernels sequential. This could be a performance regression. Was this change intentional? If not, please consider reverting to par_iter() to leverage parallel execution.

Suggested change

-            .proofs
-            .iter()
-            .zip(computation_graph.proof_templates().iter())
+            .par_iter()
+            .zip(computation_graph.proof_templates().par_iter())

zhiyong1997 added 30 commits June 25, 2025 20:06

add some timer

ae040c1

some more tests

aa92f5e

timer in verifier

ed47ac3

switch to uni-kzg in plain parallelized expander

fc2e657

minor

46e8f54

switch to uni-kzg by default

7ef3472

clippy auto fix

710c6cf

fix test

5c23c69

switch to original kzg in non-pcs-batching

b56a743

wip

0bfc049

remove unnecessay config in circuit preprocess

b3aea1a

fine-grained config to enable flexibility

2734bb9

minor bug fix

bb8fdd5

clippy

cddcf1f

remove pcs field

6cad241

no overscribe version done

a9b755a

clippy auto fix

da3c1e1

add testing code

4ed35e1

bug fix mpi_config -> local_mpi_config

5e0a7ea

bug fix in server binary of no oversubscribe

4cc1d45

Merge branch 'master' into zf/zkcuda_optim

c07607b

fix inconsistency after merging

36442cb

clippy auto fix

3a8daa5

benchmark code

1233863

change config in benchmark code

e6d6c74

update: KZG -> UniKZG

6c595c7

add binary in ci

4056954

add zkcuda config

9bb5270

switch to defered PCS

e4c78a6

clippy fix

714e573

zhiyong1997 and others added 28 commits July 15, 2025 21:19

clippy auto fix

9aab4d3

printing the size of computation graph/witness/proof

3e72945

update dependency

f838643

fix a bug in server cli

3cd0ba8

reorganize tests

39b112e

zkcuda integration

9c529fe

testing script

1323f88

Merge branch 'zf/zkcuda_optim' of github.com:PolyhedraZK/ExpanderComp…

c05f82e

…ilerCollection into zf/zkcuda_optim

simple renaming

61eaece

client drop witness after serialization

1a4e7a9

bug fix

ccb3988

bug fix

59370d5

a simple profiler to detect the number of bytes used to store bn254 f…

fe9b62f

…ields

bug fix

e28e367

make compiler happy about unused import

fdda64b

minor

7a1abc7

including intermediate evaluations

58770c3

clearer profiler

ea30303

clippy

f07ab90

bug fix

f8df18b

remove par verifier

fcd1ede

clippy auto fix

3a44693

no need to padding for unikzg

372d39b

clippy

9d3eb94

update

5e2b6f3

try to fix undo_transpose_shape_products

d37fd53

try to fix

5310426

clear

aa80f74

gemini-code-assist bot reviewed Jul 29, 2025

3 participants
