Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Port cx_cz_lnn to rust #13867

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

gadial
Copy link
Contributor

@gadial gadial commented Feb 18, 2025

Summary

Ports the synth_cx_cz_depth_line_my method to Rust.

Closes #12231

Details and comments

This is synthesis pass for constructing circuits given by two matrices, representing CX and CZ circuits. The code generating the instruction list was ported to Rust in a straightforward manner; construction of the circuit itself is done in Python.

A quick benchmarking was performed by using random_invertible_binary_matrix(n, seed=seed) of the CX matrix and np.triu(np.random.randint(0, 2, (n, n)), k=1) for the CZ matrix, and timing the application of synth_cx_cz_depth_line_my on the result, showing consistent improvement for the rust version.

╒═════════════════════╤═════════════╤═════════════╤═══════════════════════╕
│ Test Case           │      python │        rust │   Ratio (python/rust) │
╞═════════════════════╪═════════════╪═════════════╪═══════════════════════╡
│ 5x5 (seed 42)       │  0.0032567  │ 5.85556e-05 │               55.6173 │
├─────────────────────┼─────────────┼─────────────┼───────────────────────┤
│ 10x10 (seed 42)     │  0.00927253 │ 8.74519e-05 │              106.03   │
├─────────────────────┼─────────────┼─────────────┼───────────────────────┤
│ 20x20 (seed 55)     │  0.0447065  │ 0.000269413 │              165.94   │
├─────────────────────┼─────────────┼─────────────┼───────────────────────┤
│ 50x50 (seed 13)     │  0.472343   │ 0.00473766  │               99.6997 │
├─────────────────────┼─────────────┼─────────────┼───────────────────────┤
│ 60x60 (seed 55)     │  0.825494   │ 0.00885715  │               93.2008 │
├─────────────────────┼─────────────┼─────────────┼───────────────────────┤
│ 70x70 (seed 13)     │  1.19549    │ 0.0150931   │               79.2078 │
├─────────────────────┼─────────────┼─────────────┼───────────────────────┤
│ 100x100 (seed 1089) │  3.91824    │ 0.0512372   │               76.4727 │
├─────────────────────┼─────────────┼─────────────┼───────────────────────┤
│ 120x120 (seed 17)   │  7.31543    │ 0.100896    │               72.5043 │
├─────────────────────┼─────────────┼─────────────┼───────────────────────┤
│ 150x150 (seed 99)   │ 16.1747     │ 0.234127    │               69.0851 │
╘═════════════════════╧═════════════╧═════════════╧═══════════════════════╛

@gadial gadial requested a review from a team as a code owner February 18, 2025 08:12
@qiskit-bot
Copy link
Collaborator

One or more of the following people are relevant to this code:

  • @Qiskit/terra-core

cir.push(CircuitInstructions::CX(w1 as u32, w2 as u32));
}
for i in 0..n {
match phase_schedule[[n - 1 - i, n - 1 - i]] {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see now I missed the % 4 here and it wasn't caught by testing; I'll extend the tests to ensure this is caught.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems this cannot have any real effect since the values in the phase schedule are always computed modulo 4 so the code is technically correct, although I'll add the % 4 since it's better to be on the safe side.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you could also use u4 for representing phases, and use wrapping_add to perform modular arithmetic. But I don't think it's worth the hassle.

@coveralls
Copy link

coveralls commented Feb 18, 2025

Pull Request Test Coverage Report for Build 13569108809

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 180 of 183 (98.36%) changed or added relevant lines in 4 files are covered.
  • 2065 unchanged lines in 121 files lost coverage.
  • Overall coverage decreased (-0.3%) to 87.873%

Changes Missing Coverage Covered Lines Changed/Added Lines %
crates/accelerate/src/synthesis/linear/lnn.rs 4 5 80.0%
crates/accelerate/src/synthesis/linear_phase/cx_cz_depth_lnn.rs 170 172 98.84%
Files with Coverage Reduction New Missed Lines %
crates/accelerate/src/basis/basis_translator/basis_search.rs 1 99.31%
crates/qasm2/src/expr.rs 1 94.23%
qiskit/circuit/controlflow/switch_case.py 1 95.35%
qiskit/circuit/library/standard_gates/r.py 1 97.62%
qiskit/circuit/library/standard_gates/rxx.py 1 97.78%
qiskit/circuit/library/standard_gates/rzx.py 1 97.78%
qiskit/circuit/library/standard_gates/u2.py 1 96.88%
qiskit/circuit/random/utils.py 1 94.87%
qiskit/circuit/store.py 1 96.77%
qiskit/dagcircuit/dagdependency.py 1 89.66%
Totals Coverage Status
Change from base Build 13376164595: -0.3%
Covered Lines: 77618
Relevant Lines: 88330

💛 - Coveralls

@gadial gadial added this to the 2.0.0 milestone Feb 18, 2025
@gadial gadial added performance synthesis Rust This PR or issue is related to Rust code in the repository labels Feb 18, 2025
@gadial gadial self-assigned this Feb 18, 2025
@ShellyGarion ShellyGarion self-assigned this Feb 19, 2025
@ShellyGarion
Copy link
Member

Could you perhaps also compare the time improvement of the function synth_clifford_depth_lnn now that all of the internal functions (synth_cnot_depth_line_kms, synth_cz_depth_line_mr, synth_cx_cz_depth_line_my) were ported to rust? (compare qiskit 1.2 vs. qiskit 2.0).

def synth_clifford_depth_lnn(cliff):

@gadial
Copy link
Contributor Author

gadial commented Feb 20, 2025

Could you perhaps also compare the time improvement of the function synth_clifford_depth_lnn now that all of the internal functions (synth_cnot_depth_line_kms, synth_cz_depth_line_mr, synth_cx_cz_depth_line_my) were ported to rust? (compare qiskit 1.2 vs. qiskit 2.0).

def synth_clifford_depth_lnn(cliff):

When running on random cliffords, I'm getting a nice improvement but I suspect there's more we can do.

╒══════════════════════╤════════════╤════════════╤═══════════════════════╕
│ Test Case            │     python │       rust │   Ratio (python/rust) │
╞══════════════════════╪════════════╪════════════╪═══════════════════════╡
│ 2 qubits (seed 42)   │ 0.00163498 │ 0.00252638 │              0.647163 │
├──────────────────────┼────────────┼────────────┼───────────────────────┤
│ 5 qubits (seed 42)   │ 0.00278735 │ 0.00405846 │              0.686801 │
├──────────────────────┼────────────┼────────────┼───────────────────────┤
│ 10 qubits (seed 42)  │ 0.00726461 │ 0.00962257 │              0.754955 │
├──────────────────────┼────────────┼────────────┼───────────────────────┤
│ 20 qubits (seed 42)  │ 0.029776   │ 0.0239356  │              1.24401  │
├──────────────────────┼────────────┼────────────┼───────────────────────┤
│ 40 qubits (seed 42)  │ 0.14392    │ 0.0663365  │              2.16955  │
├──────────────────────┼────────────┼────────────┼───────────────────────┤
│ 80 qubits (seed 42)  │ 0.940864   │ 0.265452   │              3.54439  │
├──────────────────────┼────────────┼────────────┼───────────────────────┤
│ 100 qubits (seed 42) │ 1.98206    │ 0.446445   │              4.43966  │
├──────────────────────┼────────────┼────────────┼───────────────────────┤
│ 150 qubits (seed 42) │ 8.59436    │ 1.27764    │              6.72675  │
╘══════════════════════╧════════════╧════════════╧═══════════════════════╛

@gadial
Copy link
Contributor Author

gadial commented Feb 20, 2025

Doing quick profiling of 1.2 vs 2.0:

1.2:

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   25.077   25.077 clifford_decompose_layers.py:416(synth_clifford_depth_lnn)
        1    0.000    0.000   25.077   25.077 clifford_decompose_layers.py:67(synth_clifford_layers)
        1    0.007    0.007   12.975   12.975 clifford_decompose_layers.py:248(_decompose_graph_state)
        1    3.998    3.998   12.813   12.813 cz_depth_lnn.py:121(synth_cz_depth_line_mr)
        1    0.002    0.002    8.967    8.967 clifford_decompose_layers.py:311(_decompose_hadamard_free)
        1    0.009    0.009    8.917    8.917 cx_cz_depth_lnn.py:218(synth_cx_cz_depth_line_my)
    18772    8.034    0.000    8.034    0.000 {method 'count' of 'list' objects}
        1    2.971    2.971    3.983    3.983 cx_cz_depth_lnn.py:119(_update_phase_schedule)
        1    0.000    0.000    3.421    3.421 linear_depth_lnn.py:210(_optimize_cx_circ_depth_5n_line)
        1    0.053    0.053    3.354    3.354 linear_depth_lnn.py:113(_matrix_to_north_west)
    49238    3.225    0.000    3.242    0.000 linear_depth_lnn.py:89(_in_linear_combination)
     13/6    0.000    0.000    2.801    0.467 clifford.py:147(__init__)
        1    0.000    0.000    2.801    2.801 clifford.py:670(from_circuit)
      7/1    0.232    0.033    2.800    2.800 clifford_circuits.py:25(_append_circuit)
 152282/6    0.226    0.000    2.800    0.467 clifford_circuits.py:53(_append_operation)
   152424    0.474    0.000    1.888    0.000 quantumcircuit.py:2405(_append_standard_gate)
   119340    0.062    0.000    1.595    0.000 quantumcircuit.py:5303(cx)
   119340    0.909    0.000    1.548    0.000 clifford_circuits.py:401(_append_cx)
        1    0.041    0.041    0.864    0.864 cx_cz_depth_lnn.py:163(_apply_phase_to_nw_circuit)
  5933932    0.610    0.000    0.610    0.000 {built-in method builtins.min}

2.0:

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    4.967    4.967 clifford_decompose_layers.py:417(synth_clifford_depth_lnn)
        1    0.000    0.000    4.967    4.967 clifford_decompose_layers.py:67(synth_clifford_layers)
     13/6    0.000    0.000    3.085    0.514 clifford.py:149(__init__)
        1    0.000    0.000    3.084    3.084 clifford.py:672(from_circuit)
      7/1    0.243    0.035    3.084    3.084 clifford_circuits.py:25(_append_circuit)
 152254/6    0.244    0.000    3.083    0.514 clifford_circuits.py:53(_append_operation)
   119172    1.010    0.000    1.735    0.000 clifford_circuits.py:401(_append_cx)
        1    0.007    0.007    0.797    0.797 clifford_decompose_layers.py:249(_decompose_graph_state)
        1    0.000    0.000    0.792    0.792 clifford_decompose_layers.py:312(_decompose_hadamard_free)
        1    0.000    0.000    0.765    0.765 cx_cz_depth_lnn.py:36(synth_cx_cz_depth_line_my)
        1    0.764    0.764    0.764    0.764 {py_synth_cx_cz_depth_line_my}
        1    0.000    0.000    0.649    0.649 cz_depth_lnn.py:33(synth_cz_depth_line_mr)
        1    0.648    0.648    0.648    0.648 {synth_cz_depth_line_mr}
   279257    0.200    0.000    0.462    0.000 clifford.py:285(z)
   850215    0.209    0.000    0.408    0.000 base_operator.py:91(num_qubits)
   291626    0.153    0.000    0.299    0.000 clifford.py:276(x)
   152254    0.096    0.000    0.285    0.000 clifford_circuits.py:48(<listcomp>)
586413/586379    0.113    0.000    0.227    0.000 {built-in method builtins.isinstance}
   850215    0.199    0.000    0.199    0.000 op_shape.py:105(num_qubits)
   272820    0.132    0.000    0.189    0.000 quantumcircuit.py:3127(find_bit)

It seems the bottleneck now is handling the large number of append_operation applications.

Copy link
Contributor

@alexanderivrii alexanderivrii left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very-very cool PR, that ports the existing Python code to Rust using highly advanced (or at least what I consider highly advanced) Rust features such as flat_map and scan methods for iterators. The code is quite similar to the original Python code, modular rewriting it in the Rustacean way, making it almost possible and yet a bit difficult (at least for me) to guarantee the formal equivalence between the two versions just by visual inspection. Fortunately, we have quite extensive tests, and based on that, the code is correct.

My main comments is to include a short summary of the performance improvement in the release notes and double-check that the function docstrings render well. Thanks for the great work!

/// (c.f. Fig.2, [2])
fn _make_seq(n: usize) -> Vec<(usize, usize)> {
(0..n)
.scan((0..n).rev().collect::<Vec<usize>>(), |wire_labels, i| {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is so cool!

cir.push(CircuitInstructions::CX(w1 as u32, w2 as u32));
}
for i in 0..n {
match phase_schedule[[n - 1 - i, n - 1 - i]] {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you could also use u4 for representing phases, and use wrapping_add to perform modular arithmetic. But I don't think it's worth the hassle.

---
features_transpiler:
- |
The :meth:`.synth_cx_cz_depth_line_my` pass was ported into rust.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are missing a line at the end of the file.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, could you please add a short statement on the runtime improvement?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in dc9bbd1, I hope that's enough

Comment on lines 168 to 179
/// Given
/// Width of the circuit (int n)
/// A CZ circuit, represented by the n*n phase schedule phase_schedule
/// A CX circuit, represented by box-labels (seq) and whether the box is SWAP+ (swap_plus)
/// * This circuit corresponds to the CX tranformation that tranforms a matrix to
/// a NW matrix (c.f. Prop.7.4, [1])
/// * SWAP+ is defined in section 3.A. of [2].
/// * As previously noted, the northwest diagonalization procedure of [1] consists
/// of exactly n layers of boxes, each being either a SWAP or a SWAP+. That is,
/// each northwest diagonalization circuit can be uniquely represented by which
/// of its n(n-1)/2 boxes are SWAP+ and which are SWAP.
/// Return a QuantumCircuit that computes the phase schedule S inside CX
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and in other places: I am not completely sure whether this style of iterative indentation renders well, and in addition I think I remember from somewhere that using - is preferred over using * for lists. Bottom line: could you please double-check how these Rust doc comments look like, and if they look well, nothing needs to be done.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And you can also use backticks (``n``) to highlight n and other variables.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in dc9bbd1. I used cargo doc --document-private-items to verify the new version looks ok (you were correct and the original version looked bad).

@ShellyGarion
Copy link
Member

Doing quick profiling of 1.2 vs 2.0:
...
It seems the bottleneck now is handling the large number of append_operation applications.

Indeed, it seems that except of the 3 synthesis functions (synth_cnot_depth_line_kms, synth_cz_depth_line_mr, synth_cx_cz_depth_line_my) most of the code of synth_clifford_depth_lnn is performing Clifford tableaux operations.
Perhaps it will be worth sometime to port it to rust, see: #12367
@alexanderivrii has already implemented some useful Clifford utils here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Rust This PR or issue is related to Rust code in the repository synthesis
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Port synth_cx_cz_depth_line_my to Rust
5 participants