yardstiq / quantum-benchmarks
benchmarking quantum circuit emulators for your daily research usage
License: Other
@atilag the single-gate benchmark is almost flat for Qiskit, but for the other frameworks it seems normal (at least it scales with the system size). Could you help me review whether there is anything wrong in the benchmark, or whether this is correct? I'm not sure the script is right, since the scaling is rather strange. This benchmark was run against the master branch at the commit SHA given in the README.
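For reference, here is what "normal" scaling looks like for a dense statevector simulator: a single gate must touch all 2^n amplitudes, so the time per gate should grow exponentially with qubit count. A minimal NumPy sketch of this baseline (my own illustration, not any framework's implementation):

```python
import time
import numpy as np

def apply_x(state, target, n):
    """Apply an X gate on `target` to a dense 2^n statevector by
    flipping the target axis of the reshaped amplitude tensor."""
    psi = state.reshape([2] * n)
    # np.flip returns a view; reshape(-1) materializes the result,
    # touching all 2^n amplitudes.
    return np.flip(psi, axis=target).reshape(-1)

for n in (10, 16, 22):
    state = np.zeros(2**n, dtype=np.complex128)
    state[0] = 1.0
    t0 = time.perf_counter()
    apply_x(state, 0, n)
    dt = time.perf_counter() - t0
    print(n, dt)  # time should grow roughly with 2^n
```

A flat curve on such a task usually means the simulator is skipping the full-statevector update, not that the hardware is magically fast.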
We need to add a full implementation of the quantum circuit Born machine that includes the AD time for PennyLane, Yao, and TensorFlow Quantum.
Just double-checking: you didn't benchmark the Forest simulator, right?
Hi. I've run the benchmark (forked from commit 2492b3a) with a recent Qiskit and Cirq 0.13.1. The multi-gate tests look valid, but on the single-gate tasks I see the attached picture. AFAIK, the non-exponential performance in Cirq's case means that it uses some "forbidden" optimisation, such as skipping unchanged qubits. Does this explanation sound right, and how can I find and disable this optimization?
Hi there,
It's excellent to see an initiative benchmarking the wide suite of available QC emulators!
However, it appears PyQuEST-cffi is mislabeled as "QuEST" in the plot legends.
PyQuEST-cffi is an independent project by HQS providing Python bindings for the C project QuEST, on which I myself work. These bindings add overhead on top of the underlying QuEST C functions, so their performance (especially with heavy iteration in Python) can be significantly worse than that of QuEST itself, which is not benchmarked here. Is it possible to correct these legends?
Note that I believe PyQuEST-cffi, like Yao, supports GPU in addition to CPU (since QuEST supports multithreading, GPU, and distribution).
Thanks very much,
Tyson
see #30
Hi,
I'm a contributor to qulacs.
First of all, thanks for adding our library qulacs to this nice benchmark project.
I've checked the qulacs benchmark script and confirmed that it is implemented in an efficient way.
On the other hand, I have the following two requests/questions about the benchmarks.
Though a previous version of our library was incompatible with the latest gcc, we now believe "pip install qulacs" works with all recent gcc versions (the fix has been merged into our SIMD code). Can I ask you to try it, and replace the build script for the forked repository with a PyPI package install ("qulacs==0.1.8" in requirements.txt)?
If you plan to do GPU benchmarking in the same project, qulacs-gpu==0.1.8 might be better; it enables both CPU and GPU simulation, but fails to build without CUDA.
As far as I know, Cirq, for example, performs simulation with complex64 by default (https://cirq.readthedocs.io/en/stable/generated/cirq.Simulator.html), whereas qulacs computes with complex128. Is there any policy about precision? I think the benchmarks should be run at the same precision if possible.
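On the precision point, the gap between the two defaults is easy to see with plain NumPy; the helper below is my own sketch, independent of either simulator:

```python
import numpy as np

def apply_rx(state, theta, dtype):
    """Apply an RX(theta) gate to a 1-qubit state in the given dtype."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    gate = np.array([[c, -1j * s], [-1j * s, c]], dtype=dtype)
    return gate @ state.astype(dtype)

state = np.array([1.0, 0.0])
theta = 0.1234567
lo = apply_rx(state, theta, np.complex64)   # single precision (Cirq's default)
hi = apply_rx(state, theta, np.complex128)  # double precision (qulacs' default)
# The single-precision result deviates from double at roughly float32
# precision per gate, and the error accumulates over deep circuits.
print(np.abs(lo - hi).max())
```

Besides accuracy, complex64 halves the memory traffic per gate, which can make cross-precision timing comparisons misleading.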
Thanks,
It seems that the link embedding the quantum circuit Born machine picture in CONTRIBUTING.md is unfortunately dead. It might be a good idea to host the file in this repo, the way the benchmark plots are saved in /images.
Hi @hillmich
currently our benchmark machine fails to build jkq-ddsim due to the following error:
CMake Warning at /usr/share/cmake-3.10/Modules/FindBoost.cmake:567 (message):
Imported targets and dependency information not available for Boost version
(all versions older than 1.33)
Call Stack (most recent call first):
/usr/share/cmake-3.10/Modules/FindBoost.cmake:907 (_Boost_COMPONENT_DEPENDENCIES)
/usr/share/cmake-3.10/Modules/FindBoost.cmake:1558 (_Boost_MISSING_DEPENDENCIES)
apps/CMakeLists.txt:3 (find_package)
CMake Error at /usr/share/cmake-3.10/Modules/FindBoost.cmake:1947 (message):
Unable to find the requested Boost libraries.
Unable to find the Boost header files. Please set BOOST_ROOT to the root
directory containing Boost or BOOST_INCLUDEDIR to the directory containing
Boost's headers.
Call Stack (most recent call first):
apps/CMakeLists.txt:3 (find_package)
I'm wondering whether you have a Boost installation configured in your CMake setup. I don't think we want Boost in the default global benchmark environment, since it is quite a large dependency. If not, would you mind updating setup.sh to install Boost for jkq-ddsim?
NOTE: as you may already have noticed, we recently refactored the benchmark to make it more modular.
In CONTRIBUTING.md, the link https://quantumbfs.github.io/Yao.jl/latest/assets/figures/differentiable.png seems to be broken.
The benchmark data path is not handled correctly.
Hi,
I've noticed you pushed the benchmark results to the data/ directory. However, the bin/plot script does not generate the plots. Did I miss some preprocessing steps?
Best regards and stay safe,
Stefan
$ bin/plot
Traceback (most recent call last):
File "bin/plot", line 9, in <module>
labels=['X', 'H', 'T', 'CNOT', 'Toffoli']
File "/home/stefan/repos/quantum-benchmarks/bin/utils/plot_utils.py", line 89, in parse_data
gate_data[each_package] = wash_benchmark_data(each_package, labels)
File "/home/stefan/repos/quantum-benchmarks/bin/utils/plot_utils.py", line 44, in wash_benchmark_data
with open(find_json(name)) as f:
File "/home/stefan/repos/quantum-benchmarks/bin/utils/plot_utils.py", line 34, in find_json
for each in os.listdir(benchmark_path):
NotADirectoryError: [Errno 20] Not a directory: '/home/stefan/repos/quantum-benchmarks/data/yao.csv'
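A likely fix, assuming find_json scans the data/ path with os.listdir as the traceback shows: filter out plain files such as data/yao.csv before descending. A sketch (find_json_dirs is a hypothetical helper name, not the repo's actual function):

```python
import os
import tempfile

def find_json_dirs(benchmark_path):
    """Yield only subdirectories of the data path, skipping stray files
    like data/yao.csv that trip an os.listdir-based scan."""
    for each in os.listdir(benchmark_path):
        full = os.path.join(benchmark_path, each)
        if os.path.isdir(full):
            yield full

# Demo with a throwaway layout: data/yao.csv (file) + data/qiskit/ (dir)
with tempfile.TemporaryDirectory() as data:
    open(os.path.join(data, 'yao.csv'), 'w').close()
    os.mkdir(os.path.join(data, 'qiskit'))
    dirs = [os.path.basename(d) for d in find_json_dirs(data)]

print(dirs)  # only the directory survives the filter
```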
Let's add benchmarks for TensorFlow Quantum.
Hello @Roger-luo, I am a developer of Qiskit Aer and was recently shown your rather nice benchmark repo. I have some suggestions for how the Qiskit benchmarks could be improved, since I feel they are under-representing the simulator.
Suggestions:
When you transpile the circuit in qiskit you need to include the backend so that it compiles to the native basis gates of the simulator, otherwise it will unroll all single-qubit gates to u3 gates.
You shouldn't be using the statevector simulator for benchmarks; rather, you should use the qasm_simulator. The statevector simulator has a lot of overhead in serializing the statevector via JSON, whereas the qasm simulator does not (you can still ask for a snapshot of the statevector in the qasm simulator). This overhead has been improved somewhat in our next release by replacing JSON with Pybind11, but it still under-represents the simulator if you are interested in timing how fast it applies gates.
The qasm simulator has numerous options for method and parallelization that you may want to configure explicitly, e.g.:
How you report the time taken depends on what you are trying to benchmark. Aer includes a lot of overhead in its result data output, so if you are trying to profile the time of a single gate, you can get a more accurate measure by excluding the result serialization if desired. The different ways of timing include:

- the Python wall-clock time around backend.run;
- the total time reported on the Python result object (Result.time_taken), which excludes the time spent initializing and validating the Python result object from the output Python dict of the simulator;
- the C++-measured total time (Result.metadata['time_taken']), which excludes the C++ -> Py result conversion overhead;
- the C++-measured experiment time (Result.results[0].time_taken), which excludes any overhead for validation and configuration settings in the C++ simulator, and any Py -> C++ conversion.

Depending on what you are trying to show in the benchmarks, different timings matter more. I would argue that for the gate-level benchmarks you should show the C++ times, but for circuit-level benchmarks that include results you would actually use, I would show the Python time.
If you like, I could put in a PR to this repo to make some of the suggested changes, but below I've included a code snippet applying these suggestions to a manual implementation of your X-gate benchmark:
import time

import matplotlib.pyplot as plt
import numpy as np
from qiskit import *


def native_execute(circuit, backend, backend_options):
    experiment = transpile(circuit, backend)  # Transpile to the simulator's basis gates
    qobj = assemble(experiment, shots=1)      # Set execution shots to 1
    start = time.time()
    result = backend.run(qobj, backend_options=backend_options).result()
    stop = time.time()
    time_py_full = stop - start                    # Total execution time measured in Python
    time_py_run = result.time_taken                # Excludes init/validation of the Python result object
    time_cpp_full = result.metadata['time_taken']  # C++-measured total time, excludes C++ -> Py conversion
    time_cpp_expr = result.results[0].time_taken   # C++-measured time of the single circuit (state init, gates)
    return time_py_full, time_py_run, time_cpp_full, time_cpp_expr


def benchmark_x(qubit_range, samples, backend_options=None):
    backend = Aer.get_backend('qasm_simulator')
    ts_py_full = np.zeros(len(qubit_range))
    ts_py_run = np.zeros(len(qubit_range))
    ts_cpp_full = np.zeros(len(qubit_range))
    ts_cpp_exp = np.zeros(len(qubit_range))
    for i, nq in enumerate(qubit_range):
        qc = QuantumCircuit(nq)
        qc.x(0)
        t_py_full = 0
        t_py_run = 0
        t_cpp_full = 0
        t_cpp_exp = 0
        for _ in range(samples):
            t0, t1, t2, t3 = native_execute(qc, backend, backend_options)
            t_py_full += t0
            t_py_run += t1
            t_cpp_full += t2
            t_cpp_exp += t3
        # Average time in ns
        ts_py_full[i] = 1e9 * t_py_full / samples
        ts_py_run[i] = 1e9 * t_py_run / samples
        ts_cpp_full[i] = 1e9 * t_cpp_full / samples
        ts_cpp_exp[i] = 1e9 * t_cpp_exp / samples
    return ts_py_full, ts_py_run, ts_cpp_full, ts_cpp_exp


# Benchmark: X gate on qubit-0
backend_options = {
    # Force the statevector method so the stabilizer (Clifford) simulator isn't used
    "method": "statevector",
    # Disable parallelization
    "max_parallel_threads": 1,
    # Stop the simulator truncating to 1-qubit circuit simulations
    "truncate_enable": False,
}

nqs = list(range(5, 26))
ts_py_full1, ts_py_run1, ts_cpp_full1, ts_cpp_expr1 = benchmark_x(nqs, 1000, backend_options)

plt.semilogy(nqs, ts_py_full1, 'o-', label='Python (full)')
plt.semilogy(nqs, ts_py_run1, 's-', label='Python (run-only)')
plt.semilogy(nqs, ts_cpp_full1, '^-', label='C++ (full)')
plt.semilogy(nqs, ts_cpp_expr1, 'd-', label='C++ (experiment-only)')
plt.legend()
plt.grid()
plt.savefig('aer_x_qasm_sv.pdf')
Here is an example of running the above on my laptop:
Currently we plot everything into a single picture, since this project was mainly used for our Yao.jl paper. But given that the benchmarks are growing dense, a clearer visualization will be necessary. Ideally, an interactive web page would be great.
I already have a fix, but the BM_sim_QCBM test currently uses Qrack's simulator random number generator for the random rotation angles. This happens to be enough RNG demand to empty the on-chip entropy pool. I'm converting this to use a pseudo-RNG, like the QuEST suite, and I'll add it in #53.
Thanks for your previous contribution getting the CPU benchmark correct. However, I'd like to check the following result. We recently went through the entire benchmark again, since the result is very strange: the timing barely scales with the number of qubits at all. I suspect that when running the qiskit-gpu benchmark, cudaDeviceSynchronize is not called; even for 30 qubits, the timing is only 5 ms.
On the other side, I didn't find any call to cudaDeviceSynchronize in the source code either: https://github.com/Qiskit/qiskit-aer/blob/master/src/simulators/statevector/qubitvector_thrust.hpp
I feel it is unlikely that Thrust does the sync implicitly, but I could be wrong.
Moreover, the timing differs by about 100x from what qulacs and Yao report. Since qulacs and Yao are implemented independently and their benchmark results match each other, I believe this problem could exist, but I'd like your help to confirm it.
FYI: even summing over a vector of size 2^30 in complex<float64> requires 24 ms.
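The pitfall described here is generic to asynchronous execution: if you stop the clock after launching work but before synchronizing, you measure launch latency, not the work itself. A stdlib-only sketch of the effect (no CUDA involved, purely illustrative):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def heavy_work():
    # Stand-in for a GPU kernel: a noticeable chunk of CPU work.
    total = 0
    for i in range(2_000_000):
        total += i * i
    return total

pool = ThreadPoolExecutor(max_workers=1)

# Wrong: stop the clock right after the asynchronous launch.
start = time.perf_counter()
future = pool.submit(heavy_work)  # returns immediately, like a kernel launch
t_launch = time.perf_counter() - start
_ = future.result()               # drain the first task before the next measurement

# Right: "synchronize" (wait for completion) before stopping the clock.
start = time.perf_counter()
future2 = pool.submit(heavy_work)
future2.result()                  # analogous to cudaDeviceSynchronize
t_synced = time.perf_counter() - start

pool.shutdown(wait=True)
print(f"launch-only: {t_launch:.6f}s, synchronized: {t_synced:.6f}s")
```

The launch-only number is a tiny fraction of the synchronized one, which is how an asynchronous 30-qubit GPU benchmark can report an implausible 5 ms.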
Thanks a lot.
Roger
Hi @Codewithsk, I'm not sure whether you've been busy recently, but do you know if we can trigger the runners from GitHub now (like what you demonstrated before)? It would be nice to run that script instead of the old ones, since the old ones don't seem to work anymore.
cc: @antalszava to keep you updated.
Sorry, I can't find the original issue that I thought raised the question, but are Qrack or PyQrack ever going to be added to these benchmarks?
Speaking as one of the authors, I'm sorry to press the issue, but it's unequivocal that Qrack performance is constant on single X, H, T, and CNOT gates at any arbitrary qubit width, due primarily to optimization via proactive and reactive Schmidt decomposition, with very loose inspiration taken from "Pareto-Efficient Quantum Circuit Simulation Using Tensor Contraction Deferral". Redundantly, Qrack's extended stabilizer simulation capabilities cover these cases as well, but I think the comparison would be fair, because these and other optimizations in the default optimal Qrack "layer stack" are exact and generally universal, without limitation to the Clifford group, for example. (Also, this is a case of performance on these benchmarks specifically, not something as unlikely as general constant or linear performance, which is why Qrack has historically presented representatively harder general cases in our own benchmarks.)
I understand that a stabilizer simulator would break these trends without a universal gate set, but that is not the case for these benchmarks run in Qrack, which can perform this way while admitting an exactly simulated universal gate set in the same case. Is the motivation for leaving Qrack benchmarks out the assumption that its default optimal settings wouldn't be universal, as with Clifford? (To be clear, they are universal!)
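To illustrate why constant-time single-qubit gates are plausible rather than "forbidden": if a simulator keeps an unentangled register factored into per-qubit states (the degenerate case of Schmidt decomposition), a single-qubit gate is one 2x2 matrix-vector product regardless of register width. A toy sketch of the idea (my own illustration, not Qrack's actual data structure):

```python
import numpy as np

class ProductState:
    """Toy register stored as a product of 1-qubit states.
    Valid only while no entangling gate has been applied."""
    def __init__(self, n):
        self.factors = [np.array([1.0, 0.0], dtype=complex) for _ in range(n)]

    def apply_1q(self, gate, q):
        # O(1) in the number of qubits: touches a single 2-vector.
        self.factors[q] = gate @ self.factors[q]

    def amplitude(self, bits):
        # Amplitude of a basis state = product of per-qubit amplitudes.
        a = 1.0 + 0j
        for f, b in zip(self.factors, bits):
            a *= f[b]
        return a

X = np.array([[0, 1], [1, 0]], dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)

reg = ProductState(30)  # 30 "qubits" with no 2^30 vector anywhere
reg.apply_1q(X, 0)
reg.apply_1q(H, 1)
print(abs(reg.amplitude([1, 0] + [0] * 28)) ** 2)  # ~0.5
```

A real simulator additionally has to detect when entangling gates force factors to merge, and when measurement or structure lets them split again; the toy above skips all of that.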
I tried to build the Qrack experiment with sudo ./bin/benchmark setup qrack and it failed.
I opened this issue because I couldn't find any export.h in this project or in the Qrack project https://github.com/vm6502q/qrack.
[ 98%] Built target benchmarks
[100%] Linking CXX executable unittest
[100%] Built target unittest
Install the project...
-- Install configuration: ""
-- Installing: /usr/local/include/qrack/common/config.h
...
-- Installing: /usr/local/share/pkgconfig/qrack.pc
-- Installing: /usr/local/bin/qrack_cl_precompile
In file included from benchmarks.cc:1:
benchmark/include/benchmark/benchmark.h:190:10: fatal error: benchmark/export.h: No such file or directory
#include "benchmark/export.h"
^~~~~~~~~~~~~~~~~~~~
compilation terminated.
I guess this issue should be easy to fix. I'll tag @WrathfulSpatula since this issue is related to Qrack.
Is that because of the data structure of DDsim?