We develop a high-performance tensor-based simulator for random quantum circuits (RQCs) on the new Sunway supercomputer. Our major innovations include: (1) a near-optimal slicing scheme and a path-optimization strategy that considers both complexity and compute density; (2) a three-level parallelization scheme that scales to about 42 million cores; (3) a fused permutation and multiplication design that improves the compute efficiency for a wide range of tensor contraction scenarios; and (4) a mixed-precision scheme to further improve the performance. Our simulator expands the scope of simulatable RQCs to include the 10×10 (qubits) × (1+40+1) (depth) circuit, with a sustained performance of 1.2 Eflops (single-precision) or 4.4 Eflops (mixed-precision), a new milestone for classical simulation of quantum circuits. It also reduces the simulation sampling time of Google Sycamore to 304 seconds, from the previously claimed 10,000 years.
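To make the idea of slicing concrete, the following toy sketch contracts a closed ring of three small tensors in two ways: all at once, and as a sum of independent sub-contractions obtained by fixing one shared index. It is only an illustration of the general technique and is not the slicing scheme used in the simulator; the function names, tensor shapes, and random data are all assumptions chosen for the example.

```python
import random

def contract_full(a, b, c):
    """Sum a[i][j] * b[j][k] * c[k][i] over all three indices in one pass."""
    n_i, n_j, n_k = len(a), len(b), len(c)
    return sum(a[i][j] * b[j][k] * c[k][i]
               for i in range(n_i) for j in range(n_j) for k in range(n_k))

def contract_slice(a, b, c, j):
    """Contract the network with the shared index j fixed to a single value."""
    n_i, n_k = len(a), len(c)
    return sum(a[i][j] * b[j][k] * c[k][i]
               for i in range(n_i) for k in range(n_k))

def contract_sliced(a, b, c):
    """Add up the per-slice results; the slices do not depend on each other."""
    return sum(contract_slice(a, b, c, j) for j in range(len(b)))

def random_tensor(rows, cols, rng):
    """Build a rows x cols matrix of complex entries (hypothetical test data)."""
    return [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(cols)]
            for _ in range(rows)]

rng = random.Random(0)
a = random_tensor(4, 3, rng)  # indices (i, j)
b = random_tensor(3, 5, rng)  # indices (j, k)
c = random_tensor(5, 4, rng)  # indices (k, i)

full = contract_full(a, b, c)
sliced = contract_sliced(a, b, c)
print(abs(full - sliced))  # difference is at floating-point rounding level
```

Because each call to `contract_slice` touches only one value of `j`, the slices need less memory than the full contraction and can be distributed across processes, which is the property that slicing exploits at scale.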