Reinforcement Learning for Many-Body Ground-State Preparation Inspired by Counterdiabatic Driving

The quantum alternating operator ansatz (QAOA) is a prominent example of variational quantum algorithms. We propose a generalized QAOA called CD-QAOA, which is inspired by the counterdiabatic driving procedure, designed for quantum many-body systems and optimized using a reinforcement learning (RL) approach. The resulting hybrid control algorithm proves versatile in preparing the ground state of quantum-chaotic many-body spin chains by minimizing the energy. We show that using terms occurring in the adiabatic gauge potential as generators of additional control unitaries, it is possible to achieve fast high-fidelity many-body control away from the adiabatic regime. While each unitary retains the conventional QAOA-intrinsic continuous control degree of freedom such as the time duration, we consider the order of the multiple available unitaries appearing in the control sequence as an additional discrete optimization problem. Endowing the policy gradient algorithm with an autoregressive deep learning architecture to capture causality, we train the RL agent to construct optimal sequences of unitaries. The algorithm has no access to the quantum state, and we find that the protocol learned on small systems may generalize to larger systems. By scanning a range of protocol durations, we present numerical evidence for a finite quantum speed limit in the nonintegrable mixed-field spin-1/2 Ising and Lipkin-Meshkov-Glick models, and for the suitability to prepare ground states of the spin-1 Heisenberg chain in the long-range and topologically ordered parameter regimes. This work paves the way to incorporate recent success from deep learning for the purpose of quantum many-body control.
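The two-level optimization described in the abstract (a discrete choice of which unitaries appear in which order, plus a continuous choice of each unitary's duration) can be illustrated with a small sketch. It is not the authors' implementation: plain random search stands in for the policy-gradient RL agent and its autoregressive network, and the chain length, the couplings `J`, `hz`, `hx`, the initial product state, and the use of a uniform Y field as the gauge-potential-inspired generator are all assumptions made for illustration. As in the abstract, the search uses only the measured energy, never the quantum state itself, as its figure of merit.

```python
import itertools
import numpy as np

# Pauli matrices and the single-site identity
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)


def site_op(op, i, n):
    """Embed a single-site operator `op` at site i of an n-spin chain."""
    out = np.eye(1, dtype=complex)
    for k in range(n):
        out = np.kron(out, op if k == i else I2)
    return out


def build_generators(n, J=1.0, hz=0.8, hx=0.9):
    """Candidate generators; J, hz, hx are illustrative values (assumption)."""
    zz = sum(site_op(Z, i, n) @ site_op(Z, i + 1, n) for i in range(n - 1))
    z = sum(site_op(Z, i, n) for i in range(n))
    x = sum(site_op(X, i, n) for i in range(n))
    y = sum(site_op(Y, i, n) for i in range(n))
    # "ZZ+Z" and "X" are the two conventional QAOA generators; "Y" is the
    # extra generator assumed here to mimic a gauge-potential term.
    return {"ZZ+Z": J * zz + hz * z, "X": hx * x, "Y": y}


def evolve(psi, eig, t):
    """Apply exp(-i t G) to psi using the eigendecomposition of Hermitian G."""
    w, v = eig
    return v @ (np.exp(-1j * t * w) * (v.conj().T @ psi))


def energy(seq, durations, eigs, H, psi0):
    """Energy <psi|H|psi> after applying the unitaries in `seq` in order."""
    psi = psi0
    for name, t in zip(seq, durations):
        psi = evolve(psi, eigs[name], t)
    return float(np.real(np.vdot(psi, H @ psi)))


def search(n=3, q=3, total_time=2.0, samples=200, seed=0):
    """Random search over sequences and durations (stand-in for the RL agent)."""
    rng = np.random.default_rng(seed)
    gens = build_generators(n)
    H = gens["ZZ+Z"] + gens["X"]               # target mixed-field Ising model
    eigs = {name: np.linalg.eigh(g) for name, g in gens.items()}
    e0 = float(np.linalg.eigvalsh(H)[0])       # exact ground-state energy
    psi0 = np.zeros(2 ** n, dtype=complex)
    psi0[0] = 1.0                              # all spins up (assumption)
    best = (np.inf, None, None)
    # Discrete part: every length-q ordering without consecutive repeats.
    for seq in itertools.product(gens, repeat=q):
        if any(a == b for a, b in zip(seq, seq[1:])):
            continue
        # Continuous part: durations that sum to the fixed protocol time.
        for _ in range(samples):
            durations = total_time * rng.dirichlet(np.ones(q))
            e = energy(seq, durations, eigs, H, psi0)
            if e < best[0]:
                best = (e, seq, durations)
    return best[0], best[1], best[2], e0


if __name__ == "__main__":
    e, seq, durations, e0 = search()
    print("best sequence:", seq)
    print("durations:", np.round(durations, 3))
    print("energy:", round(e, 4), "exact ground state:", round(e0, 4))
```

Repeating `search` for a range of `total_time` values and recording the gap between the best energy and `e0` gives a toy version of the protocol-duration scan mentioned in the abstract.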
