Quantum reinforcement learning (QRL) is a promising approach proposed for near-term quantum devices. Early QRL proposals are effective at solving problems with discrete action spaces, but in the continuous domain they often suffer from the curse of dimensionality because the action space must be discretized. To address this problem, we propose a quantum Deep Deterministic Policy Gradient (DDPG) algorithm that efficiently solves both classical and quantum sequential decision problems in the continuous domain. As an application, our method can solve the quantum state-generation problem in a single shot: it requires only one optimization to produce a model that outputs the desired control sequence for an arbitrary target state. In comparison, the standard quantum control method requires a separate optimization for each target state. Moreover, our method can also be used to physically reconstruct an unknown quantum state.
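
The cost of discretization mentioned above can be made concrete with a minimal sketch. It assumes a uniform grid with the same number of levels along every continuous control dimension; the function name and the grid sizes are illustrative and do not come from the paper.

```python
def discrete_action_count(levels_per_dim: int, num_dims: int) -> int:
    """Count the joint actions produced when each of `num_dims` continuous
    controls is binned into `levels_per_dim` values (assumed uniform grid)."""
    if levels_per_dim < 1 or num_dims < 0:
        raise ValueError("levels_per_dim must be >= 1 and num_dims >= 0")
    return levels_per_dim ** num_dims


if __name__ == "__main__":
    # With 10 levels per dimension, the action table grows exponentially
    # with the number of control dimensions.
    for num_dims in (1, 2, 4, 8):
        print(num_dims, discrete_action_count(10, num_dims))
```

Under this assumption the number of discrete actions grows exponentially with the number of control dimensions, which is the scaling that a policy producing continuous actions directly is meant to avoid.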