Recent advances in quantum computing (QC) and machine learning (ML) have drawn significant attention to the development of quantum machine learning (QML). Reinforcement learning (RL) is an ML paradigm that can be used to solve complex sequential decision-making problems. Classical RL has been shown to solve a variety of challenging tasks. However, RL algorithms in the quantum world are still in their infancy. One open challenge is how to train quantum RL (QRL) agents in partially observable environments. In this paper, we approach this challenge by building QRL agents with quantum recurrent neural networks (QRNN). Specifically, we choose the quantum long short-term memory (QLSTM) as the core of the QRL agent and train the whole model with deep $Q$-learning. We demonstrate via numerical simulations that the resulting QLSTM-based deep recurrent $Q$-network (QLSTM-DRQN) can solve standard benchmarks such as Cart-Pole with more stable and higher average scores than a classical DRQN with a similar architecture and number of model parameters.
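
For reference, deep $Q$-learning with a recurrent agent is commonly written as minimizing a temporal-difference loss of the following form. This is a sketch of the standard formulation, not necessarily the exact objective used in this work. The symbols are notational assumptions introduced here: $o_t$ is the observation, $a_t$ the action, $r_t$ the reward, $h_{t-1}$ the recurrent hidden state before $o_t$ is processed (carried by the QLSTM in the quantum agent), $\theta^-$ the parameters of a target network, and $\gamma$ the discount factor.

$$
L(\theta) = \mathbb{E}\left[\left(r_t + \gamma \max_{a'} Q(o_{t+1}, h_t, a'; \theta^-) - Q(o_t, h_{t-1}, a_t; \theta)\right)^2\right]
$$

The hidden state lets the agent summarize past observations, which is why a recurrent $Q$-function suits partially observable environments.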