Understanding the power and limitations of quantum access to data in machine learning tasks is essential for assessing the potential of quantum computing in artificial intelligence. Previous works have shown that speed-ups in learning are possible when a learner is given quantum access to reinforcement learning environments. Yet the applicability of quantum algorithms in this setting remains very limited, notably in environments with large state and action spaces. In this work, we design quantum algorithms to train state-of-the-art reinforcement learning policies by exploiting quantum interactions with an environment. However, these algorithms offer full quadratic speed-ups in sample complexity over their classical analogs only when the trained policies satisfy certain regularity conditions. Interestingly, we find that reinforcement learning policies derived from parametrized quantum circuits are well-behaved with respect to these conditions, which showcases the benefit of a fully quantum reinforcement learning framework.