Deep Reinforcement Learning is emerging as a promising approach for the continuous control task of robotic arm movement. However, the challenges of learning robust and versatile control capabilities are still far from being resolved for real-world applications, mainly because of two common issues of this learning paradigm: the exploration strategy and the slow learning speed, sometimes known as "the curse of dimensionality". This work aims at exploring and assessing the advantages of applying Quantum Computing to one of the state-of-the-art Reinforcement Learning techniques for continuous control, namely Soft Actor-Critic. Specifically, the performance of a Variational Quantum Soft Actor-Critic on the movement of a virtual robotic arm has been investigated by means of digital simulations of quantum circuits. A quantum advantage over the classical algorithm has been found in terms of a significant decrease in the number of parameters required for satisfactory model training, paving the way for further promising developments.