Recent advances in classical reinforcement learning (RL) and quantum computation (QC) point to a promising direction: performing RL on a quantum computer. However, potential applications of quantum RL are limited by the number of qubits available on modern quantum devices. Here we present two frameworks for deep quantum RL tasks, both trained with gradient-free evolutionary optimization. First, we apply the amplitude encoding scheme to the Cart-Pole problem. Second, we propose a hybrid framework in which the quantum RL agents are equipped with a hybrid tensor network-variational quantum circuit (TN-VQC) architecture to handle inputs whose dimension exceeds the number of qubits. This allows us to perform quantum RL on the MiniGrid environment with 147-dimensional inputs. Using amplitude encoding, we demonstrate a quantum advantage in the form of parameter savings. The hybrid TN-VQC architecture provides a natural way to compress the input dimension efficiently, enabling further quantum RL applications on noisy intermediate-scale quantum devices.
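To make the qubit saving behind amplitude encoding concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the standard definition of amplitude encoding, in which a real vector of length N is zero-padded to the next power of two and normalized so that its entries become the amplitudes of a state on ceil(log2 N) qubits. The function name `amplitude_encode` and the sample observation values are hypothetical; the only fact taken from the environment is that a Cart-Pole observation has four components.

```python
import numpy as np


def amplitude_encode(x):
    """Return (state, n_qubits) for a real input vector x.

    Assumption: standard amplitude encoding, i.e. zero-pad x to length
    2**n_qubits and rescale it to unit L2 norm so that the entries can
    serve as the amplitudes of an n_qubits-qubit state.
    """
    x = np.asarray(x, dtype=float).ravel()
    # Smallest qubit count whose state space can hold every component.
    n_qubits = max(1, int(np.ceil(np.log2(x.size))))
    padded = np.zeros(2 ** n_qubits)
    padded[: x.size] = x
    norm = np.linalg.norm(padded)
    if norm == 0.0:
        raise ValueError("cannot amplitude-encode the zero vector")
    return padded / norm, n_qubits


# Hypothetical Cart-Pole observation:
# (cart position, cart velocity, pole angle, pole angular velocity).
obs = [0.02, -0.15, 0.03, 0.20]
state, n_qubits = amplitude_encode(obs)
print(n_qubits)            # number of qubits needed for 4 components
print(np.sum(state ** 2))  # squared amplitudes sum to 1 up to rounding
```

Under this scheme the four Cart-Pole components fit on two qubits, whereas an encoding that assigns one qubit per component would need four. Applied directly to a 147-dimensional MiniGrid input, the same function would pad to 256 amplitudes on eight qubits; the second framework in the abstract takes a different route and compresses such inputs with a tensor network before the variational circuit.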