As the third generation of neural networks, Spiking Neural Networks (SNNs) have great potential on neuromorphic hardware because of their high energy efficiency. However, Deep Spiking Reinforcement Learning (DSRL), i.e., Reinforcement Learning (RL) based on SNNs, is still in its preliminary stage due to the binary output and non-differentiability of the spiking function. To address these issues, we propose a Deep Spiking Q-Network (DSQN) in this paper. Specifically, we propose a directly-trained deep spiking reinforcement learning architecture based on Leaky Integrate-and-Fire (LIF) neurons and the Deep Q-Network (DQN). We then adapt a direct spiking learning algorithm to the Deep Spiking Q-Network, and further demonstrate the advantages of using LIF neurons in DSQN theoretically. Comprehensive experiments have been conducted on 17 top-performing Atari games to compare our method with the state-of-the-art conversion method. The experimental results demonstrate the superiority of our method in terms of performance, stability, robustness and energy efficiency. To the best of our knowledge, ours is the first work to achieve state-of-the-art performance on multiple Atari games with a directly-trained SNN.
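To make the binary-output behavior mentioned above concrete, the following is a minimal sketch of the standard discrete-time LIF neuron dynamics that architectures like this one build on. It is an illustrative simulation only, not the paper's implementation; the hyperparameter values (`tau`, `v_threshold`, `v_reset`) are placeholder choices.

```python
import numpy as np

def lif_forward(inputs, tau=2.0, v_threshold=1.0, v_reset=0.0):
    """Simulate a layer of Leaky Integrate-and-Fire (LIF) neurons.

    inputs: array of shape (T, N) -- input current per timestep per neuron.
    Returns a binary spike train of shape (T, N).
    Hyperparameters here are illustrative, not the paper's settings.
    """
    T, N = inputs.shape
    v = np.full(N, v_reset, dtype=float)       # membrane potentials
    spikes = np.zeros((T, N))
    for t in range(T):
        # Leaky integration: potential decays toward v_reset, driven by input.
        v = v + (inputs[t] - (v - v_reset)) / tau
        s = (v >= v_threshold).astype(float)   # binary spike on threshold crossing
        spikes[t] = s
        v = np.where(s > 0, v_reset, v)        # hard reset after firing
    return spikes
```

The threshold step function `v >= v_threshold` is the non-differentiable point the abstract refers to; direct-training approaches typically replace its derivative with a smooth surrogate during backpropagation so gradients can flow through the spike.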