The dynamic sparse training (DST) literature demonstrates that a highly sparse neural network, trained from scratch, can match the performance of its dense counterpart in supervised and unsupervised learning while substantially reducing computational and memory costs. In this paper, we show for the first time that deep reinforcement learning can also benefit from dynamic sparse training. We demonstrate that DST can be leveraged to reduce the long training time required by deep reinforcement learning agents without sacrificing performance. To achieve this, we propose a DST algorithm that adapts to the online nature and instability of the deep reinforcement learning paradigm, and we integrate it with state-of-the-art deep reinforcement learning methods. Experimental results demonstrate that our dynamic sparse compact agents learn effectively and achieve higher performance than the original dense methods while reducing the parameter count and floating-point operations (FLOPs) by 50%. More impressively, our dynamic sparse agents learn faster: they reach the final performance of the dense agents after only 40-50% of the training steps required by the latter. We evaluate our approach on OpenAI Gym continuous control tasks.