Deep Reinforcement Learning (RL) is a powerful framework for solving complex real-world problems. Large neural networks employed in the framework are traditionally associated with better generalization capabilities, but their increased size entails the drawbacks of extensive training times, substantial hardware requirements, and longer inference times. One way to tackle this problem is to prune neural networks, leaving only the necessary parameters. State-of-the-art concurrent pruning techniques for imposing sparsity perform demonstrably well in applications where data distributions are fixed. However, they have not yet been substantially explored in the context of RL. We close the gap between RL and single-shot pruning techniques and present a general pruning approach for Offline RL: we leverage a fixed dataset to prune neural networks before the start of RL training. We then run experiments that vary the network sparsity level and evaluate the validity of pruning-at-initialization techniques in continuous control tasks. Our results show that with 95% of the network weights pruned, Offline-RL algorithms can still retain performance in the majority of our experiments. To the best of our knowledge, no prior work utilizing pruning in RL has retained performance at such high levels of sparsity. Moreover, pruning-at-initialization techniques can be easily integrated into any existing Offline-RL algorithm without changing its learning objective.
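As a minimal sketch of what pruning at initialization on a fixed dataset can look like, the snippet below computes a SNIP-style saliency score (|weight × gradient|) on a single batch drawn from the offline dataset and keeps only the highest-scoring weights before RL training begins. The choice of SNIP-style scoring, the PyTorch framework, the `snip_prune_at_init` name, the `loss_fn`/`batch` interface, and the 95% sparsity default are illustrative assumptions, not the paper's exact procedure.

```python
import torch

def snip_prune_at_init(model, loss_fn, batch, sparsity=0.95):
    """Prune `model` at initialization using a SNIP-style saliency
    |w * dL/dw| computed on one fixed batch of offline data."""
    model.zero_grad()
    loss = loss_fn(model, batch)   # e.g. the algorithm's critic/actor loss on the batch
    loss.backward()

    scores, params = [], []
    for p in model.parameters():
        if p.dim() > 1:            # prune weight matrices only, leave biases dense
            scores.append((p.grad * p).abs().flatten())
            params.append(p)

    all_scores = torch.cat(scores)
    k = max(1, int((1.0 - sparsity) * all_scores.numel()))
    threshold = torch.topk(all_scores, k).values.min()

    masks = []
    for p in params:
        mask = ((p.grad * p).abs() >= threshold).float()
        p.data.mul_(mask)          # zero out pruned weights
        masks.append(mask)
    return masks                   # re-apply these masks after each optimizer step to keep sparsity
```

Because the masks are fixed before training, this step slots in front of any existing Offline-RL training loop without modifying its learning objective, consistent with the claim above.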