Visual reinforcement learning (RL), which makes decisions directly from high-dimensional visual inputs, has demonstrated significant potential in various domains. However, deploying visual RL techniques in the real world remains challenging because of their low sample efficiency and large generalization gaps. To tackle these obstacles, data augmentation (DA) has become a widely used technique in visual RL: by diversifying the training data, it helps agents acquire sample-efficient and generalizable policies. Recognizing the rapid development of this field, this survey provides a timely review of DA techniques in visual RL. In particular, we propose a unified framework for analyzing visual RL and understanding the role DA plays in it. We then present a principled taxonomy of existing augmentation techniques used in visual RL and discuss in depth how to better leverage augmented data in different scenarios. Moreover, we report a systematic empirical evaluation of DA-based techniques in visual RL and conclude by highlighting directions for future research. As the first comprehensive survey of DA in visual RL, this work is expected to offer valuable guidance to this emerging field.