Self-supervised learning and data augmentation have significantly reduced the performance gap between state-based and image-based reinforcement learning agents in continuous control tasks. However, it is still unclear whether current techniques can handle the variety of visual conditions encountered in real-world environments. We propose a challenging benchmark that tests agents' visual generalization by adding graphical variety to existing continuous control domains. Our empirical analysis shows that current methods struggle to generalize across a diverse set of visual changes, and we examine the specific factors of variation that make these tasks difficult. We find that data augmentation techniques outperform self-supervised learning approaches and that more significant image transformations provide better visual generalization.\footnote{The benchmark and our augmented actor-critic implementation are open-sourced at https://github.com/QData/dmc_remastered}