Learning data representations that are useful for various downstream tasks is a cornerstone of artificial intelligence. While existing methods are typically evaluated on downstream tasks such as classification or generative image quality, we propose to assess representations through their usefulness in downstream control tasks, such as reaching or pushing objects. By training over 10,000 reinforcement learning policies, we extensively evaluate to what extent different representation properties affect out-of-distribution (OOD) generalization. Finally, we demonstrate zero-shot transfer of these policies from simulation to the real world, without any domain randomization or fine-tuning. This paper aims to establish the first systematic characterization of the usefulness of learned representations for real-world OOD downstream tasks.