By transferring knowledge from large, diverse, task-agnostic datasets, modern machine learning models can solve specific downstream tasks either zero-shot or with small task-specific datasets to a high level of performance. While this capability has been demonstrated in other fields such as computer vision, natural language processing or speech recognition, it remains to be shown in robotics, where the generalization capabilities of the models are particularly critical due to the difficulty of collecting real-world robotic data. We argue that one of the keys to the success of such general robotic models lies with open-ended task-agnostic training, combined with high-capacity architectures that can absorb all of the diverse robotic data. In this paper, we present a model class, dubbed Robotics Transformer, that exhibits promising scalable model properties. We verify our conclusions in a study of different model classes and their ability to generalize as a function of the data size, model size, and data diversity based on a large-scale data collection on real robots performing real-world tasks. The project's website and videos can be found at robotics-transformer.github.io.
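To make the abstract's idea of a high-capacity, Transformer-based robot policy concrete, here is a minimal sketch of what such a model might look like: image and language inputs are mapped to tokens, a Transformer fuses them, and actions are predicted as discrete tokens. This is an illustrative assumption, not the paper's actual architecture; all names, dimensions, the tokenizers, and the action discretization below are hypothetical choices made for the example.

```python
# Illustrative sketch only: a Transformer policy that maps image + language
# tokens to discretized robot actions. Dimensions and tokenizers are assumed,
# not taken from the paper.
import torch
import torch.nn as nn


class TransformerRobotPolicy(nn.Module):
    def __init__(self, token_dim=512, num_action_dims=11, num_action_bins=256):
        super().__init__()
        # Placeholder projections: a real system would use a vision backbone
        # and a pretrained language encoder (both assumed here).
        self.image_proj = nn.Linear(2048, token_dim)    # per-token image features
        self.language_proj = nn.Linear(768, token_dim)  # instruction embedding
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=token_dim, nhead=8, batch_first=True)
        self.transformer = nn.TransformerEncoder(encoder_layer, num_layers=8)
        # One discrete distribution (a set of bins) per action dimension.
        self.action_head = nn.Linear(token_dim, num_action_dims * num_action_bins)
        self.num_action_dims = num_action_dims
        self.num_action_bins = num_action_bins

    def forward(self, image_tokens, language_embedding):
        # image_tokens: (batch, num_image_tokens, 2048)
        # language_embedding: (batch, 768)
        tokens = torch.cat(
            [self.language_proj(language_embedding).unsqueeze(1),
             self.image_proj(image_tokens)], dim=1)
        fused = self.transformer(tokens)
        # Predict binned actions from the pooled sequence representation.
        logits = self.action_head(fused.mean(dim=1))
        return logits.view(-1, self.num_action_dims, self.num_action_bins)


if __name__ == "__main__":
    policy = TransformerRobotPolicy()
    images = torch.randn(1, 48, 2048)       # 48 assumed image tokens
    instruction = torch.randn(1, 768)        # assumed instruction embedding
    action_logits = policy(images, instruction)   # (1, 11, 256)
    action_bins = action_logits.argmax(dim=-1)    # one bin per action dimension
    print(action_bins.shape)
```

Discretizing each action dimension into bins turns control into a token-prediction problem, which is one common way to let a Transformer "absorb" heterogeneous robot data; the specific binning used here is an assumption for the sketch.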