We investigate policy transfer using image-to-semantics translation to mitigate learning difficulties in vision-based robotic control agents. This problem assumes two environments: a simulator environment whose state space consists of semantics, that is, low-dimensional and essential information, and a real-world environment whose state space consists of images. By learning a mapping from images to semantics, we can transfer a policy pre-trained in the simulator to the real world, thereby eliminating the costly and risky on-policy agent interactions in the real world that would otherwise be required for learning. In addition, image-to-semantics mapping is advantageous over other types of sim-to-real transfer strategies in terms of the computational efficiency of policy training and the interpretability of the obtained policy. To tackle the main difficulty in learning image-to-semantics mapping, namely the human annotation cost of producing a training dataset, we propose two techniques: pair augmentation using the transition function of the simulator environment, and active learning. We observed a reduction in the annotation cost without a decline in transfer performance, and the proposed approach outperformed the existing annotation-free approach.
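The transfer scheme described above can be sketched as a simple composition: a policy trained on semantic states in the simulator is deployed in the real world by first translating each image into its semantic state. The sketch below is a minimal illustration under assumed dimensions and stand-in linear models; it is not the paper's actual architecture or training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions, not from the paper).
SEM_DIM, IMG_DIM, ACT_DIM = 4, 16, 2

# Stand-in for a policy pre-trained in the simulator on semantic states.
W_policy = rng.normal(size=(ACT_DIM, SEM_DIM))

def policy(semantics):
    """Simulator-trained policy: semantic state -> action."""
    return W_policy @ semantics

# Stand-in for the learned image-to-semantics translation model.
W_map = rng.normal(size=(SEM_DIM, IMG_DIM))

def image_to_semantics(image):
    """Translation model: real-world image -> semantic state."""
    return W_map @ image

def transferred_policy(image):
    """Deployed real-world policy: compose the mapping with the
    simulator-trained policy, so no real-world policy learning is needed."""
    return policy(image_to_semantics(image))

image = rng.normal(size=IMG_DIM)   # stand-in for a real-world observation
action = transferred_policy(image)
print(action.shape)  # (2,)
```

The key property is that only the translation model touches real-world data; the policy itself is reused unchanged, which is what removes the need for real-world on-policy interaction.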