Deep reinforcement learning has achieved great success in laser-based collision avoidance because a laser senses accurate depth information with little redundant data, which preserves the robustness of the learned policy when it is transferred from simulation to the real world. However, high-cost laser devices are not only difficult to deploy at scale but also poorly robust to irregular objects, e.g., tables, chairs, and shelves, whose shape at the scanning height does not reflect their true extent. In this paper, we propose a vision-based collision avoidance framework to address this challenge. Our method estimates depth and incorporates semantic information from RGB data to obtain a new form of data, pseudo-laser data, which combines the advantages of visual and laser information. Compared to traditional laser data, which contains only one-dimensional distance information captured at a fixed height, our pseudo-laser data encodes both the depth and the semantic information within the image, making our method more effective against irregular obstacles. In addition, because the estimated depth is not accurate, we adaptively add noise to the laser data during training to increase the robustness of our model in the real world. Experimental results show that our framework achieves state-of-the-art performance in several unseen virtual and real-world scenarios.
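To make the idea concrete, the conversion from a depth map to a pseudo-laser scan, together with the adaptive noise injection used during training, could be sketched as follows. This is a minimal illustration under assumed details (column-wise minimum pooling over a vertical image band, and zero-mean Gaussian noise whose magnitude grows with range); the fusion of semantic information is omitted here, and the band and noise parameters are hypothetical, not the paper's actual settings.

```python
import numpy as np

def depth_to_pseudo_laser(depth, band=(0.4, 0.6)):
    """Collapse an H x W depth map into a 1-D pseudo-laser scan.

    For each image column, keep the minimum depth inside a vertical
    band of the image, so an obstacle at any height within the band
    (e.g., a tabletop) shows up in the scan, unlike a laser that
    samples a single fixed height.
    """
    h = depth.shape[0]
    top, bottom = int(band[0] * h), int(band[1] * h)
    return depth[top:bottom].min(axis=0)

def add_adaptive_noise(scan, rel_sigma=0.05, rng=None):
    """Perturb the scan with zero-mean Gaussian noise whose standard
    deviation is proportional to range, mimicking the error profile
    of estimated (rather than measured) depth."""
    rng = np.random.default_rng() if rng is None else rng
    return scan + rng.normal(0.0, rel_sigma * scan)
```

During training, each simulated scan would be passed through `add_adaptive_noise` before being fed to the policy, so the model never overfits to perfectly clean ranges.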