In this paper, we propose a Visual Teach and Repeat (VTR) algorithm using semantic landmarks extracted from environmental objects for ground robots with fixed-mount monocular cameras. The proposed algorithm is robust to changes in the starting pose of the camera/robot, where a pose is defined as the planar position plus the orientation around the vertical axis. VTR consists of a teach phase, in which a robot moves along a prescribed path, and a repeat phase, in which the robot tries to repeat the same path starting from the same or a different pose. Most available VTR algorithms are pose-dependent and perform poorly in the repeat phase when starting from an initial pose far from that of the teach phase. To achieve more robust pose independence, during the teach phase we collect the camera poses and the 3D point clouds of the environment using ORB-SLAM. We also detect objects in the environment using YOLOv3. We then combine the two outputs to build a 3D semantic map of the environment consisting of the 3D positions of the objects and the robot path. In the repeat phase, we relocalize the robot based on the detected objects and the stored semantic map. The robot is then able to move toward the teach path and repeat it in both forward and backward directions. The results show that our algorithm is highly robust with respect to pose variations as well as environmental alterations. Our code and data are available at the following GitHub page: https://github.com/mmahdavian/semantic_visual_teach_repeat
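One way the abstract's fusion step could work is to project the ORB-SLAM map points into the image and average those that fall inside a YOLOv3 detection box, yielding an approximate 3D object position for the semantic map. The sketch below is a minimal illustration of that idea, not the paper's actual implementation; the function names, the pinhole projection, and the simple centroid estimate are all assumptions.

```python
import numpy as np

def project(points_3d, K):
    """Project 3D points (camera frame) to pixel coordinates via a pinhole model.

    points_3d: (N, 3) array with positive depth (z > 0).
    K: 3x3 camera intrinsics matrix.
    """
    uv = (K @ points_3d.T).T        # (N, 3) homogeneous pixel coordinates
    return uv[:, :2] / uv[:, 2:3]   # divide by depth to get (u, v)

def object_position(points_3d, K, box):
    """Estimate an object's 3D position from map points inside its 2D box.

    box: (u_min, v_min, u_max, v_max) detection bounding box in pixels.
    Returns the centroid of the map points projecting into the box,
    or None if no point falls inside (hypothetical, simplistic rule).
    """
    uv = project(points_3d, K)
    u_min, v_min, u_max, v_max = box
    inside = ((uv[:, 0] >= u_min) & (uv[:, 0] <= u_max) &
              (uv[:, 1] >= v_min) & (uv[:, 1] <= v_max))
    if not inside.any():
        return None
    return points_3d[inside].mean(axis=0)
```

A real system would additionally filter outliers (e.g. points from the background that project into the box) and transform positions from the camera frame to the world frame using the ORB-SLAM pose, but the core association step is as simple as the membership test above.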