Robustly classifying ground infrastructure such as roads and street crossings is an essential task for mobile robots operating alongside pedestrians. While many semantic segmentation datasets are available for autonomous vehicles, models trained on such datasets exhibit a large domain gap when deployed on robots operating in pedestrian spaces. Manually annotating images recorded from pedestrian viewpoints is both expensive and time-consuming. To overcome this challenge, we propose TrackletMapper, a framework for annotating ground surface types such as sidewalks, roads, and street crossings from object tracklets without requiring human-annotated data. To this end, we project the robot ego-trajectory and the paths of other traffic participants into the ego-view camera images, creating sparse semantic annotations for multiple types of ground surfaces from which a ground segmentation model can be trained. We further show that the model can be self-distilled for additional performance benefits by aggregating a ground surface map and projecting it into the camera images, creating a denser set of training annotations compared to the sparse tracklet annotations. We qualitatively and quantitatively attest our findings on a novel large-scale dataset for mobile robots operating in pedestrian areas. Code and dataset will be made available at http://trackletmapper.cs.uni-freiburg.de.
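To illustrate the annotation step the abstract describes, the sketch below shows how tracklet points could be projected into an ego-view camera image to yield sparse ground-surface labels. This is a minimal example under standard pinhole-camera assumptions, not the authors' implementation; the function and parameter names (e.g. `project_tracklet_to_labels`, `T_cam_world`, the class ids) are hypothetical.

```python
# Minimal sketch (not the authors' code): projecting 3D tracklet points into an
# ego-view image to obtain sparse ground-surface labels, assuming a pinhole
# camera with known intrinsics K and a world-to-camera extrinsic T_cam_world.
import numpy as np

def project_tracklet_to_labels(points_world, T_cam_world, K, image_shape,
                               class_id, label_map=None):
    """Rasterize 3D tracklet points into a sparse semantic label map.

    points_world : (N, 3) ground-plane points of a trajectory in the world frame
    T_cam_world  : (4, 4) homogeneous world-to-camera transform
    K            : (3, 3) camera intrinsic matrix
    image_shape  : (H, W) of the ego-view image
    class_id     : integer class label (e.g. 1 = sidewalk, 2 = road, 3 = crossing)
    label_map    : optional existing (H, W) label map to accumulate into
    """
    H, W = image_shape
    if label_map is None:
        label_map = np.zeros((H, W), dtype=np.uint8)  # 0 = unlabeled

    # Transform world points into the camera frame.
    pts_h = np.hstack([points_world, np.ones((len(points_world), 1))])  # (N, 4)
    pts_cam = (T_cam_world @ pts_h.T).T[:, :3]                          # (N, 3)

    # Keep only points in front of the camera.
    pts_cam = pts_cam[pts_cam[:, 2] > 0.1]

    # Pinhole projection to pixel coordinates.
    uv = (K @ pts_cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]
    u = np.round(uv[:, 0]).astype(int)
    v = np.round(uv[:, 1]).astype(int)

    # Discard projections outside the image and write sparse labels.
    valid = (u >= 0) & (u < W) & (v >= 0) & (v < H)
    label_map[v[valid], u[valid]] = class_id
    return label_map
```

In this sketch, the robot ego-trajectory would be projected with the sidewalk class and the tracked paths of cars or pedestrians with their respective surface classes, accumulating all projections into a single sparse label map per image that a ground segmentation model can then be trained on.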