Shape and pose estimation is a critical perception problem that a self-driving car must solve to fully understand its surrounding environment. One fundamental challenge is that the sensor signal (e.g., LiDAR scans) is incomplete, especially for faraway or occluded objects. In this paper, we propose a novel algorithm that addresses this challenge by explicitly leveraging sensor signals captured over consecutive time steps: consecutive signals provide more information about an object, including different viewpoints and its motion. By encoding the consecutive signals with a recurrent neural network, our algorithm not only improves the shape and pose estimates but also yields a labeling tool that can benefit other tasks in autonomous driving research. Specifically, building upon our algorithm, we propose a novel pipeline that automatically produces high-quality labels for amodal segmentation on images, which are hard and laborious to annotate manually. Our code and data will be made publicly available.
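To illustrate what encoding consecutive signals with a recurrent network means, the following is a minimal sketch and not the authors' model. The abstract does not specify the architecture, so this assumes a plain Elman-style recurrence, h_t = tanh(W_in x_t + W_rec h_{t-1} + b), applied to hypothetical per-frame feature vectors. The names `rnn_encode`, `w_in`, `w_rec`, and `bias`, as well as all numeric values, are illustrative assumptions.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def rnn_encode(frames: List[Vector], w_in: Matrix, w_rec: Matrix, bias: Vector) -> Vector:
    """Fold a sequence of per-frame feature vectors into one hidden state.

    Assumed recurrence (Elman-style, not taken from the paper):
    h_t = tanh(W_in x_t + W_rec h_{t-1} + b), starting from h_0 = 0.
    """
    hidden = [0.0] * len(bias)
    for x in frames:
        from_input = matvec(w_in, x)
        from_state = matvec(w_rec, hidden)
        hidden = [
            math.tanh(a + r + b)
            for a, r, b in zip(from_input, from_state, bias)
        ]
    return hidden


# Hypothetical toy input: three consecutive frames, each a 2-D feature vector
# (in practice these would be features extracted from each LiDAR scan).
frames = [[0.1, 0.0], [0.2, 0.1], [0.4, 0.3]]
w_in = [[0.5, -0.2], [0.1, 0.3], [0.0, 0.4]]            # 3 hidden units x 2 inputs
w_rec = [[0.1, 0.0, 0.0], [0.0, 0.1, 0.0], [0.0, 0.0, 0.1]]
bias = [0.0, 0.0, 0.0]

code = rnn_encode(frames, w_in, w_rec, bias)  # one vector summarizing all frames
```

The final hidden state depends on every frame in order, which is the property the abstract relies on: later observations of the same object can refine what earlier, sparser observations left uncertain. A shape or pose head would then read from this vector; that part is omitted here because the abstract gives no details about it.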