Stixels have been successfully applied to a wide range of vision tasks in autonomous driving, recently including instance segmentation. However, due to their sparse occurrence in the image, Stixels have until now seldom served as input for Deep Learning algorithms, restricting their utility for such approaches. In this work we present StixelPointNet, a novel method for fast instance segmentation directly on Stixels. By regarding the Stixel representation as unstructured data similar to point clouds, architectures like PointNet are able to learn features from Stixels. We use a bounding box detector to propose candidate instances, for which the relevant Stixels are extracted from the input image. On these Stixels, a PointNet model learns binary segmentations, which we then unify across the whole image in a final selection step. StixelPointNet achieves state-of-the-art performance at the Stixel level, is considerably faster than pixel-based segmentation methods, and shows that our approach can introduce the Stixel domain to many new 3D Deep Learning tasks.