Moving object segmentation is a crucial task for autonomous vehicles, as it can segment objects in a class-agnostic manner based on their motion cues. It enables the generic detection of objects unseen during training (e.g., a moose or a construction truck) based on their motion. Although pixel-wise motion segmentation has been studied in the literature, it has not been addressed at the instance level, which would help separate connected segments of moving objects, leading to better trajectory planning. In this paper, we propose a motion-based instance segmentation task and create a new annotated dataset based on KITTI, which will be released publicly. We make use of the YOLACT model to solve the instance motion segmentation task by feeding in optical flow and image as input and producing instance motion masks as output. We extend it to a multi-task model that learns semantic and motion instance segmentation in a computationally efficient manner. Our model is based on sharing a prototype generation network between the two tasks and learning separate prototype coefficients per task. To obtain real-time performance, we study different efficient encoders and achieve 39 fps on a Titan Xp GPU using MobileNetV2, with an improvement of 10% mAP relative to the baseline. A video demonstration of our work is available at https://youtu.be/CWGZibugD9g.
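To make the shared-prototype idea concrete, the following is a minimal sketch (not the authors' code) of a YOLACT-style multi-task head: a shared prototype-generation network ("protonet") produces one set of prototype masks, and two separate heads each learn their own prototype coefficients, one for semantic instances and one for motion instances. The layer sizes, the 5-channel image-plus-flow input, and the anchor count are illustrative assumptions, not values from the paper.

```python
import torch
import torch.nn as nn

class SharedProtoMultiTask(nn.Module):
    """Toy YOLACT-style model: one shared protonet, two coefficient heads."""

    def __init__(self, num_prototypes: int = 32, num_anchors: int = 100):
        super().__init__()
        # Toy encoder standing in for MobileNetV2; input is RGB (3) + flow (2).
        self.encoder = nn.Sequential(
            nn.Conv2d(5, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Shared protonet: one set of prototype masks reused by both tasks.
        self.protonet = nn.Sequential(
            nn.Conv2d(128, 128, 3, padding=1), nn.ReLU(),
            nn.Conv2d(128, num_prototypes, 1),
        )
        # Separate coefficient heads: each predicts per-instance mixing
        # weights over the shared prototypes (pooled features -> coeffs).
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.semantic_coeffs = nn.Linear(128, num_anchors * num_prototypes)
        self.motion_coeffs = nn.Linear(128, num_anchors * num_prototypes)
        self.num_anchors = num_anchors

    def forward(self, image_and_flow: torch.Tensor):
        feats = self.encoder(image_and_flow)      # (B, 128, H/4, W/4)
        protos = self.protonet(feats)             # (B, P, H/4, W/4)
        pooled = self.pool(feats).flatten(1)      # (B, 128)
        b, p, _, _ = protos.shape

        def assemble(head: nn.Linear) -> torch.Tensor:
            coeffs = head(pooled).view(b, self.num_anchors, p)
            # Per-instance linear combination of shared prototypes.
            masks = torch.einsum("bap,bphw->bahw", coeffs, protos)
            return torch.sigmoid(masks)

        return assemble(self.semantic_coeffs), assemble(self.motion_coeffs)

# Usage: one 5-channel frame (RGB stacked with 2-channel optical flow).
model = SharedProtoMultiTask()
sem_masks, motion_masks = model(torch.randn(1, 5, 256, 512))
print(sem_masks.shape, motion_masks.shape)  # (1, 100, 64, 128) each
```

The design choice the sketch illustrates is that the expensive dense branch (the protonet) is computed once and shared, while each task adds only a lightweight coefficient head, which is what keeps the multi-task extension computationally efficient.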