Instance segmentation has recently gained significant attention in a wide range of computer vision applications. It aims to assign a distinct ID to each object in the scene, even when objects belong to the same class. Instance segmentation is usually performed as a two-stage pipeline: first an object is detected, then semantic segmentation is performed within the detected box area, which involves costly up-sampling. In this paper, we propose Insta-YOLO, a novel one-stage end-to-end deep learning model for real-time instance segmentation. Instead of pixel-wise prediction, our model predicts instances as object contours represented by 2D points in Cartesian space. We evaluate our model on three datasets, namely Carvana, Cityscapes and Airbus, and compare our results to state-of-the-art instance segmentation models. The results show that our model achieves competitive accuracy in terms of mAP at twice the speed on a GTX-1080 GPU.
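To make the contour-based representation concrete, the following is a minimal sketch of how a binary instance mask could be converted into a fixed-length polygon of 2D Cartesian points, i.e. the kind of per-instance regression target a contour-based detector would use instead of a pixel-wise mask. The point count `n_points`, the function name, and the resampling scheme are illustrative assumptions, not the exact recipe used by Insta-YOLO.

```python
# Hedged sketch: binary mask -> fixed-length contour of (x, y) points.
# n_points and mask_to_contour_points are assumed names for illustration.
import cv2
import numpy as np


def mask_to_contour_points(mask: np.ndarray, n_points: int = 24) -> np.ndarray:
    """Return an (n_points, 2) array of (x, y) vertices tracing one instance."""
    contours, _ = cv2.findContours(
        mask.astype(np.uint8), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE
    )
    # Keep the largest outer contour of this instance mask.
    contour = max(contours, key=cv2.contourArea).squeeze(1)  # shape (M, 2)

    # Resample the closed contour uniformly by arc length to n_points vertices.
    closed = np.vstack([contour, contour[:1]])
    seg_len = np.linalg.norm(np.diff(closed, axis=0), axis=1)
    cum = np.concatenate([[0.0], np.cumsum(seg_len)])
    targets = np.linspace(0.0, cum[-1], n_points, endpoint=False)
    resampled = np.stack(
        [np.interp(targets, cum, closed[:, 0]),
         np.interp(targets, cum, closed[:, 1])],
        axis=1,
    )
    return resampled.astype(np.float32)


if __name__ == "__main__":
    # Toy example: a filled circle standing in for a single object instance.
    mask = np.zeros((128, 128), dtype=np.uint8)
    cv2.circle(mask, (64, 64), 40, 1, -1)
    points = mask_to_contour_points(mask, n_points=24)
    print(points.shape)  # (24, 2): one flat regression target per instance
```

Because every instance is encoded as the same fixed number of 2D points, the network head can regress the whole contour directly alongside the box and class outputs, avoiding the per-box mask up-sampling of two-stage pipelines.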