In this paper, we focus on improving binary 2D instance segmentation to assist humans in labeling ground-truth datasets with polygons. Human labelers only have to draw boxes around objects, and the polygons are generated automatically. To be useful, our system has to run on CPUs in real time. The most common approach to binary instance segmentation uses encoder-decoder networks. We evaluate state-of-the-art encoder-decoder networks and propose a method for improving instance segmentation quality with these networks. Alongside network architecture improvements, our method provides extra information at the network input: so-called extreme points, i.e., the outermost points on the object silhouette. The user can label them instead of a bounding box almost as quickly, and the bounding box can still be deduced from the extreme points. Our method achieves better IoU than other state-of-the-art encoder-decoder networks and runs fast enough when deployed on a CPU.
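As an illustration of two ideas mentioned above, the following minimal Python sketch shows how a bounding box can be deduced from labeled extreme points and how IoU can be computed for a pair of binary masks. It is not the authors' implementation: the function names `bbox_from_extreme_points` and `mask_iou`, the `(x, y)` point layout, and the nested-list mask representation are assumptions made for this example.

```python
from typing import Iterable, Sequence, Tuple

Point = Tuple[float, float]  # assumed layout: (x, y) in image coordinates


def bbox_from_extreme_points(points: Iterable[Point]) -> Tuple[float, float, float, float]:
    """Return (x_min, y_min, x_max, y_max) enclosing the given extreme points.

    The extreme points are the outermost points of the object silhouette,
    so the tightest axis-aligned box around them is the object's bounding box.
    """
    pts = list(points)
    if not pts:
        raise ValueError("at least one extreme point is required")
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    return (min(xs), min(ys), max(xs), max(ys))


def mask_iou(pred: Sequence[Sequence[int]], target: Sequence[Sequence[int]]) -> float:
    """Intersection over union of two binary masks of equal shape.

    Masks are nested sequences of 0/1 values (hypothetical representation).
    Two empty masks are treated as a perfect match (IoU = 1.0) by convention.
    """
    intersection = 0
    union = 0
    for row_pred, row_target in zip(pred, target):
        for p, t in zip(row_pred, row_target):
            intersection += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    return intersection / union if union else 1.0


# Example: leftmost, topmost, rightmost, and bottommost clicks on an object.
extreme_points = [(12, 30), (40, 5), (55, 28), (33, 60)]
box = bbox_from_extreme_points(extreme_points)

# Example: masks that overlap in one pixel out of three covered pixels.
iou = mask_iou([[1, 1], [0, 0]], [[1, 0], [1, 0]])
```

Taking the coordinate-wise minimum and maximum works for any point order, so the labeler does not need to click the extreme points in a fixed sequence.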