Image segmentation requires both local boundary position information and global object context information. Fully convolutional networks, the recent state-of-the-art method, hit a performance bottleneck because a single end-to-end trained network must balance these two kinds of information simultaneously. To overcome this problem, we divide semantic image segmentation into temporal subtasks. First, we find a possible pixel position on some object boundary; then we trace the boundary in steps of limited length until the whole object is outlined. We present the first deep reinforcement learning approach to semantic image segmentation, called DeepOutline, which outperforms other algorithms on the COCO detection leaderboard in the medium and large size person categories on the COCO val2017 dataset. Meanwhile, it provides insight into tackling computer vision problems with reinforcement learning in a divide-and-conquer way.
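The two temporal subtasks described above (propose a boundary pixel, then trace the outline step by step) can be illustrated with a minimal sketch. The `policy` object, its `pick_start` and `next_move` methods, and the step limit are hypothetical stand-ins; the actual DeepOutline agent, action space, and network architecture are not shown here.

```python
import numpy as np

def outline_object(image, policy, max_steps=200):
    """Sketch of a two-stage outlining episode: pick a boundary pixel, then trace it."""
    pos = policy.pick_start(image)           # subtask 1: propose a pixel on an object boundary
    trace = [pos]
    for _ in range(max_steps):               # subtask 2: follow the boundary in bounded steps
        dy, dx = policy.next_move(image, pos, trace)   # small displacement, e.g. one of 8 directions
        pos = (pos[0] + dy, pos[1] + dx)
        trace.append(pos)
        if pos == trace[0]:                  # outline closed: the whole object is traced
            break
    mask = np.zeros(image.shape[:2], dtype=bool)
    for y, x in trace:                       # rasterize the traced boundary (region fill omitted)
        mask[y, x] = True
    return trace, mask
```

In this reading, each episode produces one object outline, so the reinforcement learning reward can be attached to the quality of the traced boundary rather than to a single dense per-pixel prediction.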