Joint object detection and semantic segmentation is useful in many fields, such as self-driving cars and unmanned surface vessels. Initial but important progress toward this goal has been made by simply sharing deep convolutional features between the two tasks. However, this simple scheme cannot fully exploit the fact that detection and segmentation are mutually beneficial. To overcome this drawback, we propose a framework called TripleNet, in which three kinds of supervision (detection-oriented supervision, class-aware segmentation supervision, and class-agnostic segmentation supervision) are imposed on each layer of the decoder network. Class-agnostic segmentation supervision provides an objectness prior for both semantic segmentation and object detection. In addition to the three types of supervision, two lightweight modules (an inner-connected module and attention skip-layer fusion) are incorporated into each layer of the decoder. In the proposed framework, detection and segmentation can effectively boost each other. Moreover, the class-agnostic and class-aware segmentation on each decoder layer is not performed at test time, so no extra computational cost is introduced at the test stage. Experimental results on the VOC2007 and VOC2012 datasets demonstrate that TripleNet improves both detection and segmentation accuracy without adding computational cost at test time.
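
As a rough illustration of the triple supervision described above, the sketch below shows one way the three training signals could be combined over the decoder layers. It is not the authors' implementation: the function names, the loss weights `w_aware` and `w_agnostic`, the dictionary keys, and the convention that label 0 is background and 255 is a void label are all assumptions made for this example. The class-agnostic target is assumed here to be obtained by collapsing the class-aware label map into object versus background.

```python
from typing import Dict, List, Sequence

BACKGROUND = 0  # assumed background label id
IGNORE = 255    # assumed void/ignore label id


def class_agnostic_mask(semantic: Sequence[Sequence[int]]) -> List[List[int]]:
    """Collapse a class-aware label map into object (1) vs. background (0).

    Pixels carrying the ignore label are passed through unchanged.
    """
    return [
        [IGNORE if v == IGNORE else int(v != BACKGROUND) for v in row]
        for row in semantic
    ]


def triple_supervision_loss(
    per_layer_losses: Sequence[Dict[str, float]],
    w_aware: float = 1.0,     # hypothetical weight for class-aware segmentation
    w_agnostic: float = 1.0,  # hypothetical weight for class-agnostic segmentation
) -> float:
    """Sum detection and both segmentation losses over every decoder layer."""
    total = 0.0
    for layer in per_layer_losses:
        total += (
            layer["det"]
            + w_aware * layer["seg_aware"]
            + w_agnostic * layer["seg_agnostic"]
        )
    return total
```

Under this reading, only the detection branch (plus whatever final segmentation output is needed) would be evaluated at test time, which is consistent with the statement that the per-layer segmentation heads add no test-time cost.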