We propose an end-to-end trainable architecture for simultaneous semantic and instance segmentation (a.k.a. panoptic segmentation) consisting of a convolutional neural network and an asymmetric multiway cut problem solver. The latter solves a combinatorial optimization problem that elegantly incorporates semantic and boundary predictions to produce a panoptic labeling. Our formulation allows us to directly maximize a smooth surrogate of the panoptic quality metric by backpropagating the gradient through the optimization problem. Experimental evaluation shows that end-to-end learning improves over comparable approaches on the Cityscapes and COCO datasets. Overall, our approach shows the utility of using combinatorial optimization in tandem with deep learning on a challenging large-scale real-world problem, and highlights the benefits of, and insights into, training such an architecture end-to-end.
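For reference, the panoptic quality (PQ) metric mentioned above can be sketched as follows. This is a minimal illustration of the standard, non-differentiable PQ for a single class, not of the smooth surrogate that the abstract refers to. The function names `iou` and `panoptic_quality`, and the representation of segments as sets of pixel indices, are assumptions made for this example only.

```python
def iou(a, b):
    """Intersection over union of two segments given as sets of pixel indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def panoptic_quality(pred_segments, gt_segments):
    """Standard PQ for one class (hypothetical helper, not the paper's surrogate).

    A predicted and a ground-truth segment are matched when their IoU exceeds
    0.5. PQ = (sum of IoUs over matches) / (TP + 0.5 * FP + 0.5 * FN).
    Assumes segments within each list do not overlap one another.
    """
    matched_gt = set()
    iou_sum = 0.0
    tp = 0
    for pred in pred_segments:
        for j, gt in enumerate(gt_segments):
            if j in matched_gt:
                continue
            score = iou(pred, gt)
            if score > 0.5:  # hard threshold: the non-smooth step
                matched_gt.add(j)
                iou_sum += score
                tp += 1
                break
    fp = len(pred_segments) - tp  # predictions with no match
    fn = len(gt_segments) - tp    # ground-truth segments with no match
    denom = tp + 0.5 * fp + 0.5 * fn
    return iou_sum / denom if denom else 0.0


# Toy example: one good match (IoU 3/4), one false positive, one false negative.
gt = [{0, 1, 2, 3}, {4, 5}]
pred = [{0, 1, 2}, {6, 7}]
pq = panoptic_quality(pred, gt)  # 0.75 / (1 + 0.5 + 0.5) = 0.375
```

The hard IoU threshold and the discrete counts make this quantity piecewise constant in the network outputs, which is presumably why a smooth surrogate is needed for gradient-based training.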