Amodal segmentation in biological vision refers to the perception of an entire object when only a fraction of it is visible. This ability to see through occluders and reason about occlusion is innate to biological vision but is not adequately modeled in current machine vision approaches. A key challenge is that ground-truth supervision for amodal object segmentation is inherently difficult to obtain. In this paper, we present a neural network architecture that is capable of amodal perception when weakly supervised with standard (modal) bounding box annotations. Our model extends compositional convolutional neural networks (CompositionalNets), which have been shown to be robust to partial occlusion by explicitly representing objects as compositions of parts. In particular, we extend CompositionalNets to perform three new vision tasks from bounding box supervision only: 1) learning compositional shape priors of objects in varying 3D poses from modal bounding box supervision; 2) predicting instance segmentation by integrating the compositional shape priors into the part-voting mechanism of CompositionalNets; 3) predicting amodal completion for both the bounding box and the instance segmentation mask by implementing compositional feature alignment in CompositionalNets. Our extensive experiments show that the proposed model segments amodal masks robustly, with substantially better mask prediction quality than state-of-the-art segmentation approaches.