The way features propagate in Fully Convolutional Networks is of critical importance for capturing multi-scale context and obtaining precise segmentation masks. This paper proposes a novel series-parallel hybrid paradigm called the Chained Context Aggregation Module (CAM) to diversify feature propagation. CAM obtains features at various spatial scales through chain-connected ladder-style information flows and fuses them in a two-stage process, namely pre-fusion and re-fusion. The serial flow continuously enlarges the receptive fields of output neurons, while the parallel flows encode different region-based contexts. Each information flow is a shallow encoder-decoder with an appropriate down-sampling scale to sufficiently capture contextual information. We further adopt an attention model in CAM to guide feature re-fusion. Based on these developments, we construct the Chained Context Aggregation Network (CANet), which employs an asymmetric decoder to recover precise spatial details in the prediction maps. We conduct extensive experiments on six challenging datasets, including Pascal VOC 2012, Pascal Context, Cityscapes, CamVid, SUN-RGBD and GATECH. Results show that CANet achieves state-of-the-art performance.
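To make the series-parallel topology concrete, the following is a minimal, purely illustrative sketch of the chained pre-fusion/re-fusion pattern described above. All names (`flow`, `cam`) and the scalar arithmetic are hypothetical stand-ins: the actual CAM operates on convolutional feature maps with shallow encoder-decoder flows and a learned attention model, none of which is reproduced here.

```python
def flow(x, scale):
    """Toy stand-in for one shallow encoder-decoder information flow.

    The real flow down-samples the input by `scale`, encodes context,
    and up-samples back; here it is an arbitrary placeholder transform.
    """
    return x * 0.5 + scale


def cam(x, scales):
    """Illustrative chained context aggregation over scalar 'features'.

    Serial path: each flow receives the backbone feature pre-fused
    (here, summed) with the previous flow's output.
    Parallel path: every flow's output is retained for re-fusion.
    """
    outputs = []
    prev = 0.0
    for s in scales:
        pre_fused = x + prev        # pre-fusion with the previous flow
        out = flow(pre_fused, s)
        outputs.append(out)         # parallel branch kept for re-fusion
        prev = out                  # serial chain continues
    # Re-fusion: a softmax-like weighted sum standing in for the
    # attention-guided fusion of the parallel outputs.
    total = sum(outputs)
    weights = [o / total for o in outputs]
    return sum(w * o for w, o in zip(weights, outputs))
```

With a single scale the weighted sum degenerates to that flow's output, e.g. `cam(1.0, [2])` returns `2.5`; with more scales, later flows see progressively context-enriched inputs through the chain.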