Graph-based convolutional models such as the non-local block have been shown to be effective for strengthening the context modeling ability of convolutional neural networks (CNNs). However, their pixel-wise computational overhead is prohibitive, which makes them unsuitable for high-resolution imagery. In this paper, we explore the efficiency of context graph reasoning and propose a novel framework called Squeeze Reasoning. Instead of propagating information on the spatial map, we first learn to squeeze the input feature into a channel-wise global vector and then perform reasoning within that single vector, which significantly reduces the computation cost. Specifically, we build the node graph in the vector, where each node represents an abstract semantic concept. The refined features within the same semantic category become consistent, which benefits downstream tasks. We show that our approach can be modularized as an end-to-end trained block and easily plugged into existing networks. Despite its simplicity and light weight, the proposed strategy achieves considerable results on different semantic segmentation datasets and shows significant improvements over strong baselines on various other scene understanding tasks, including object detection, instance segmentation, and panoptic segmentation. Code is available at \url{https://github.com/lxtGH/SFSegNets}.
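To make the squeeze-then-reason idea concrete, the following is a minimal illustrative sketch in plain Python, not the authors' implementation. It rests on several assumptions: global average pooling stands in for the learned squeeze, a softmax over dot products stands in for the learned node graph, and the refined vector is added back to every spatial position as a residual. The names `squeeze`, `reason`, and `squeeze_reasoning` are hypothetical.

```python
import math


def squeeze(feature):
    """Average each channel over its spatial map: C x H x W -> length-C vector.

    Assumption: plain global average pooling replaces the learned squeeze.
    """
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def reason(vector, num_nodes):
    """Split the vector into nodes and run one round of message passing.

    Assumption: node affinities are a softmax over dot products rather than
    learned weights. len(vector) must be divisible by num_nodes.
    """
    d = len(vector) // num_nodes
    nodes = [vector[i * d:(i + 1) * d] for i in range(num_nodes)]
    refined = []
    for a in nodes:
        weights = softmax([sum(x * y for x, y in zip(a, b)) for b in nodes])
        msg = [sum(w * b[j] for w, b in zip(weights, nodes)) for j in range(d)]
        refined.extend(x + m for x, m in zip(a, msg))  # residual node update
    return refined


def squeeze_reasoning(feature, num_nodes):
    """Squeeze, reason on the vector, then add the result back per channel."""
    vec = reason(squeeze(feature), num_nodes)
    return [[[v + vec[c] for v in row] for row in ch] for c, ch in enumerate(feature)]


# Toy input: 4 channels of size 2 x 2, grouped into 2 nodes of 2 channels each.
feature = [[[float(c)] * 2 for _ in range(2)] for c in range(4)]
out = squeeze_reasoning(feature, num_nodes=2)
```

In this sketch the reasoning step touches only the length-C vector, so its cost does not grow with the spatial resolution H x W; only the squeeze and the final broadcast are linear in the number of pixels.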