This paper proposes a novel heterogeneous grid convolution that builds a graph-based image representation by exploiting heterogeneity in the image content, enabling adaptive, efficient, and controllable computation in a convolutional architecture. More concretely, the approach builds a data-adaptive graph structure from a convolutional layer using a differentiable clustering method, pools features to the graph, performs a novel direction-aware graph convolution, and unpools the features back to the convolutional layer. Using this module, the paper proposes heterogeneous grid convolutional networks, a highly efficient yet strong extension of existing architectures. We have evaluated the proposed approach on four image understanding tasks: semantic segmentation, object localization, road extraction, and salient object detection. The proposed method is effective on three of the four tasks. In particular, it outperforms a strong baseline with more than 90% reduction in floating-point operations for semantic segmentation, and achieves a state-of-the-art result for road extraction. We will share our code, model, and data.
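The cluster, pool, graph-convolve, and unpool steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the soft assignment is a generic softmax over distances to cluster centers (standing in for the differentiable clustering), the adjacency is a soft co-membership matrix chosen for illustration, and the graph convolution is a plain neighbor average without the direction-aware component. All function and parameter names (`soft_assign`, `pool`, `graph_conv`, `unpool`, `hg_block`, `temperature`) are hypothetical.

```python
import numpy as np


def soft_assign(feats, centers, temperature=1.0):
    """Soft-assign N pixel features (N, C) to K cluster centers (K, C)."""
    d2 = ((feats[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)  # (N, K)
    logits = -d2 / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    a = np.exp(logits)
    return a / a.sum(axis=1, keepdims=True)  # each row sums to 1


def pool(feats, assign):
    """Pool pixel features to graph nodes as assignment-weighted means."""
    weights = assign / (assign.sum(axis=0, keepdims=True) + 1e-8)  # (N, K)
    return weights.T @ feats  # (K, C)


def graph_conv(nodes, adj, w_self, w_neigh):
    """Plain graph convolution: self term plus mean over neighbors, then ReLU.

    Assumption: this omits the direction-aware part of the proposed operator.
    """
    deg = np.maximum(adj.sum(axis=1, keepdims=True), 1e-8)
    neigh = (adj @ nodes) / deg
    return np.maximum(nodes @ w_self + neigh @ w_neigh, 0.0)


def unpool(nodes, assign):
    """Scatter node features back to pixels using the same soft assignment."""
    return assign @ nodes  # (N, C_out)


def hg_block(feature_map, centers, w_self, w_neigh):
    """Run cluster -> pool -> graph conv -> unpool on an (H, W, C) feature map."""
    h, w, c = feature_map.shape
    feats = feature_map.reshape(h * w, c)
    assign = soft_assign(feats, centers)
    nodes = pool(feats, assign)
    # Assumption: soft co-membership as adjacency, purely for illustration.
    adj = assign.T @ assign
    np.fill_diagonal(adj, 0.0)
    out = unpool(graph_conv(nodes, adj, w_self, w_neigh), assign)
    return out.reshape(h, w, -1)
```

For example, calling `hg_block` on an 8×8 feature map with 4 channels, 5 cluster centers, and 4×6 weight matrices returns an 8×8 map with 6 channels. The graph convolution then runs on only 5 nodes instead of 64 pixels, which shows in miniature where the savings in computation come from.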