Integrating high-level context information with low-level details is of central importance in semantic segmentation. To this end, most existing segmentation models apply bilinear up-sampling and convolutions to feature maps of different scales in order to align them at a common resolution. However, bilinear up-sampling blurs the precise information learned in these feature maps, and convolutions incur extra computation costs. To address these issues, we propose the Implicit Feature Alignment function (IFA). Our method is inspired by the rapidly expanding field of implicit neural representations, where coordinate-based neural networks are used to represent fields of signals. In IFA, feature vectors are viewed as representing a 2D field of information. Given a query coordinate, nearby feature vectors and their relative coordinates are taken from the multi-level feature maps and fed into an MLP to generate the corresponding output. As such, IFA implicitly aligns the feature maps at different levels and is capable of producing segmentation maps at arbitrary resolutions. We demonstrate the efficacy of IFA on multiple datasets, including Cityscapes, PASCAL Context, and ADE20K. Our method can be combined with various architectures and improves them, and it achieves a state-of-the-art computation-accuracy trade-off on common benchmarks. Code will be made available at https://github.com/hzhupku/IFA.
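To make the query procedure concrete, the following is a minimal sketch in plain Python rather than the released implementation. It assumes that coordinates are normalized to [-1, 1], that only the single nearest feature vector is taken from each level, and that the MLP has one hidden layer with random, untrained weights. The names `nearest_feature`, `TinyMLP`, `implicit_align`, and `segment` are hypothetical and chosen for illustration only.

```python
import math
import random


def nearest_feature(fmap, qx, qy):
    """Return the feature vector nearest to (qx, qy) and the relative offset.

    fmap is an H x W grid of C-dimensional lists. The grid is assumed to span
    [-1, 1] on both axes, with each vector located at its cell center.
    """
    h, w = len(fmap), len(fmap[0])
    j = min(w - 1, max(0, math.floor((qx + 1) / 2 * w)))
    i = min(h - 1, max(0, math.floor((qy + 1) / 2 * h)))
    cx = -1 + (2 * j + 1) / w
    cy = -1 + (2 * i + 1) / h
    # Scale offsets by the grid size so they lie in [-1, 1] at every level.
    rel = [(qx - cx) * w, (qy - cy) * h]
    return fmap[i][j], rel


class TinyMLP:
    """One-hidden-layer MLP with random weights (illustrative, untrained)."""

    def __init__(self, in_dim, hidden_dim, out_dim, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)]
                   for _ in range(hidden_dim)]
        self.b1 = [0.0] * hidden_dim
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_dim)]
                   for _ in range(out_dim)]
        self.b2 = [0.0] * out_dim

    def __call__(self, x):
        hidden = [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        return [sum(w * v for w, v in zip(row, hidden)) + b
                for row, b in zip(self.w2, self.b2)]


def implicit_align(levels, qx, qy, mlp):
    """Concatenate nearest features and offsets from all levels, then decode."""
    inputs = []
    for fmap in levels:
        feat, rel = nearest_feature(fmap, qx, qy)
        inputs.extend(feat)
        inputs.extend(rel)
    return mlp(inputs)


def segment(levels, mlp, out_h, out_w):
    """Query every output pixel center; any output resolution is allowed."""
    label_map = []
    for r in range(out_h):
        qy = -1 + (2 * r + 1) / out_h
        row = []
        for c in range(out_w):
            qx = -1 + (2 * c + 1) / out_w
            logits = implicit_align(levels, qx, qy, mlp)
            row.append(max(range(len(logits)), key=logits.__getitem__))
        label_map.append(row)
    return label_map


def random_map(h, w, c, rng):
    """Stand-in for a backbone feature map (random values, assumption)."""
    return [[[rng.uniform(-1, 1) for _ in range(c)] for _ in range(w)]
            for _ in range(h)]


rng = random.Random(1)
# Three levels with different resolutions and channel counts.
levels = [random_map(8, 8, 4, rng), random_map(4, 4, 6, rng),
          random_map(2, 2, 8, rng)]
mlp = TinyMLP(in_dim=(4 + 2) + (6 + 2) + (8 + 2), hidden_dim=16, out_dim=3)
coarse = segment(levels, mlp, 4, 4)
fine = segment(levels, mlp, 13, 17)
```

Because each level is sampled directly at the query coordinate, no feature map is up-sampled or convolved to a shared resolution. The same `levels` and `mlp` produce both the 4 x 4 and the 13 x 17 label maps, which mirrors the arbitrary-resolution property described above.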