The central building block of convolutional neural networks (CNNs) is the convolution operator, which enables networks to construct informative features by fusing both spatial and channel-wise information within local receptive fields at each layer. A broad range of prior research has investigated the spatial component of this relationship, seeking to strengthen the representational power of a CNN by enhancing the quality of spatial encodings throughout its feature hierarchy. In this work, we focus instead on the channel relationship and propose a novel architectural unit, which we term the "Squeeze-and-Excitation" (SE) block, that adaptively recalibrates channel-wise feature responses by explicitly modelling interdependencies between channels. We show that these blocks can be stacked together to form SENet architectures that generalise extremely effectively across different datasets. We further demonstrate that SE blocks bring significant improvements in performance for existing state-of-the-art CNNs at minimal additional computational cost. Squeeze-and-Excitation Networks formed the foundation of our ILSVRC 2017 classification submission which won first place and reduced the top-5 error to 2.251%, surpassing the winning entry of 2016 by a relative improvement of ~25%. Models and code are available at https://github.com/hujie-frank/SENet.
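The recalibration described above can be made concrete with a minimal NumPy sketch. This is an illustrative reading of an SE block, not the paper's implementation: the squeeze step aggregates each channel by global average pooling, and the excitation step passes the pooled vector through a bottleneck of two fully connected layers (with an assumed reduction ratio `r = 4` and randomly initialised weights `w1`, `w2` standing in for learned parameters) followed by a sigmoid, producing per-channel gates that rescale the input.

```python
import numpy as np

def se_block(x, w1, w2):
    """Apply a Squeeze-and-Excitation step to a feature map x of shape (C, H, W).

    w1: (C//r, C) reduction weights; w2: (C, C//r) expansion weights.
    Both are assumed to be learned; here they are illustrative placeholders.
    """
    # Squeeze: global average pooling collapses spatial info into one value per channel.
    z = x.mean(axis=(1, 2))  # shape (C,)
    # Excitation: FC -> ReLU -> FC -> sigmoid yields per-channel gates in (0, 1).
    s = 1.0 / (1.0 + np.exp(-(w2 @ np.maximum(w1 @ z, 0.0))))  # shape (C,)
    # Recalibration: rescale each channel of x by its gate.
    return x * s[:, None, None]

rng = np.random.default_rng(0)
c, r = 8, 4
x = rng.standard_normal((c, 6, 6))
w1 = rng.standard_normal((c // r, c)) * 0.1
w2 = rng.standard_normal((c, c // r)) * 0.1
y = se_block(x, w1, w2)
```

Because the gates lie in (0, 1), the block can only attenuate channels, never amplify them; in a full SENet this unit is inserted after a convolutional block, so the network learns which channels to emphasise at negligible extra cost.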