Light-weight convolutional neural networks (CNNs) are specially designed for applications on mobile devices, where fast inference is essential. The convolution operation can only capture local information within a window region, which limits further performance gains. Introducing self-attention into convolution can capture global information well, but it severely degrades the actual inference speed. In this paper, we propose a hardware-friendly attention mechanism (dubbed DFC attention) and then present a new GhostNetV2 architecture for mobile applications. The proposed DFC attention is constructed from fully-connected layers, which can not only execute fast on common hardware but also capture the dependence between long-range pixels. We further revisit the expressiveness bottleneck in the previous GhostNet and propose to enhance the expanded features produced by cheap operations with DFC attention, so that a GhostNetV2 block can aggregate local and long-range information simultaneously. Extensive experiments demonstrate the superiority of GhostNetV2 over existing architectures. For example, it achieves 75.3% top-1 accuracy on ImageNet with 167M FLOPs, significantly surpassing GhostNetV1 (74.5%) at a similar computational cost. The source code will be available at https://github.com/huawei-noah/Efficient-AI-Backbones/tree/master/ghostnetv2_pytorch and https://gitee.com/mindspore/models/tree/master/research/cv/ghostnetv2.
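To make the idea of a hardware-friendly, fully-connected attention branch concrete, below is a minimal PyTorch sketch of such a gating module. It assumes the axis-wise fully-connected layers are realized as large-kernel depth-wise convolutions applied along the horizontal and vertical directions, with the attention map computed on a downsampled feature and multiplied onto the input; the class name `DFCAttention`, the kernel size of 5, and the 2× downsampling are illustrative assumptions rather than the paper's exact configuration (see the official repository for the reference implementation).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DFCAttention(nn.Module):
    """Illustrative sketch of a decoupled fully-connected (DFC) attention branch.

    Long-range aggregation along each spatial axis is approximated with
    large-kernel depth-wise convolutions (1 x K and K x 1), which act like
    per-row / per-column fully-connected layers and run fast on common hardware.
    """

    def __init__(self, channels: int, kernel_size: int = 5):
        super().__init__()
        pad = kernel_size // 2
        self.gate = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(channels),
            # horizontal FC: aggregate information along the width axis
            nn.Conv2d(channels, channels, (1, kernel_size), padding=(0, pad),
                      groups=channels, bias=False),
            nn.BatchNorm2d(channels),
            # vertical FC: aggregate information along the height axis
            nn.Conv2d(channels, channels, (kernel_size, 1), padding=(pad, 0),
                      groups=channels, bias=False),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h, w = x.shape[-2:]
        # compute the attention map on a downsampled feature for efficiency,
        # then rescale to (0, 1) and upsample back to the input resolution
        attn = self.gate(F.avg_pool2d(x, kernel_size=2, stride=2))
        attn = F.interpolate(torch.sigmoid(attn), size=(h, w), mode="nearest")
        return x * attn


if __name__ == "__main__":
    # toy usage: gate a feature map as the abstract describes for expanded features
    feat = torch.randn(1, 32, 56, 56)
    print(DFCAttention(32)(feat).shape)  # torch.Size([1, 32, 56, 56])
```

In GhostNetV2 this kind of attention map is used to re-weight the expanded features produced by the Ghost module's cheap operations, so the block combines cheap local feature generation with long-range gating.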