With the growing adoption of deep learning for on-device TinyML applications, there has been an ever-increasing demand for more efficient neural network backbones optimized for the edge. Recently, the introduction of attention condenser networks has resulted in low-footprint, highly efficient, self-attention neural networks that strike a strong balance between accuracy and speed. In this study, we introduce a new, faster attention condenser design called the double-condensing attention condenser, which enables more condensed feature embeddings. We further employ a machine-driven design exploration strategy that imposes best-practice design constraints for greater efficiency and robustness to produce the macro- and micro-architecture constructs of the backbone. The resulting backbone (which we name AttendNeXt) achieves significantly higher inference throughput on an embedded ARM processor than several other state-of-the-art efficient backbones (>10X faster than FB-Net C at higher accuracy and speed, and >10X faster than MobileOne-S1 at smaller size) while having a small model size (>1.37X smaller than MobileNetv3-L at higher accuracy and speed) and strong accuracy (1.1% higher top-1 accuracy than MobileViT XS on ImageNet at higher speed). These promising results demonstrate that exploring different efficient architecture designs and self-attention mechanisms can lead to interesting new building blocks for TinyML applications.
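To illustrate the idea behind the proposed mechanism, the following is a minimal NumPy sketch of a double-condensing attention condenser. It is an illustrative simplification, not the paper's exact design: the condensation layers are plain average pooling, the self-attention embedding is reduced to a channel-wise softmax, and the expansion is nearest-neighbour upsampling. The function names and the pooling factor are hypothetical.

```python
import numpy as np


def _condense(x, pool):
    """Condense a (C, H, W) feature map spatially by average pooling
    (stand-in for the learned condensation layer)."""
    c, h, w = x.shape
    return x.reshape(c, h // pool, pool, w // pool, pool).mean(axis=(2, 4))


def double_condensing_attention_condenser(x, pool=2):
    """Sketch of a double-condensing attention condenser:
    condense twice into a more compact embedding, compute a simple
    attention map on it, expand back, and scale the input features."""
    c, h, w = x.shape
    # First and second condensation: a more condensed feature embedding
    q = _condense(_condense(x, pool), pool)
    # Toy self-attention surrogate: softmax across channels
    a = np.exp(q) / np.exp(q).sum(axis=0, keepdims=True)
    # Expand the attention map back to the input resolution
    a_up = a.repeat(pool ** 2, axis=1).repeat(pool ** 2, axis=2)
    # Selective attention: modulate the input features
    return x * a_up


x = np.random.rand(8, 16, 16).astype(np.float32)
y = double_condensing_attention_condenser(x)
print(y.shape)  # → (8, 16, 16)
```

Because the attention map is computed on the doubly condensed embedding, the (toy) attention computation runs on a 4x4 grid instead of the 16x16 input, which is the intuition behind the efficiency gain of condensing twice.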