To design fast neural networks, many works have focused on reducing the number of floating-point operations (FLOPs). We observe, however, that such a reduction in FLOPs does not necessarily lead to a similar reduction in latency. This mainly stems from inefficiently low floating-point operations per second (FLOPS). To achieve faster networks, we revisit popular operators and demonstrate that such low FLOPS is mainly due to the operators' frequent memory access, especially in the depthwise convolution. We therefore propose a novel partial convolution (PConv) that extracts spatial features more efficiently by cutting down redundant computation and memory access simultaneously. Building upon PConv, we further propose FasterNet, a new family of neural networks that attains substantially higher running speed than others on a wide range of devices, without compromising accuracy on various vision tasks. For example, on ImageNet-1k, our tiny FasterNet-T0 is $2.8\times$, $3.3\times$, and $2.4\times$ faster than MobileViT-XXS on GPU, CPU, and ARM processors, respectively, while being $2.9\%$ more accurate. Our large FasterNet-L achieves an impressive $83.5\%$ top-1 accuracy, on par with the emerging Swin-B, while offering $36\%$ higher inference throughput on GPU and saving $37\%$ of compute time on CPU. Code is available at \url{https://github.com/JierunChen/FasterNet}.
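To make the idea concrete, below is a minimal PyTorch sketch of a partial convolution under a split-and-concatenate reading of the abstract: a regular convolution is applied to only a fraction of the input channels, while the remaining channels pass through untouched, so both computation and memory access scale with the convolved fraction rather than the full width. The module and parameter names (`PartialConv`, `n_div`) and the 1/4 partial ratio are illustrative assumptions; the authors' implementation lives in the linked repository.

```python
import torch
import torch.nn as nn


class PartialConv(nn.Module):
    """Illustrative partial convolution (PConv) sketch.

    A regular k x k convolution runs on only the first dim // n_div
    channels; the other channels are passed through unchanged, cutting
    redundant computation and memory access simultaneously.
    """

    def __init__(self, dim: int, n_div: int = 4, kernel_size: int = 3):
        super().__init__()
        self.dim_conv = dim // n_div              # channels that are convolved
        self.dim_untouched = dim - self.dim_conv  # channels left as identity
        self.conv = nn.Conv2d(
            self.dim_conv, self.dim_conv, kernel_size,
            padding=kernel_size // 2, bias=False,
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Split along channels, convolve only the first part, then
        # concatenate the untouched part back on.
        x1, x2 = torch.split(x, [self.dim_conv, self.dim_untouched], dim=1)
        return torch.cat((self.conv(x1), x2), dim=1)


# Usage: a 64-channel feature map where only 16 channels are convolved.
x = torch.randn(1, 64, 56, 56)
y = PartialConv(64)(x)
assert y.shape == x.shape
```

With a partial ratio of 1/4 as above, the spatial convolution touches only a quarter of the channels, so its FLOPs and memory traffic drop to roughly 1/16 and 1/4 of a regular convolution's, respectively.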