Inspired by the long-range modeling ability of ViTs, large-kernel convolutions have recently been widely studied and adopted to enlarge the receptive field and improve model performance, as in the remarkable ConvNeXt, which employs 7x7 depthwise convolution. Although such a depthwise operator consumes only a few FLOPs, it largely harms model efficiency on powerful computing devices due to high memory access costs. For example, ConvNeXt-T has FLOPs similar to ResNet-50 but achieves only about 60% of its throughput when trained on A100 GPUs with full precision. Although reducing the kernel size of ConvNeXt can improve speed, it results in significant performance degradation. It remains unclear how to speed up large-kernel-based CNN models while preserving their performance. To tackle this issue, inspired by Inceptions, we propose to decompose large-kernel depthwise convolution into four parallel branches along the channel dimension, i.e., a small square kernel, two orthogonal band kernels, and an identity mapping. With this new Inception depthwise convolution, we build a series of networks, namely InceptionNeXt, which not only enjoy high throughputs but also maintain competitive performance. For instance, InceptionNeXt-T achieves 1.6x higher training throughput than ConvNeXt-T and attains a 0.2% top-1 accuracy improvement on ImageNet-1K. We anticipate InceptionNeXt can serve as an economical baseline for future architecture design to reduce carbon footprint. Code is available at https://github.com/sail-sg/inceptionnext.
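To make the decomposition concrete, below is a minimal PyTorch-style sketch of an Inception depthwise convolution with four parallel branches split along the channel dimension. The module name `InceptionDWConv2d`, the 3x3 square kernel, the 11-tap band kernels, and the 1/8-of-channels-per-conv-branch ratio are illustrative assumptions for this sketch; the linked repository is the authoritative implementation.

```python
import torch
import torch.nn as nn


class InceptionDWConv2d(nn.Module):
    """Sketch: decompose a large-kernel depthwise conv into four parallel
    branches along the channel dimension: identity mapping, a small square
    kernel, and two orthogonal band kernels (1xk and kx1)."""

    def __init__(self, channels, square_kernel=3, band_kernel=11, branch_ratio=0.125):
        super().__init__()
        gc = int(channels * branch_ratio)  # channels assigned to each conv branch (assumed ratio)
        self.split_sizes = (channels - 3 * gc, gc, gc, gc)
        # small square depthwise kernel
        self.dwconv_hw = nn.Conv2d(gc, gc, square_kernel,
                                   padding=square_kernel // 2, groups=gc)
        # horizontal band kernel (1 x k)
        self.dwconv_w = nn.Conv2d(gc, gc, (1, band_kernel),
                                  padding=(0, band_kernel // 2), groups=gc)
        # vertical band kernel (k x 1)
        self.dwconv_h = nn.Conv2d(gc, gc, (band_kernel, 1),
                                  padding=(band_kernel // 2, 0), groups=gc)

    def forward(self, x):
        # split channels into identity / square / horizontal-band / vertical-band groups
        x_id, x_hw, x_w, x_h = torch.split(x, self.split_sizes, dim=1)
        # identity branch passes through untouched; the others use cheap depthwise convs
        return torch.cat(
            (x_id, self.dwconv_hw(x_hw), self.dwconv_w(x_w), self.dwconv_h(x_h)),
            dim=1,
        )


if __name__ == "__main__":
    # quick shape check: output keeps the input resolution and channel count
    y = InceptionDWConv2d(64)(torch.randn(1, 64, 56, 56))
    print(y.shape)  # torch.Size([1, 64, 56, 56])
```

The key design point is that only a fraction of the channels pass through any convolution at all, and none of the branches uses a full kxk large kernel, which is what reduces memory access cost relative to a 7x7 (or larger) depthwise convolution over all channels.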