Network quantization significantly reduces model inference complexity and has been widely used in real-world deployments. However, most existing quantization methods have been developed mainly on Convolutional Neural Networks (CNNs) and suffer severe degradation when applied to fully quantized vision transformers. In this work, we demonstrate that many of these difficulties arise because of serious inter-channel variation in LayerNorm inputs, and present Power-of-Two Factor (PTF), a systematic method to reduce the performance degradation and inference complexity of fully quantized vision transformers. In addition, observing an extremely non-uniform distribution in attention maps, we propose Log-Int-Softmax (LIS) to preserve that distribution and simplify inference by using 4-bit quantization and the BitShift operator. Comprehensive experiments on various transformer-based architectures and benchmarks show that our Fully Quantized Vision Transformer (FQ-ViT) outperforms previous works even while using lower bit-width on attention maps. For instance, we reach 84.89% top-1 accuracy with ViT-L on ImageNet and 50.8 mAP with Cascade Mask R-CNN (Swin-S) on COCO. To our knowledge, we are the first to achieve near-lossless quantization (accuracy degradation within ~1%) on fully quantized vision transformers. The code is available at https://github.com/megvii-research/FQ-ViT.
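Since the abstract only names the two techniques, the following NumPy sketch illustrates the underlying ideas in simplified form: per-channel power-of-two factors on top of a shared layer-wise quantization scale for LayerNorm inputs (PTF), and 4-bit log2 quantization of attention maps (LIS) so that multiplying by values becomes a bit-shift. The function names, the MSE search, and the exact scale/zero-point arithmetic are illustrative assumptions, not the paper's precise formulation.

```python
# Minimal sketch of the PTF and LIS ideas; illustrative only, not the paper's exact scheme.
import numpy as np

def ptf_quantize(x, bits=8, max_alpha=3):
    """Power-of-Two Factor (PTF) style quantization of a LayerNorm input.

    All channels share one layer-wise scale `s` and zero point; each channel c
    additionally picks an exponent alpha_c, so its effective scale is s / 2^alpha_c.
    Because the per-channel factor is a power of two, it can be applied to the
    integer values with a bit-shift instead of a per-channel multiplication.
    `x` has shape (tokens, channels).
    """
    qmax = 2 ** bits - 1
    s = (x.max() - x.min()) / qmax           # layer-wise scale
    zp = np.round(-x.min() / s)              # layer-wise zero point

    alphas = np.zeros(x.shape[1], dtype=np.int64)
    for c in range(x.shape[1]):
        best_err = np.inf
        for a in range(max_alpha + 1):
            scale_c = s / (2 ** a)           # finer grid for low-variance channels
            q = np.clip(np.round(x[:, c] / scale_c) + zp, 0, qmax)
            err = np.mean(((q - zp) * scale_c - x[:, c]) ** 2)
            if err < best_err:
                best_err, alphas[c] = err, a
    scale = s / (2.0 ** alphas)
    x_q = np.clip(np.round(x / scale) + zp, 0, qmax)
    return x_q, s, zp, alphas


def log_int_softmax(scores, bits=4):
    """Log-Int-Softmax (LIS) style 4-bit quantization of attention maps.

    Softmax outputs lie in [0, 1] and are heavily concentrated near zero, so a
    log2 grid fits them better than a uniform one; the stored code is the
    negated exponent, and multiplying V by 2^{-q} reduces to a right bit-shift.
    """
    attn = np.exp(scores - scores.max(axis=-1, keepdims=True))
    attn /= attn.sum(axis=-1, keepdims=True)
    qmax = 2 ** bits - 1
    q = np.clip(np.round(-np.log2(np.maximum(attn, 2.0 ** -qmax))), 0, qmax)
    return q.astype(np.int64)                # attn ~ 2^{-q}, so attn @ V ~ (V >> q)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Channels with strongly varying ranges, mimicking inter-channel variation in LayerNorm inputs.
    x = rng.normal(size=(197, 768)) * rng.uniform(0.5, 40.0, size=768)
    x_q, s, zp, alphas = ptf_quantize(x)
    deq = (x_q - zp) * (s / 2.0 ** alphas)
    print("PTF reconstruction MSE:", np.mean((deq - x) ** 2))

    scores = rng.normal(size=(8, 197, 197))
    codes = log_int_softmax(scores)
    print("LIS code range:", codes.min(), codes.max())
```

In this sketch the power-of-two exponents are picked per channel by a brute-force MSE search; the key property being demonstrated is that the per-channel correction stays a shift on the shared integer grid, and that 4-bit log2 codes suffice to represent the long-tailed softmax outputs.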