Differentiable Dynamic Quantization with Mixed Precision and Adaptive Resolution - Zhuanzhi (专知) Papers


Precision/Accuracy · Discretization · Extensibility · Minimum Point · Tuning

July 7, 2021

Differentiable Dynamic Quantization with Mixed Precision and Adaptive Resolution


Zhang Zhaoyang,Shao Wenqi,Gu Jinwei,Wang Xiaogang,Luo Ping

From arXiv; accepted by ICML 2021

Model quantization is challenging because it involves many tedious hyper-parameters, such as precision (bitwidth), dynamic range (minimum and maximum discrete values), and stepsize (interval between discrete values). Unlike prior work that carefully tunes these values, we present a fully differentiable approach to learn all of them, named Differentiable Dynamic Quantization (DDQ), which has several benefits. (1) DDQ is able to quantize challenging lightweight architectures like MobileNets, where different layers prefer different quantization parameters. (2) DDQ is hardware-friendly and can be easily implemented using low-precision matrix-vector multiplication, making it deployable on many hardware platforms such as ARM. (3) Extensive experiments show that DDQ outperforms prior work on many networks and benchmarks, especially when models are already efficient and compact. For example, DDQ is the first approach that achieves lossless 4-bit quantization for MobileNetV2 on ImageNet.
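To make the three hyper-parameters named in the abstract concrete, here is a minimal sketch of a plain uniform quantizer in Python. It is not the DDQ method, which learns these values by gradient descent. It only illustrates how bitwidth, dynamic range, and stepsize relate when they are fixed by hand. The function name `uniform_quantize` and its arguments are hypothetical and chosen for illustration.

```python
from typing import List


def uniform_quantize(values: List[float], bitwidth: int,
                     v_min: float, v_max: float) -> List[float]:
    """Snap each value to the nearest of 2**bitwidth evenly spaced levels.

    Illustrative assumption: a hand-tuned uniform quantizer, not DDQ itself.
    """
    if bitwidth < 1:
        raise ValueError("bitwidth must be at least 1")
    if v_max <= v_min:
        raise ValueError("v_max must be greater than v_min")

    # Precision (bitwidth) fixes how many discrete values are available.
    num_levels = 2 ** bitwidth
    # Stepsize is the interval between adjacent discrete values; here it
    # follows from the dynamic range [v_min, v_max] and the level count.
    stepsize = (v_max - v_min) / (num_levels - 1)

    quantized = []
    for v in values:
        # Values outside the dynamic range are clipped to its boundary.
        clipped = min(max(v, v_min), v_max)
        # Index of the nearest discrete level, then map back to a real value.
        index = round((clipped - v_min) / stepsize)
        quantized.append(v_min + index * stepsize)
    return quantized


# Example: 2-bit quantization over the range [-1, 1] gives four levels.
weights = [-1.3, -0.4, 0.05, 0.6, 2.0]
print(uniform_quantize(weights, bitwidth=2, v_min=-1.0, v_max=1.0))
```

In this hand-tuned setting, the choice of `bitwidth`, `v_min`, and `v_max` must be made per layer by trial and error, which is the tuning burden the abstract says DDQ replaces with learned parameters.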

