Neural network models are resource hungry. It is difficult to deploy such deep networks on devices with limited resources, such as smart wearables, cellphones, drones, and autonomous vehicles. Low-bit quantization, such as binary and ternary quantization, is a common approach to alleviating these resource requirements. Ternary quantization provides a more flexible model and outperforms binary quantization in terms of accuracy, but it doubles the memory footprint and increases the computational cost. In contrast to these approaches, mixed quantized models allow a trade-off between accuracy and memory footprint. In such models, the quantization depth is often chosen manually or tuned with a separate optimization routine; the latter requires training a quantized network multiple times. Here, we propose an adaptive combination of binary and ternary quantization, namely Smart Quantization (SQ), in which the quantization depth is modified directly via a regularization function, so that the model is trained only once. Our experimental results show that the proposed method successfully adapts the quantization depth while keeping model accuracy high on the MNIST and CIFAR10 benchmarks.
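To make the distinction between the two quantization depths concrete, the following is a minimal sketch of standard binary and ternary weight quantization. The scaling by the mean absolute weight and the threshold heuristic `delta_factor=0.7` are common choices from the quantization literature, not necessarily the exact scheme used by SQ.

```python
import numpy as np

def binarize(w):
    # Binary quantization: every weight collapses to +alpha or -alpha,
    # where alpha is the mean absolute value (a common scaling choice).
    alpha = np.mean(np.abs(w))
    return alpha * np.sign(w)

def ternarize(w, delta_factor=0.7):
    # Ternary quantization: weights close to zero become 0, the rest +/-alpha.
    # delta_factor * mean|w| is a commonly used threshold heuristic.
    delta = delta_factor * np.mean(np.abs(w))
    mask = np.abs(w) > delta
    alpha = np.mean(np.abs(w[mask])) if mask.any() else 0.0
    return alpha * np.sign(w) * mask

w = np.array([0.8, -0.05, 0.3, -0.9, 0.02])
print(binarize(w))   # two levels: +/-alpha
print(ternarize(w))  # three levels: -alpha, 0, +alpha
```

The extra zero level is what gives ternary models their flexibility, but representing three states instead of two is also why they need twice the storage per weight (2 bits vs. 1 bit).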