Leveraging Automated Mixed-Low-Precision Quantization for Tiny Edge Microcontrollers

Severe on-chip memory limitations currently prevent the deployment of the most accurate Deep Neural Network (DNN) models on tiny MicroController Units (MCUs), even with an effective 8-bit quantization scheme. To tackle this issue, we present an automated mixed-precision quantization flow based on the HAQ framework but tailored to the memory and computational characteristics of MCU devices. Specifically, a Reinforcement Learning (RL) agent searches for the best uniform quantization level, among 2, 4 and 8 bits, for each individual weight and activation tensor, under tight constraints on the RAM and FLASH embedded memory sizes. We conduct an experimental analysis on the MobileNetV1, MobileNetV2 and MNasNet models for ImageNet classification. In the quantization policy search, the RL agent selects policies that maximize memory utilization. Given an MCU-class memory bound of 2MB for weight-only quantization, the compressed models produced by the mixed-precision engine are as accurate as state-of-the-art solutions quantized with a non-uniform function, a scheme that is not tailored to CPUs featuring integer-only arithmetic. This demonstrates the viability of uniform quantization, which MCU deployments require, for deep weight compression. When the activation memory budget is also limited to 512kB, the best MobileNetV1 model scores up to 68.4% on ImageNet thanks to the discovered quantization policy, making it 4% more accurate than the other 8-bit networks that fit the same memory constraints.
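The memory constraint described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the symmetric per-tensor quantizer, the layer names, and the parameter counts are assumptions chosen for illustration, and 2MB is read here as 2 × 1024 × 1024 bytes. The sketch quantizes values to signed integers of 2, 4 or 8 bits and checks whether a per-layer bit-width policy fits a FLASH budget for weights only; activation RAM is not modeled.

```python
from typing import Dict, List, Tuple

ALLOWED_BITS = (2, 4, 8)  # bit-widths named in the abstract


def quantize_uniform(values: List[float], bits: int) -> Tuple[List[int], float]:
    """Symmetric uniform quantization (an assumed scheme, one scale per tensor)."""
    if bits not in ALLOWED_BITS:
        raise ValueError(f"unsupported bit-width: {bits}")
    qmax = 2 ** (bits - 1) - 1                     # e.g. 127 for 8 bits, 1 for 2 bits
    max_abs = max(abs(v) for v in values) or 1.0   # avoid a zero scale
    scale = max_abs / qmax
    # Round to the nearest level and clamp to the signed integer range.
    q = [max(-qmax - 1, min(qmax, round(v / scale))) for v in values]
    return q, scale


def weight_bytes(num_params: int, bits: int) -> int:
    """Bytes needed to store num_params weights packed at the given bit-width."""
    return (num_params * bits + 7) // 8


def policy_fits(layer_params: Dict[str, int],
                policy: Dict[str, int],
                flash_budget_bytes: int) -> bool:
    """Check whether a per-layer bit-width policy fits the weight memory budget."""
    total = sum(weight_bytes(n, policy[name]) for name, n in layer_params.items())
    return total <= flash_budget_bytes


# Hypothetical layer sizes, not taken from MobileNet or MNasNet.
layers = {"conv1": 1_000, "conv2": 3_000_000, "fc": 1_000_000}
budget = 2 * 1024 * 1024  # 2MB weight budget

all_8bit = {name: 8 for name in layers}
mixed = {"conv1": 8, "conv2": 2, "fc": 8}

fits_8bit = policy_fits(layers, all_8bit, budget)
fits_mixed = policy_fits(layers, mixed, budget)
```

In this toy setting the all-8-bit policy exceeds the budget, while lowering the largest layer to 2 bits brings the model under it. A search procedure such as the RL agent in the abstract would explore policies of this kind while also tracking accuracy, which this sketch does not attempt.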

