Two major techniques are commonly used to meet real-time inference constraints when distributing models across resource-constrained IoT devices: (1) model parallelism (MP) and (2) class parallelism (CP). In MP, transmitting bulky intermediate data (orders of magnitude larger than the input) between devices imposes huge communication overhead. Although CP solves this problem, it limits the number of sub-models that can be generated. In addition, neither solution is fault tolerant, which is an issue when deployed on edge devices. We propose variant parallelism (VP), an ensemble-based deep learning distribution method in which different variants of a main model are generated and can be deployed on separate machines. We design a family of lighter models around the original model and train them simultaneously to improve accuracy over single models. Our experimental results on six common mid-sized object recognition datasets demonstrate that our models can have 5.8-7.1x fewer parameters, 4.3-31x fewer multiply-accumulations (MACs), and 2.5-13.2x lower response time on atomic inputs compared to MobileNetV2 while achieving comparable or higher accuracy. Our technique easily generates several variants of the base architecture. Each variant returns only 2k outputs, where 1 <= k <= (#classes/2), representing the Top-k classes, instead of the large volume of floating-point values required in MP. Since each variant provides a full-class prediction, our approach maintains higher availability than MP and CP in the presence of failures.
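To make the communication claim concrete, the following Python sketch illustrates the kind of payload each variant could transmit: its softmax output reduced to 2k values (Top-k class indices and scores), which a coordinator then aggregates into a full-class ensemble decision. This is a minimal illustration under assumed names (topk_payload, aggregate) and NumPy only; it is not the paper's implementation.

```python
import numpy as np

def topk_payload(probs: np.ndarray, k: int):
    """Reduce a variant's softmax output to 2k values:
    the Top-k class indices and their scores."""
    idx = np.argsort(probs)[::-1][:k]         # Top-k class indices
    return idx, probs[idx]                    # 2k values to transmit

def aggregate(payloads, num_classes: int) -> int:
    """Coordinator side: sum the partial scores received from the
    available variants and return the ensemble's predicted class."""
    votes = np.zeros(num_classes)
    for idx, scores in payloads:              # each payload is (indices, scores)
        votes[idx] += scores
    return int(np.argmax(votes))

# Toy example: three hypothetical variants over 10 classes
rng = np.random.default_rng(0)
payloads = []
for _ in range(3):
    logits = rng.normal(size=10)
    probs = np.exp(logits) / np.exp(logits).sum()   # softmax
    payloads.append(topk_payload(probs, k=3))       # 2k = 6 values per variant
print(aggregate(payloads, num_classes=10))
```

Because every variant contributes a prediction over all classes, the coordinator can still produce a result when some payloads are missing, which is the availability advantage claimed over MP and CP.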