The severe memory and computation constraints characterizing Internet-of-Things (IoT) units may prevent the execution of Deep Learning (DL)-based solutions, which typically demand large memory and a high processing load. To support real-time execution of a DL model at the IoT-unit level, DL solutions must be designed with the memory and processing constraints imposed by the chosen IoT technology in mind. In this paper, we introduce a design methodology for allocating the execution of Convolutional Neural Networks (CNNs) across a distributed IoT application. The methodology is formalized as an optimization problem that minimizes the latency between the data-gathering phase and the subsequent decision-making phase, subject to the given constraints on memory and processing load at the unit level. It supports multiple data sources as well as multiple CNNs executing on the same IoT system, enabling the design of CNN-based applications demanding autonomy, low decision latency, and high Quality of Service.
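The kind of allocation problem described above can be illustrated with a toy sketch: assign each CNN layer to one of several IoT units so that end-to-end latency (per-layer processing time plus a transfer penalty whenever consecutive layers run on different units) is minimized, subject to per-unit memory budgets. All numbers, names, and the exhaustive-search strategy below are illustrative assumptions, not the formulation or data from the paper.

```python
from itertools import product

# Hypothetical per-layer memory demand (KB) and base compute cost (ms)
# of a small CNN; values are illustrative only.
layers = [
    {"mem": 40, "cost": 5},   # conv1
    {"mem": 60, "cost": 8},   # conv2
    {"mem": 20, "cost": 3},   # fc
]
# Hypothetical IoT units: a constrained sensor node and a faster gateway.
# "speed" is a multiplier on layer cost (lower = faster).
units = [
    {"mem_budget": 80, "speed": 1.0},
    {"mem_budget": 200, "speed": 0.5},
]
link_latency = 2.0  # ms added per hop between distinct units (assumed constant)

def total_latency(assignment):
    """Processing time plus a transfer penalty for each pair of
    consecutive layers placed on different units."""
    lat = sum(layers[i]["cost"] * units[u]["speed"]
              for i, u in enumerate(assignment))
    lat += sum(link_latency
               for a, b in zip(assignment, assignment[1:]) if a != b)
    return lat

def feasible(assignment):
    """Check that each unit's memory budget is respected."""
    used = [0] * len(units)
    for i, u in enumerate(assignment):
        used[u] += layers[i]["mem"]
    return all(used[u] <= units[u]["mem_budget"] for u in range(len(units)))

# Exhaustive search over all layer-to-unit assignments (fine at toy scale;
# the paper's methodology would rely on a proper optimization solver).
best = min(
    (a for a in product(range(len(units)), repeat=len(layers)) if feasible(a)),
    key=total_latency,
)
print(best, total_latency(best))
```

At this scale, placing all three layers on the gateway is both feasible and latency-optimal, since its speed advantage outweighs the sensor node's proximity; tightening the gateway's memory budget would force a split placement and reintroduce the transfer penalty, which is exactly the trade-off the optimization navigates.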