Most IoT devices, such as smartwatches, smart plugs, and HVAC controllers, run on hardware with constrained specifications (limited memory, clock speed, and processing power) that cannot accommodate and execute large, high-quality models. Manufacturers still manage to offer attractive functionality on such resource-constrained devices (to boost sales) by following the traditional approach of programming IoT devices/products to collect data (images, audio, sensor readings, etc.) and transmit it to their cloud-based ML analytics platforms. For decades, this online approach has faced issues such as compromised data streams, non-real-time analytics due to latency, bandwidth constraints, costly subscriptions, and, more recently, privacy concerns raised by users and the GDPR guidelines. In this paper, to enable ultra-fast and accurate AI-based offline analytics on resource-constrained IoT devices, we present an end-to-end multi-component model optimization sequence and open-source its implementation. Researchers and developers can use our optimization sequence to optimize memory-hungry, computationally demanding models along multiple dimensions, producing small, low-latency, low-power models that comfortably fit and execute on resource-constrained hardware. The experimental results show that our optimization components can produce models that are (i) compressed by a factor of 12.06; (ii) 0.13% to 0.27% more accurate; and (iii) orders of magnitude faster, with a unit inference time of 0.06 ms. Our optimization sequence is generic and can be applied to any state-of-the-art model trained for anomaly detection, predictive maintenance, robotics, voice recognition, or machine vision.