TensorFlow Lite Micro: Embedded Machine Learning on TinyML Systems

Robert David, Jared Duke, Advait Jain, Vijay Janapa Reddi, Nat Jeffries, Jian Li, Nick Kreeger, Ian Nappier, Meghna Natraj, Shlomi Regev, Rocky Rhodes, Tiezhen Wang, Pete Warden

Deep learning inference on embedded devices is a burgeoning field with myriad applications because tiny embedded devices are omnipresent. But we must overcome major challenges before we can benefit from this opportunity. Embedded processors are severely resource constrained: compared with their nearest mobile counterparts, they differ by at least 100x to 1,000x in compute capability, memory availability, and power consumption. As a result, the machine-learning (ML) models and the associated ML inference framework must not only execute efficiently but also operate in a few kilobytes of memory. The embedded-device ecosystem is also heavily fragmented. To maximize efficiency, system vendors often omit many features that commonly appear in mainstream systems and that allow for cross-platform interoperability, including dynamic memory allocation and virtual memory. The hardware comes in many flavors (e.g., different instruction-set architectures, with or without FPU support). We introduce TensorFlow Lite Micro (TF Micro), an open-source ML inference framework for running deep-learning models on embedded systems. TF Micro tackles the efficiency requirements imposed by embedded-system resource constraints and the fragmentation challenges that make cross-platform interoperability nearly impossible. The framework adopts a unique interpreter-based approach that provides flexibility while overcoming these challenges. This paper explains the design decisions behind TF Micro and describes its implementation details. We also present an evaluation that demonstrates its low resource requirements and minimal run-time performance overhead.
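The two constraints the abstract highlights, an interpreter-based design and operation without dynamic memory allocation, can be illustrated with a small conceptual sketch. The code below is not TF Micro's API or implementation; the names `Arena`, `Op`, `KERNELS`, and `Interpreter` are hypothetical, and Python is used only to make the idea easy to read. The assumption being illustrated is that all tensor memory comes from one fixed buffer reserved up front, and that the model is plain data (a list of operations) walked by a generic loop rather than compiled into model-specific code.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


class Arena:
    """Hypothetical fixed-size buffer with a bump pointer; nothing is freed."""

    def __init__(self, size_bytes: int) -> None:
        self._buf = bytearray(size_bytes)  # single up-front reservation
        self._used = 0

    def alloc(self, nbytes: int) -> memoryview:
        if self._used + nbytes > len(self._buf):
            raise MemoryError("arena too small for this model")
        view = memoryview(self._buf)[self._used:self._used + nbytes]
        self._used += nbytes
        return view

    @property
    def used(self) -> int:
        return self._used


def relu(src: memoryview, dst: memoryview) -> None:
    for i, v in enumerate(src):
        dst[i] = v if v > 0 else 0


def double_saturating(src: memoryview, dst: memoryview) -> None:
    for i, v in enumerate(src):
        dst[i] = max(-128, min(127, 2 * v))  # clamp to the int8 range


# Kernel registry: the interpreter looks operations up by name at run time.
KERNELS: Dict[str, Callable[[memoryview, memoryview], None]] = {
    "relu": relu,
    "double": double_saturating,
}


@dataclass(frozen=True)
class Op:
    name: str  # key into KERNELS
    src: int   # index of the input tensor
    dst: int   # index of the output tensor


class Interpreter:
    """Plans every tensor in the arena once, then walks the op list."""

    def __init__(self, ops: List[Op], num_tensors: int,
                 tensor_len: int, arena: Arena) -> None:
        self._ops = ops
        # View each slice as signed 8-bit values ("b" format).
        self._tensors = [arena.alloc(tensor_len).cast("b")
                         for _ in range(num_tensors)]

    def tensor(self, index: int) -> memoryview:
        return self._tensors[index]

    def invoke(self) -> None:
        for op in self._ops:
            KERNELS[op.name](self._tensors[op.src], self._tensors[op.dst])


arena = Arena(64)
model = [Op("relu", 0, 1), Op("double", 1, 2)]  # the "model" is just data
interp = Interpreter(model, num_tensors=3, tensor_len=4, arena=arena)

for i, value in enumerate([-3, 0, 5, 100]):
    interp.tensor(0)[i] = value

interp.invoke()
print(list(interp.tensor(2)))  # [0, 0, 10, 127]
print(arena.used)              # 12
```

In this sketch, swapping in a different model means changing only the `model` list, while the interpreter and kernels stay the same; that separation is one plausible reading of the flexibility the abstract attributes to the interpreter-based approach. A model that needs more memory than the arena holds fails at setup time with `MemoryError` instead of part-way through inference.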

