Human Activity Recognition (HAR) based on inertial data is an increasingly common task on embedded devices, from smartphones to ultra-low-power sensors. Due to the high computational complexity of deep learning models, most embedded HAR systems rely on simple, less accurate classic machine learning algorithms. This work bridges the gap between on-device HAR and deep learning by proposing a set of efficient one-dimensional Convolutional Neural Networks (CNNs) deployable on general-purpose microcontrollers (MCUs). Our CNNs are obtained by combining hyper-parameter optimization with sub-byte and mixed-precision quantization, to find good trade-offs between classification results and memory occupation. We also leverage adaptive inference as an orthogonal optimization to tune the inference complexity at runtime based on the processed input, yielding a more flexible HAR system. With experiments on four datasets, targeting an ultra-low-power RISC-V MCU, we show that: (i) we obtain a rich set of Pareto-optimal CNNs for HAR, spanning more than one order of magnitude in memory, latency, and energy consumption; (ii) thanks to adaptive inference, we can derive more than 20 runtime operating modes from a single CNN, differing by up to 10% in classification scores and by more than 3x in inference complexity, with a limited memory overhead; (iii) on three of the four benchmarks, we outperform all previous deep learning methods while reducing the memory occupation by more than 100x, and the few methods that obtain better performance (both shallow and deep) are not compatible with MCU deployment; (iv) all our CNNs are compatible with real-time on-device HAR, with an inference latency below 16 ms. Their memory occupation ranges from 0.05 to 23.17 kB and their energy consumption from 0.005 to 61.59 µJ, allowing years of continuous operation on a small battery supply.
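The abstract does not specify how adaptive inference is implemented. As an illustration only, the sketch below shows one common form of input-dependent inference: a confidence-gated cascade in which a cheap classifier handles easy windows and a costlier one is invoked only when the cheap one is unsure. The function names, the score-margin confidence measure, the threshold, and the two toy models are all assumptions made for this example and are not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def score_margin(probs):
    """Confidence proxy: gap between the two largest probabilities."""
    top = sorted(probs, reverse=True)
    return top[0] - top[1]


def adaptive_predict(window, small_model, big_model, threshold):
    """Run the cheap model first; fall back to the costly one if unsure.

    Returns (predicted_class, name_of_model_used). Raising `threshold`
    sends more inputs to the big model (higher cost, usually higher
    accuracy); lowering it does the opposite. Sweeping the threshold is
    one way to obtain several runtime operating modes from fixed models.
    """
    probs = softmax(small_model(window))
    if score_margin(probs) >= threshold:
        return probs.index(max(probs)), "small"
    probs = softmax(big_model(window))
    return probs.index(max(probs)), "big"


# Hypothetical stand-ins for two quantized 1D CNNs; each maps a window
# of inertial samples to two class logits.
def toy_small(window):
    mean = sum(window) / len(window)
    return [mean, -mean]


def toy_big(window):
    energy = sum(x * x for x in window) / len(window)
    return [1.0 - energy, energy - 1.0]


if __name__ == "__main__":
    for w in ([3.0, 3.0, 3.0], [0.1, -0.1, 0.05]):
        print(w, adaptive_predict(w, toy_small, toy_big, threshold=0.5))
```

In this sketch the threshold is the only runtime knob: a value of 0 always accepts the small model's answer, while a value above 1 always defers to the big one, since the margin between two probabilities cannot exceed 1.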