Computer vision on low-power edge devices enables applications including search-and-rescue and security. State-of-the-art computer vision algorithms, such as Deep Neural Networks (DNNs), are too large for inference on low-power edge devices. To improve efficiency, some existing approaches parallelize DNN inference across multiple edge devices. However, these techniques introduce significant communication and synchronization overheads or are unable to balance workloads across devices. This paper demonstrates that the hierarchical DNN architecture is well suited for parallel processing on multiple edge devices. We design a novel method that creates a parallel inference pipeline for computer vision problems that use hierarchical DNNs. The method balances loads across the collaborating devices and reduces communication costs, enabling multiple video frames to be processed simultaneously with higher throughput. Our experiments consider a representative computer vision problem in which image recognition is performed on each video frame, running on multiple Raspberry Pi 4Bs. With four collaborating low-power edge devices, our approach achieves 3.21X higher throughput, 68% less energy consumption per device per frame, and a 58% reduction in memory usage compared with existing single-device hierarchical DNNs.