Many recent pattern recognition applications rely on complex distributed architectures in which sensing and computational nodes interact through a communication network. Deep neural networks (DNNs) play an important role in this scenario, providing powerful decision mechanisms at the cost of high computational effort. Consequently, state-of-the-art DNNs are frequently split across multiple computational nodes, e.g., the first part runs on an embedded device and the rest on a server. Deciding where to split a DNN is a challenge in itself, making the design of deep learning applications even more complicated. Therefore, we propose Split-Et-Impera, a novel and practical framework that i) determines the set of the best split points of a neural network based on deep network interpretability principles, without resorting to a tedious trial-and-error approach; ii) performs a communication-aware simulation for the rapid evaluation of different neural network rearrangements; and iii) suggests the best match between the quality-of-service requirements of the application and its performance in terms of accuracy and latency.
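To make the communication-aware evaluation concrete, the following is a minimal sketch (not the Split-Et-Impera API) of how the end-to-end latency of a split DNN can be estimated for each candidate split point. All numbers and function names here are hypothetical: we assume known per-layer execution times on the device and on the server, per-layer output sizes, and a fixed link bandwidth.

```python
# Illustrative sketch, NOT the actual Split-Et-Impera implementation.
# Layers [0, split) run on the embedded device, layers [split, n) on the
# server; the tensor crossing the split is sent over the network.

def split_latency(device_ms, server_ms, out_bytes, input_bytes,
                  bandwidth_bps, split):
    """End-to-end latency (ms) for a given split point.

    device_ms / server_ms: hypothetical per-layer run times (ms).
    out_bytes: output size (bytes) of each layer.
    input_bytes: size of the raw input (sent when split == 0).
    bandwidth_bps: link bandwidth in bits per second.
    """
    compute = sum(device_ms[:split]) + sum(server_ms[split:])
    if split == len(device_ms):
        comm = 0.0  # fully on-device; assume the final output is negligible
    else:
        sent = input_bytes if split == 0 else out_bytes[split - 1]
        comm = sent * 8 / bandwidth_bps * 1000.0  # transfer time in ms
    return compute + comm


def best_split(device_ms, server_ms, out_bytes, input_bytes, bandwidth_bps):
    """Return the split index with the lowest estimated latency."""
    n = len(device_ms)
    return min(range(n + 1),
               key=lambda s: split_latency(device_ms, server_ms, out_bytes,
                                           input_bytes, bandwidth_bps, s))


# Toy 3-layer network over a 1 MB/s link: an early split wins because the
# first layer strongly compresses the data while the server is much faster.
device_ms = [2, 60, 60]
server_ms = [1, 3, 3]
out_bytes = [80_000, 80_000, 4_000]
s = best_split(device_ms, server_ms, out_bytes, 800_000, 8_000_000)
print(s)  # best split point index
```

This brute-force enumeration stands in for the framework's simulation step; the actual framework additionally restricts the candidate set via interpretability principles and checks the result against the application's quality-of-service requirements.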