Collaborative inference has received significant research interest in machine learning as a vehicle for distributing the computational load, reducing latency, and preserving privacy in communications. Recent collaborative inference frameworks have adopted dynamic inference methodologies such as early exit and run-time partitioning of neural networks. However, as machine learning applications scale in the number of inference inputs, e.g., in surveillance applications, fault tolerance with respect to device failure needs to be considered. This paper presents the Edge-PRUNE distributed computing framework, built on a formally defined model of computation, which provides a flexible infrastructure for fault-tolerant collaborative inference. The experimental section of this work reports the inference time savings achievable through collaborative inference, presents fault-tolerant system topologies, and analyzes their cost in terms of execution time overhead.