Many real-time applications (e.g., Augmented/Virtual Reality, cognitive assistance) rely on Deep Neural Networks (DNNs) to process inference tasks. Edge computing is considered a key infrastructure for deploying such applications, as moving computation close to the data sources enables stringent latency and throughput requirements to be met. However, the constrained nature of edge networks poses several additional challenges to the management of inference workloads: edge clusters cannot provide unlimited processing power to DNN models, and a trade-off between network and processing time must often be considered to satisfy end-to-end delay requirements. In this paper, we focus on the problem of scheduling inference queries on DNN models in edge networks at short timescales (i.e., a few milliseconds). By means of simulations, we analyze several policies under the realistic network settings and workloads of a large ISP, highlighting the need for a dynamic scheduling policy that can adapt to network conditions and workloads. We therefore design ASET, a Reinforcement Learning based scheduling algorithm that adapts its decisions according to the system conditions. Our results show that ASET effectively provides the best performance compared to static policies when scheduling over a distributed pool of edge resources.