Edge inference is becoming increasingly prevalent, with applications ranging from retail to wearable technology. Clusters of networked, resource-constrained edge devices are becoming common, yet there is no production-ready orchestration system for deploying deep learning models over such edge networks that matches the robustness and scalability of the cloud. We present SEIFER, a framework that uses a standalone Kubernetes cluster to partition a given DNN and place the resulting partitions in a distributed manner across an edge network, with the goal of maximizing inference throughput. The system is tolerant of node failures and automatically updates deployments when a new version of the model is released. We provide a preliminary evaluation of a partitioning and placement algorithm that works within this framework, and show that it can improve inference pipeline throughput by 200% given a sufficient number of resource-constrained nodes. We have implemented SEIFER as open-source software that is publicly available to the research community.
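The partitioning objective can be illustrated with a minimal sketch. This is not SEIFER's actual algorithm (the abstract does not specify it); it is a hypothetical toy model in which a chain of layers with known per-layer costs is split into contiguous pipeline stages, one per node, and throughput is bounded by the slowest stage:

```python
def min_bottleneck_partition(costs, k):
    """Split per-layer costs into at most k contiguous stages,
    minimizing the cost of the most expensive stage (the pipeline
    bottleneck). Uses binary search over the candidate bottleneck."""
    def feasible(cap):
        # Greedily pack layers into stages without exceeding `cap`.
        stages, cur = 1, 0.0
        for c in costs:
            if c > cap:
                return False
            if cur + c > cap:
                stages += 1
                cur = c
            else:
                cur += c
        return stages <= k

    lo, hi = max(costs), sum(costs)
    while hi - lo > 1e-9:
        mid = (lo + hi) / 2
        if feasible(mid):
            hi = mid
        else:
            lo = mid
    return hi

# Hypothetical per-layer compute costs (arbitrary units).
layers = [4.0, 1.0, 3.0, 2.0, 2.0]
single_node = sum(layers)  # one node runs the whole model
for k in (1, 2, 3):
    bottleneck = min_bottleneck_partition(layers, k)
    # Pipeline speedup relative to a single node.
    print(k, round(single_node / bottleneck, 2))
```

In this toy example, going from one node to three cuts the bottleneck from 12 to 4 units, a 3x throughput gain, which is the kind of scaling behavior the abstract's 200% figure describes; the real system must additionally account for inter-node transfer costs and heterogeneous hardware.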