The success of deep neural networks (DNNs) is heavily dependent on computational resources. While DNNs are often employed on cloud servers, there is a growing need to operate DNNs on edge devices. Edge devices are typically limited in their computational resources; however, multiple edge devices are often deployed in the same environment and can reliably communicate with each other. In this work we propose to facilitate the application of DNNs on the edge by allowing multiple users to collaborate during inference to improve accuracy. Our mechanism, coined {\em edge ensembles}, is based on having diverse predictors at each device, which together form an ensemble of models during inference. To mitigate the communication overhead, the users share quantized features, and we propose a method for aggregating multiple decisions into a single inference rule. We analyze the latency induced by edge ensembles, showing that their performance improvement comes at the cost of only a minor additional delay under common assumptions on the communication network. Our experiments demonstrate that collaborative inference via edge ensembles equipped with compact DNNs substantially improves accuracy over having each user infer locally, and can outperform a single centralized DNN larger than all the networks in the ensemble combined.
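As a rough illustration of the mechanism, the sketch below shows one way collaborative inference of this kind can proceed: each device runs its own compact model, peers share quantized outputs, and a single inference rule aggregates them. This is a minimal sketch under assumed choices (uniform quantization of class-probability vectors and simple averaging as the aggregation rule); the paper's actual quantizer and aggregation method may differ, and all function names here are hypothetical.
\begin{verbatim}
import numpy as np

def quantize(x, num_bits=8):
    # Uniformly quantize a feature vector to 2**num_bits levels per
    # entry (illustrative quantizer; assumed, not the paper's scheme).
    levels = 2 ** num_bits - 1
    lo, hi = x.min(), x.max()
    scale = (hi - lo) / levels if hi > lo else 1.0
    q = np.round((x - lo) / scale)
    return q * scale + lo  # de-quantized values shared with peers

def edge_ensemble_predict(local_probs, peer_quantized_probs):
    # Aggregate the local prediction with quantized predictions
    # received from peers by averaging class-probability vectors
    # (simple average aggregation, assumed for illustration).
    stacked = np.vstack([local_probs] + peer_quantized_probs)
    return stacked.mean(axis=0).argmax()

# Toy usage: three devices classify the same input over 10 classes.
rng = np.random.default_rng(0)
outputs = [rng.dirichlet(np.ones(10)) for _ in range(3)]
shared = [quantize(p) for p in outputs[1:]]  # peers send quantized features
label = edge_ensemble_predict(outputs[0], shared)
\end{verbatim}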