We consider a resource-constrained Edge Device (ED) embedded with a small-size ML model (S-ML) for a generic classification application, and an Edge Server (ES) that hosts a large-size ML model (L-ML). Since the inference accuracy of S-ML is lower than that of L-ML, offloading all data samples to the ES yields high inference accuracy, but it defeats the purpose of embedding S-ML on the ED and forfeits the benefits of local inference: reduced latency, bandwidth savings, and energy efficiency. To get the best of both worlds, i.e., the benefits of inference on the ED and the benefits of inference on the ES, we explore the idea of Hierarchical Inference (HI), wherein an S-ML inference is accepted only when it is correct; otherwise, the data sample is offloaded for L-ML inference. However, the ideal implementation of HI is infeasible because the correctness of the S-ML inference is not known to the ED. We thus propose an online meta-learning framework to predict the correctness of the S-ML inference. The resulting online learning problem turns out to be a Prediction with Expert Advice (PEA) problem with a continuous expert space. We consider the full feedback scenario, where the ED receives feedback on the correctness of the S-ML inference once it accepts the inference, and the no-local feedback scenario, where the ED does not receive the ground truth for the classification. We propose the HIL-F and HIL-N algorithms and prove a regret bound that is sublinear in the number of data samples. We evaluate and benchmark the performance of the proposed algorithms for image classification applications using four datasets, namely, Imagenette, Imagewoof, MNIST, and CIFAR-10.
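To make the expert-advice framing concrete, the following is a minimal sketch of a generic exponentially weighted forecaster, not the HIL-F or HIL-N algorithms themselves. Several pieces are assumptions that the abstract does not state: each expert is a threshold on a hypothetical S-ML confidence score, offloading incurs a fixed hypothetical cost `offload_cost`, an accepted inference costs 0 if correct and 1 otherwise, and the continuous expert space is replaced by a finite grid of thresholds. It also assumes the correctness of the S-ML inference becomes known for every sample, in the spirit of the full feedback scenario.

```python
import math
import random


def expert_losses(thresholds, confidence, s_ml_correct, offload_cost):
    """Loss each threshold expert would incur on one sample.

    Assumed rule: an expert with threshold theta offloads when
    confidence < theta (paying offload_cost); otherwise it accepts the
    S-ML inference and pays 0 if it is correct, 1 if it is wrong.
    """
    losses = []
    for theta in thresholds:
        if confidence < theta:
            losses.append(offload_cost)
        else:
            losses.append(0.0 if s_ml_correct else 1.0)
    return losses


def run_exponential_weights(samples, thresholds, offload_cost=0.3, eta=0.5, seed=0):
    """Run a standard exponentially weighted forecaster over a threshold grid.

    samples: iterable of (confidence, s_ml_correct) pairs.
    Returns (learner_loss, cumulative_loss_per_expert).
    """
    rng = random.Random(seed)
    n = len(thresholds)
    weights = [1.0 / n] * n
    cumulative = [0.0] * n
    learner_loss = 0.0
    for confidence, s_ml_correct in samples:
        # Sample one expert in proportion to its current weight.
        idx = rng.choices(range(n), weights=weights, k=1)[0]
        losses = expert_losses(thresholds, confidence, s_ml_correct, offload_cost)
        learner_loss += losses[idx]
        cumulative = [c + l for c, l in zip(cumulative, losses)]
        # Multiplicative update, then renormalize to avoid underflow.
        weights = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return learner_loss, cumulative


if __name__ == "__main__":
    # Synthetic stream (illustrative only): higher confidence is more often correct.
    gen = random.Random(1)
    stream = []
    for _ in range(2000):
        p = gen.random()
        stream.append((p, gen.random() < p))
    grid = [i / 10 for i in range(11)]
    loss, per_expert = run_exponential_weights(stream, grid)
    print("learner loss:", loss)
    print("best fixed threshold loss:", min(per_expert))
```

The gap between the learner's loss and the loss of the best fixed threshold is the regret; a sublinear regret bound means this gap grows more slowly than the number of samples.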