The exponential growth of devices and data at the edge of the Internet is raising scalability and privacy concerns about approaches based exclusively on remote cloud platforms. Data gravity, a fundamental concept in Fog Computing, points towards decentralisation of computation for data analysis as a viable alternative to address these concerns. Decentralising AI tasks across several cooperative devices means identifying the optimal set of locations, or Collection Points (CPs for short), to use in the continuum between full centralisation (i.e., all data on a single device) and full decentralisation (i.e., data kept at the source locations). We propose an analytical framework able to find the optimal operating point in this continuum, linking the accuracy of the learning task with the corresponding network and computational cost of moving data and running the distributed training at the CPs. We show through simulations that the model accurately predicts the optimal trade-off, quite often an intermediate point between full centralisation and full decentralisation, which also yields significant cost savings with respect to both. Finally, the analytical model admits closed-form or numeric solutions, making it not only a performance evaluation instrument but also a design tool for optimally configuring a given distributed learning task before its deployment.
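To make the centralisation-decentralisation trade-off concrete, the following is a minimal sketch in Python, using entirely hypothetical cost functions and parameters (N, DATA_PER_SOURCE, NET_UNIT_COST, COMP_FIXED_COST are illustrative assumptions, not the paper's actual analytical model). It sweeps the number of Collection Points k from 1 (full centralisation) to N (full decentralisation) and picks the k minimising a combined network-plus-computation cost, which under these assumptions lands at an intermediate point.

```python
# Hypothetical parameters (assumptions for illustration only).
N = 64                    # number of data sources
DATA_PER_SOURCE = 100.0   # MB generated at each source
NET_UNIT_COST = 0.02      # cost per MB moved towards a CP
COMP_FIXED_COST = 1.5     # fixed cost of running training at one CP

def network_cost(k: int) -> float:
    """Cost of moving data from the N sources to k CPs.

    Assumed model: each CP aggregates N/k sources, and the per-MB
    transfer cost grows with the aggregation factor, so fewer CPs
    mean longer (more expensive) data movement.
    """
    return NET_UNIT_COST * DATA_PER_SOURCE * N * (N / k) ** 0.5

def computation_cost(k: int) -> float:
    """Each active CP adds a fixed training/coordination overhead."""
    return COMP_FIXED_COST * k

def total_cost(k: int) -> float:
    return network_cost(k) + computation_cost(k)

# Exhaustive sweep over the centralisation-decentralisation continuum.
best_k = min(range(1, N + 1), key=total_cost)
print(f"optimal number of CPs: {best_k} (cost {total_cost(best_k):.2f})")
```

With these illustrative parameters the minimum falls strictly between k = 1 and k = N, mirroring the paper's finding that the optimal operating point is often intermediate; an analytical model with closed-form solutions would replace the brute-force sweep used here.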