Significant advances in edge computing capabilities enable learning to occur at geographically diverse locations. In general, the training data needed for those learning tasks are not only heterogeneous but also not fully generated locally. In this paper, we propose an experimental design network paradigm, wherein learner nodes train possibly different Bayesian linear regression models by consuming data streams generated by data source nodes over a network. We formulate this as a social welfare optimization problem in which the global objective is the sum of the experimental design objectives of the individual learners, and the decision variables are the data transmission strategies, subject to network constraints. We first show that, assuming Poisson data streams, the global objective is a continuous DR-submodular function. We then propose a Frank-Wolfe type algorithm that outputs a solution within a factor of 1-1/e of the optimal value. Our algorithm contains a novel gradient estimation component, which is carefully designed based on Poisson tail bounds and sampling. Finally, we complement our theoretical findings with extensive experiments. Our numerical evaluation shows that the proposed algorithm outperforms several baseline algorithms both in maximizing the global objective and in the quality of the trained models.
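To make the pipeline described in the abstract concrete, the following is a minimal sketch of a Frank-Wolfe type ascent on an expected D-optimal design objective with Poisson sample counts. It is an illustration under simplifying assumptions, not the algorithm of the paper: it uses a single learner, it replaces the network constraints with a hypothetical total-rate budget and per-source cap, and it estimates the gradient by plain Monte Carlo using the standard Poisson identity d/dλ E[f(N)] = T · E[f(N+1) − f(N)] for N ~ Poisson(λT), without the tail-bound truncation the abstract mentions. All function and parameter names (`d_optimal_gain`, `estimate_gradient`, `linear_maximizer`, `frank_wolfe`, `budget`, `cap`, `horizon`) are invented for this example.

```python
import numpy as np


def d_optimal_gain(counts, features, prior_precision, noise_var):
    """Log-det of the posterior precision minus log-det of the prior precision.

    counts[i] is how many samples with feature vector features[i] were used.
    """
    info = prior_precision + (features.T * counts) @ features / noise_var
    return np.linalg.slogdet(info)[1] - np.linalg.slogdet(prior_precision)[1]


def estimate_gradient(rates, features, prior_precision, noise_var,
                      horizon, rng, num_samples=50):
    """Monte Carlo estimate of the gradient of E[gain] w.r.t. the Poisson rates.

    Uses d/d(rate_i) E[f(N)] = horizon * E[f(N + e_i) - f(N)]; this is an
    assumption of the sketch, not the estimator from the paper.
    """
    grad = np.zeros(len(rates))
    for _ in range(num_samples):
        counts = rng.poisson(rates * horizon).astype(float)
        base = d_optimal_gain(counts, features, prior_precision, noise_var)
        for i in range(len(rates)):
            counts[i] += 1.0
            grad[i] += d_optimal_gain(counts, features, prior_precision,
                                      noise_var) - base
            counts[i] -= 1.0
    return horizon * grad / num_samples


def linear_maximizer(grad, budget, cap):
    """Maximize <v, grad> over 0 <= v <= cap, sum(v) <= budget (greedy fill)."""
    v = np.zeros(len(grad))
    remaining = budget
    for i in np.argsort(-np.asarray(grad)):
        if grad[i] <= 0 or remaining <= 0:
            break
        v[i] = min(cap, remaining)
        remaining -= v[i]
    return v


def frank_wolfe(features, prior_precision, noise_var, horizon,
                budget, cap, num_steps=10, seed=0):
    """Start at zero and take num_steps steps of size 1/num_steps."""
    rng = np.random.default_rng(seed)
    rates = np.zeros(features.shape[0])
    for _ in range(num_steps):
        grad = estimate_gradient(rates, features, prior_precision,
                                 noise_var, horizon, rng)
        rates += linear_maximizer(grad, budget, cap) / num_steps
    return rates


if __name__ == "__main__":
    # Hypothetical toy instance: four sources, two-dimensional features.
    feats = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.2, 0.1]])
    print(frank_wolfe(feats, np.eye(2), noise_var=1.0, horizon=5.0,
                      budget=2.0, cap=1.0))
```

Because the iterate is an average of feasible points of a convex set, the returned rates stay within the budget and the per-source cap. A real instance would replace `linear_maximizer` with a linear program over the actual network constraints.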