An Information-theoretic Method for Collaborative Distributed Learning with Limited Communication

In this paper, we study information transmission in a distributed learning framework in which each worker node may transmit only an $m$-dimensional statistic to improve the learning result at the target node. Specifically, we evaluate the corresponding expected population risk (EPR) in the large-sample regime. We prove that performance improves because the transmitted statistics help estimate the underlying distribution, with the estimation error measured as a mean square error under the EPR norm matrix. Accordingly, the desired statistics correspond to eigenvectors of this matrix, and the desired transmission allocates these eigenvectors among the nodes' statistics so that the EPR is minimized. Moreover, we provide analytical solutions for the desired statistics in the single-node and two-node cases, together with a geometric interpretation of the eigenvector selection. For the general case, we develop an efficient algorithm, based on node partitions, that outputs the allocation.
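
As a rough illustration of why eigenvectors appear, consider a generic linear-algebra sketch. It is not the paper's derivation, and every symbol in it is an assumption introduced here: $M \succeq 0$ is a $d \times d$ symmetric matrix standing in for the EPR norm matrix, with eigenpairs $(\lambda_i, v_i)$ ordered so that $\lambda_1 \ge \dots \ge \lambda_d$, and $U \in \mathbb{R}^{d \times m}$ is an orthonormal basis of the directions a single node's statistic resolves. If the error left after transmission is, up to scaling, isotropic on the orthogonal complement of $U$, the residual risk is

$$
R(U) = \operatorname{tr}\!\left(M\,(I - UU^\top)\right) = \operatorname{tr}(M) - \operatorname{tr}\!\left(U^\top M U\right).
$$

By Ky Fan's maximum principle, $\operatorname{tr}(U^\top M U) \le \sum_{i=1}^{m} \lambda_i$, with equality at $U = [v_1, \dots, v_m]$, so

$$
\min_{U^\top U = I_m} R(U) = \sum_{i=m+1}^{d} \lambda_i .
$$

Under these assumptions, the best $m$ directions are the top $m$ eigenvectors. With several nodes the question becomes which eigenvectors each node should carry, which is the allocation problem the abstract describes.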
