In this paper, we consider the distributed mean estimation problem in which the server has access to side information, e.g., its locally computed mean estimate or the information received from the distributed clients in previous iterations. We propose a practical and efficient estimator based on the r-bit Wyner-Ziv estimator of Mayekar et al., which requires no probabilistic assumptions on the data. Unlike the work of Mayekar et al., which exploits only the side information at the server, our scheme jointly exploits the correlation between the clients' data and the server's side information, as well as the correlation among the data of different clients. We derive an upper bound on the estimation error of the proposed estimator. Based on this upper bound, we provide two algorithms for choosing the input parameters of the estimator. Finally, we characterize the parameter regions in which our estimator outperforms the previous one.
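To make the role of side information concrete, the following is a minimal scalar sketch of Wyner-Ziv style (modulo) quantization. It is not the r-bit estimator of Mayekar et al. nor the scheme proposed here; it is a simplified illustration under the assumptions that each client holds a single real number, that rounding is deterministic, and that the server's side information `y` lies close to every client value. The names `encode`, `decode`, `estimate_mean`, `step`, and `levels` are hypothetical.

```python
import math


def encode(x, step, levels):
    """Client side: quantize x to the grid step * Z and send only the
    grid index modulo `levels`, i.e. about log2(levels) bits."""
    return round(x / step) % levels


def decode(msg, y, step, levels):
    """Server side: among grid indices congruent to msg (mod levels),
    return the grid point closest to the side information y."""
    base = round(y / step)            # grid index nearest to the side information
    diff = (msg - base) % levels      # offset from base, in 0 .. levels - 1
    if diff > levels // 2:            # wrap to the signed offset of smallest magnitude
        diff -= levels
    return (base + diff) * step


def estimate_mean(xs, y, step, levels):
    """Average the server-side reconstructions of all client values."""
    decoded = [decode(encode(x, step, levels), y, step, levels) for x in xs]
    return sum(decoded) / len(decoded)


if __name__ == "__main__":
    clients = [10.2, 9.9, 10.4]       # assumed client data (illustrative only)
    side_info = 10.0                  # assumed server side information
    step, levels = 0.25, 8            # 8 levels: 3 bits per client
    print("bits per client:", math.ceil(math.log2(levels)))
    print("estimate:", estimate_mean(clients, side_info, step, levels))
    print("true mean:", sum(clients) / len(clients))
```

In this sketch, decoding recovers the client's grid point only when the grid indices of `x` and `y` differ by less than `levels / 2`; the closer the side information is to the data, the fewer bits are needed for a given `step`. That trade-off is the intuition behind choosing the estimator's input parameters from an error bound.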