We consider parameter estimation in distributed networks, where each sensor in the network observes an independent sample from an underlying distribution and has $k$ bits to communicate its sample to a centralized processor, which computes an estimate of a desired parameter. We develop lower bounds for the minimax risk of estimating the underlying parameter for a large class of losses and distributions. Our results show that under mild regularity conditions, the communication constraint reduces the effective sample size by a factor of $d$ when $k$ is small, where $d$ is the dimension of the estimated parameter. Furthermore, this penalty decreases at most exponentially with increasing $k$, and this exponential rate is attained for some models, e.g., estimating high-dimensional distributions. For other models, however, we show that the sample size reduction is remediated only linearly with increasing $k$, e.g., when some sub-Gaussian structure is available. We apply our results to the distributed setting with the product Bernoulli model, the multinomial model, Gaussian location models, and logistic regression, recovering or strengthening existing results. Our approach deviates significantly from existing approaches for developing information-theoretic lower bounds for communication-efficient estimation. We circumvent the need for the strong data processing inequalities used in prior work and develop a geometric approach that builds on a new representation of the communication constraint. This approach allows us to strengthen and generalize existing results with simpler and more transparent proofs.