We consider the problem of sparse normal means estimation in a distributed setting with communication constraints. We assume there are $M$ machines, each holding a $d$-dimensional observation of a $K$-sparse vector $\mu$ corrupted by additive Gaussian noise. A central fusion machine is connected to the $M$ machines in a star topology, and its goal is to estimate the vector $\mu$ with a low communication budget. Previous work has shown that to achieve the centralized minimax rate for the $\ell_2$ risk, the total communication must be high: at least linear in the dimension $d$. This phenomenon occurs, however, only for very weak signals. We show that once the signal-to-noise ratio (SNR) is slightly higher, the support of $\mu$ can be correctly recovered with much less communication. Specifically, we present two algorithms for the distributed sparse normal means problem, and prove that above a certain SNR threshold, with high probability, they recover the correct support with total communication that is sublinear in the dimension $d$. Furthermore, the communication decreases exponentially as a function of signal strength. If, in addition, $KM\ll d$, then with an additional round of sublinear communication, our algorithms achieve the centralized rate for the $\ell_2$ risk. Finally, we present simulations that illustrate the performance of our algorithms in different parameter regimes.
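To make the setting concrete, the following is a minimal simulation sketch of the model described in the abstract: $M$ machines each observe $\mu$ plus standard Gaussian noise, send a few coordinate indices to the fusion machine, and the fusion machine votes. It is not one of the two algorithms of the paper. The function name `simulate_threshold_vote`, the choice of equal nonzero entries, the local threshold set equal to the signal strength, and the bit accounting of $\lceil \log_2 d \rceil$ bits per index are all assumptions made for illustration.

```python
import numpy as np


def simulate_threshold_vote(d=10_000, K=5, M=20, signal=4.0, seed=0):
    """Toy one-round scheme (illustrative assumption, not the paper's method).

    Each machine sends the indices of coordinates whose observation exceeds
    a local threshold; the fusion machine keeps the K most-voted indices.
    """
    rng = np.random.default_rng(seed)

    # K-sparse mean vector: K coordinates equal to `signal`, the rest zero.
    support = rng.choice(d, size=K, replace=False)
    mu = np.zeros(d)
    mu[support] = signal

    # Row i is the observation of machine i: mu plus N(0, 1) noise.
    X = mu + rng.standard_normal((M, d))

    # Hypothetical local rule: report index j when X[i, j] > threshold.
    # The threshold is set to the (assumed known) signal strength.
    threshold = signal
    sent = X > threshold

    # Fusion machine counts votes per coordinate and keeps the top K.
    votes = sent.sum(axis=0)
    estimated = np.argsort(votes)[-K:]

    # Communication: one index of ceil(log2 d) bits per reported coordinate.
    bits = int(sent.sum()) * int(np.ceil(np.log2(d)))
    return set(support.tolist()), set(estimated.tolist()), bits


if __name__ == "__main__":
    true_support, est_support, bits = simulate_threshold_vote()
    print("support recovered:", est_support == true_support)
    print("total bits sent:", bits, "vs dimension d = 10000")
```

In this toy scheme a higher threshold makes false reports from the $d-K$ null coordinates rarer, since the Gaussian tail decays rapidly; this is only an informal illustration of why communication can shrink as the signal grows, not a reproduction of the paper's analysis.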