We propose a distributed cubic regularization of the Newton method for solving (constrained) empirical risk minimization problems over a network of agents, modeled as an undirected graph. The algorithm employs an inexact, preconditioned Newton step at each agent's side: the gradient of the centralized loss is iteratively estimated via a gradient-tracking consensus mechanism, and the Hessian is subsampled over the local data sets. No Hessian matrices are thus exchanged over the network. We derive global complexity bounds for convex and strongly convex losses. Our analysis reveals an interesting interplay between sample and iteration/communication complexity: statistically accurate solutions are achievable in roughly the same number of iterations as the centralized cubic Newton method, with a communication cost per iteration of the order of $\widetilde{\mathcal{O}}\big(1/\sqrt{1-\rho}\big)$, where $\rho$ characterizes the connectivity of the network. This demonstrates a significant communication saving with respect to existing, statistically oblivious, distributed Newton-based methods over networks.
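A minimal sketch of the two mechanisms the abstract names, gradient-tracking estimation of the network-average gradient and a locally solved cubic-regularized Newton subproblem, on synthetic quadratic losses over a ring network. This is an illustration, not the paper's algorithm: the data, the Metropolis mixing matrix, the damping factor `alpha`, and the cubic constant `M` are all hypothetical choices, and the local Hessian `A[i]` stands in for a subsampled Hessian.

```python
import numpy as np

rng = np.random.default_rng(0)
m, d = 5, 3                                   # agents, problem dimension
# Hypothetical local quadratic losses f_i(x) = 0.5 x'A_i x - b_i'x, with
# similar Hessians (mimicking statistically similar local data sets).
A, b = [], []
for _ in range(m):
    Q = rng.standard_normal((d, d))
    A.append(np.eye(d) + 0.1 * Q @ Q.T)
    b.append(rng.standard_normal(d))

def grad(i, xi):
    return A[i] @ xi - b[i]

# Metropolis mixing matrix for a ring graph (symmetric, doubly stochastic).
W = np.zeros((m, m))
for i in range(m):
    W[i, i] = W[i, (i - 1) % m] = W[i, (i + 1) % m] = 1.0 / 3.0

def cubic_step(H, g, M=1.0, bisections=60):
    """Minimize g's + 0.5 s'Hs + (M/6)||s||^3 for H > 0.

    The minimizer satisfies (H + (M/2)||s|| I) s = -g; since ||s(r)||
    is decreasing in r, the fixed point ||s|| = r is found by bisection.
    """
    lo, hi = 0.0, 1e6
    s = np.zeros_like(g)
    for _ in range(bisections):
        r = 0.5 * (lo + hi)
        s = np.linalg.solve(H + 0.5 * M * r * np.eye(len(g)), -g)
        if np.linalg.norm(s) > r:
            lo = r
        else:
            hi = r
    return s

alpha = 0.2                                    # damped local step, for stability
x = np.zeros((m, d))                           # one iterate per agent
y = np.array([grad(i, x[i]) for i in range(m)])  # gradient trackers
for _ in range(400):
    g_old = np.array([grad(i, x[i]) for i in range(m)])
    # Consensus on the iterates plus a local cubic-Newton correction;
    # only x and y (vectors, never Hessians) would travel over the network.
    x = W @ x + alpha * np.array([cubic_step(A[i], y[i]) for i in range(m)])
    g_new = np.array([grad(i, x[i]) for i in range(m)])
    y = W @ y + g_new - g_old                  # track the average gradient

x_star = np.linalg.solve(sum(A), sum(b))       # centralized minimizer
```

After the loop every agent's iterate agrees with the minimizer of the sum of the local losses, even though no agent ever sees another agent's data or Hessian.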