The Bregman proximal point algorithm (BPPA), one of the centerpieces of the optimization toolbox, has seen a growing number of applications. With a simple, easy-to-implement update rule, the algorithm comes with several compelling intuitions for its empirical success, yet rigorous justification remains largely unexplored. We study the computational properties of BPPA through classification tasks with separable data and demonstrate provable algorithmic regularization effects associated with BPPA. We show that BPPA attains a non-trivial margin, which depends closely on the condition number of the distance-generating function inducing the Bregman divergence. We further demonstrate that this dependence on the condition number is tight for a class of problems, which shows the importance of the divergence in determining the quality of the obtained solutions. In addition, we extend our findings to mirror descent (MD), for which we establish similar connections between the margin and the Bregman divergence. Through a concrete example, we show that BPPA/MD converges in direction to the maximal-margin solution with respect to the Mahalanobis distance. Our theoretical findings are among the first to demonstrate the benign learning properties of BPPA/MD, and they also support a careful choice of divergence in algorithm design.
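To make the mirror descent setting concrete, the following is a minimal sketch, not the authors' implementation. It assumes a quadratic distance-generating function w(θ) = ½ θᵀAθ (whose Bregman divergence is a squared Mahalanobis distance), the exponential loss, and a hypothetical four-point separable dataset. The matrix `A`, the step size, and the iteration count are illustrative choices. For this quadratic w, the mirror step ∇w(θ_new) = ∇w(θ) − η∇L(θ) reduces to a preconditioned gradient step.

```python
import numpy as np

def mirror_descent_quadratic(X, y, A, step=0.1, iters=2000):
    """Mirror descent on the exponential loss with w(theta) = 0.5 * theta^T A theta.

    Assumption: A is symmetric positive definite, so grad w(theta) = A @ theta
    and the mirror step has the closed form theta - step * A^{-1} grad L(theta).
    """
    theta = np.zeros(X.shape[1])
    A_inv = np.linalg.inv(A)
    for _ in range(iters):
        margins = y * (X @ theta)
        # Gradient of L(theta) = (1/n) * sum_i exp(-y_i * <x_i, theta>)
        grad = -(X * (y * np.exp(-margins))[:, None]).mean(axis=0)
        theta = theta - step * (A_inv @ grad)
    return theta

def normalized_margin(theta, X, y, A):
    """Smallest margin y_i * <x_i, theta>, normalized by the norm induced by A."""
    return np.min(y * (X @ theta)) / np.sqrt(theta @ A @ theta)

# Hypothetical linearly separable toy data (not from the paper).
X = np.array([[2.0, 1.0], [1.0, 2.0], [-2.0, -1.0], [-1.0, -2.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
A = np.diag([1.0, 4.0])  # illustrative choice of distance-generating matrix

theta = mirror_descent_quadratic(X, y, A)
margin = normalized_margin(theta, X, y, A)
```

Changing `A` changes the direction in which `theta` grows, which is one simple way to see how the choice of divergence can affect which separating solution the iterates favor.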