通过装配点优化的 SVM : 新界圈和分布的比值 (SVM via Saddle Point Optimization: New Bounds and Distributed Algorithms)

We study two important SVM variants: hard-margin SVM (for linearly separable cases) and $\nu$-SVM (for linearly non-separable cases). We propose new algorithms from the perspective of saddle point optimization. Our algorithms achieve $(1-\epsilon)$-approximations with running time $\tilde{O}(nd+n\sqrt{d / \epsilon})$ for both variants, where $n$ is the number of points and $d$ is the dimensionality. To the best of our knowledge, the current best algorithm for $\nu$-SVM is based on quadratic programming approach which requires $\Omega(n^2 d)$ time in worst case~\cite{joachims1998making,platt199912}. In the paper, we provide the first nearly linear time algorithm for $\nu$-SVM. The current best algorithm for hard margin SVM achieved by Gilbert algorithm~\cite{gartner2009coresets} requires $O(nd / \epsilon )$ time. Our algorithm improves the running time by a factor of $\sqrt{d}/\sqrt{\epsilon}$. Moreover, our algorithms can be implemented in the distributed settings naturally. We prove that our algorithms require $\tilde{O}(k(d +\sqrt{d/\epsilon}))$ communication cost, where $k$ is the number of clients, which almost matches the theoretical lower bound. Numerical experiments support our theory and show that our algorithms converge faster on high dimensional, large and dense data sets, as compared to previous methods.

翻译：我们研究两个重要的 SVM 变量: 两个变量的硬差 SVM (对于线性分解案例) 和 $\ nu$- SVM (对于线性非可分离案例) 。我们从马鞍点优化的角度提出新的算法。我们的算法在最差的情况下需要$( 1-\ epsilon) $- 套用时间运行时间 $\ tilde{ O} (n+\ sqrt{d\ d /\\ epsilon} ) 。在纸张中, 我们为 $\ 美元支持提供了第一个近线性时间算法, 美元是点数, 美元是维基尔伯特算=cal\ n2009} 美元目前最好的算法, 美元- 美元- SVM 的当前算法以美元运算算算算法, 需要美元/ 美元运算数据。