Acceleration of first-order methods is mainly obtained via inertial techniques à la Nesterov, or via nonlinear extrapolation. The latter has recently seen a surge of interest, with successful applications to gradient and proximal gradient methods. On many Machine Learning problems, coordinate descent achieves performance significantly superior to full-gradient methods. Speeding up coordinate descent in practice is not easy, however: inertially accelerated versions of coordinate descent are theoretically accelerated, but might not always lead to practical speed-ups. We propose an accelerated version of coordinate descent using extrapolation, showing considerable speed-ups in practice compared to inertially accelerated coordinate descent and extrapolated (proximal) gradient descent. Experiments on least squares, Lasso, elastic net and logistic regression validate the approach.
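To make the idea concrete, here is a minimal sketch (not the paper's implementation) of offline nonlinear (Anderson-type) extrapolation applied to cyclic coordinate descent on a least-squares problem; the helper names, the buffer size K, the Gram-matrix regularization, and the objective-decrease safeguard are all illustrative assumptions:

```python
import numpy as np

def cd_epoch(w, X, y):
    """One epoch of cyclic coordinate descent on 0.5 * ||y - X w||^2."""
    for j in range(X.shape[1]):
        # residual with feature j's contribution removed
        r = y - X @ w + X[:, j] * w[j]
        # exact minimization over coordinate j
        w[j] = X[:, j] @ r / (X[:, j] @ X[:, j])
    return w

def anderson_point(W, reg=1e-10):
    """Extrapolate from K+1 consecutive iterates W, shape (K+1, n_features)."""
    U = np.diff(W, axis=0)                  # K successive iterate differences
    G = U @ U.T + reg * np.eye(len(U))      # regularized Gram matrix (assumed)
    z = np.linalg.solve(G, np.ones(len(U)))
    c = z / z.sum()                         # coefficients summing to 1
    return c @ W[1:]                        # extrapolated point

rng = np.random.default_rng(0)
X, y = rng.standard_normal((50, 10)), rng.standard_normal(50)
w, K, buf = np.zeros(10), 5, []
obj = lambda w: 0.5 * np.sum((y - X @ w) ** 2)

for it in range(100):
    w = cd_epoch(w, X, y)
    buf.append(w.copy())
    if len(buf) == K + 1:                   # every K + 1 epochs, extrapolate
        w_e = anderson_point(np.array(buf))
        if obj(w_e) < obj(w):               # safeguard: accept only if it improves
            w = w_e
        buf = []
print(f"final objective: {obj(w):.2e}")
```

The safeguard step is one common way to keep such extrapolation schemes from degrading convergence: if the extrapolated point does not decrease the objective, the plain coordinate-descent iterate is kept instead.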