The Frank-Wolfe algorithm is a popular method in structurally constrained machine learning applications, due to its fast per-iteration complexity. However, one major limitation of the method is a slow rate of convergence that is difficult to accelerate due to erratic, zig-zagging step directions, even asymptotically close to the solution. We view this as an artifact of discretization; that is to say, the Frank-Wolfe \emph{flow}, which is its trajectory at asymptotically small step sizes, does not zig-zag, and reducing discretization error will go hand-in-hand in producing a more stabilized method, with better convergence properties. We propose two improvements: a multistep Frank-Wolfe method that directly applies optimized higher-order discretization schemes; and an LMO-averaging scheme with reduced discretization error, and whose local convergence rate over general convex sets accelerates from a rate of $O(1/k)$ to up to $O(1/k^{3/2})$.
翻译:Frank-Wolfe算法是结构约束机器学习应用中常用的方法,由于其每次迭代的复杂度快速。但是,该方法的一个主要缺点是收敛速度很慢,很难加速,因为步长方向令人难以预测,即使是渐进接近解。我们将这视为离散化的一种产物。也就是说,在渐进小的步长下,Frank-Wolfe流的轨迹不会呈现Z字形,降低离散化误差将促进产生更稳定的方法,具有更好的收敛性质。我们提出了两个改进:一种多步Frank-Wolfe方法,直接应用优化的高阶离散化方案;一种LMO-平均方案,具有较小的离散化误差,并且其在通用凸集上的局部收敛速度从$O(1/k)$加速到$O(1/k^{3/2})$。