The Frank-Wolfe algorithm is a popular method in structurally constrained machine learning applications, owing to its low per-iteration cost. However, a major limitation of the method is its slow rate of convergence, which is difficult to accelerate because of erratic, zig-zagging step directions, even asymptotically close to the solution. We view this as an artifact of discretization; that is, the Frank-Wolfe \emph{flow}, the method's trajectory in the limit of asymptotically small step sizes, does not zig-zag, and reducing the discretization error goes hand in hand with producing a more stable method with better convergence properties. We propose two improvements: a multistep Frank-Wolfe method that directly applies optimized higher-order discretization schemes, and an LMO-averaging scheme with reduced discretization error whose local convergence rate over general convex sets accelerates from $O(1/k)$ to up to $O(1/k^{3/2})$.
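For reference, the discussion above can be read against the classical Frank-Wolfe iteration over a compact convex set $\mathcal{D}$ and its continuous-time limit (the flow referred to in the abstract); the display below is the standard textbook form of the method and its forward-Euler interpretation, given only as background notation and not as a restatement of the specific multistep or LMO-averaging schemes proposed here:
\begin{align*}
s_k &= \operatorname*{arg\,min}_{s \in \mathcal{D}} \, \langle \nabla f(x_k), s \rangle,
\qquad
x_{k+1} = x_k + \gamma_k \,(s_k - x_k), \quad \gamma_k \in (0,1],\\
\dot{x}(t) &= s\big(x(t)\big) - x(t),
\qquad
s(x) = \operatorname*{arg\,min}_{s \in \mathcal{D}} \, \langle \nabla f(x), s \rangle,
\end{align*}
so that each discrete step calls the linear minimization oracle (LMO) once, and the iteration is the forward-Euler discretization of the flow with step size $\gamma_k$; the discretization error of this scheme is what the proposed methods aim to reduce.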