Projection-free optimization algorithms, which are mostly based on the classical Frank-Wolfe method, have gained significant interest in the machine learning community in recent years. This is due to their ability to handle convex constraints that are popular in many applications, but for which computing projections is often computationally impractical in high-dimensional settings, which prohibits the use of most standard projection-based methods. In particular, significant research effort has been devoted to projection-free methods for online learning. In this paper we revisit the Online Frank-Wolfe (OFW) method suggested by Hazan and Kale \cite{Hazan12} and fill a gap that has gone unnoticed for several years: OFW achieves a faster regret bound of $O(T^{2/3})$ on strongly convex functions (as opposed to the standard $O(T^{3/4})$ for convex but not strongly convex functions), where $T$ is the sequence length. This is somewhat surprising since it is known that, for offline optimization, strong convexity does not in general lead to faster rates for Frank-Wolfe. We also revisit the bandit setting under strong convexity and prove a similar bound of $\tilde{O}(T^{2/3})$ (instead of $O(T^{3/4})$ without strong convexity). Hence, in the current state of affairs, the best projection-free upper bounds for both the full-information and bandit settings with strongly convex and nonsmooth functions match up to logarithmic factors in $T$.
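For context, a minimal sketch of the OFW update in the spirit of \cite{Hazan12} (the surrogate and step-size schedule below follow the general template and are illustrative, not the tuned choices from the analysis): at round $t$ the learner plays $x_t$, observes the loss $f_t$, and performs a single Frank-Wolfe step on a regularized surrogate,
\[
F_t(x) = \eta \sum_{s=1}^{t} \langle \nabla f_s(x_s), x \rangle + \|x - x_1\|^2,
\qquad
v_t \in \arg\min_{v \in \mathcal{K}} \langle \nabla F_t(x_t), v \rangle,
\qquad
x_{t+1} = x_t + \sigma_t (v_t - x_t),
\]
with a decaying step size $\sigma_t \in (0,1]$. Since $x_{t+1}$ is a convex combination of points in the feasible set $\mathcal{K}$, feasibility is maintained automatically, and the only access to $\mathcal{K}$ is through the linear optimization oracle computing $v_t$; this is precisely what makes the method projection-free.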