We consider the online linear optimization problem, where at every step the algorithm plays a point $x_t$ in the unit ball, and suffers loss $\langle c_t, x_t\rangle$ for some cost vector $c_t$ that is then revealed to the algorithm. Recent work showed that if an algorithm receives a hint $h_t$ that has non-trivial correlation with $c_t$ before it plays $x_t$, then it can achieve a regret guarantee of $O(\log T)$, improving on the bound of $\Theta(\sqrt{T})$ in the standard setting. In this work, we study the question of whether an algorithm really requires a hint at every time step. Somewhat surprisingly, we show that an algorithm can obtain $O(\log T)$ regret with just $O(\sqrt{T})$ hints under a natural query model; in contrast, we also show that $o(\sqrt{T})$ hints cannot guarantee better than $\Omega(\sqrt{T})$ regret. We give two applications of our result, to the well-studied setting of optimistic regret bounds and to the problem of online learning with abstention.
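For concreteness, the regret notion at play is the usual comparison against the best fixed point in the unit ball in hindsight (stated here under the standard normalization, which is an assumption on our part; the body of the paper fixes the exact convention), and the hint quality condition typically assumed in this line of work is that each unit-norm hint $h_t$ satisfies $\langle h_t, c_t\rangle \ge \alpha \|c_t\|$ for some $\alpha > 0$:
\[
  \mathrm{Regret}_T \;=\; \sum_{t=1}^{T} \langle c_t, x_t\rangle \;-\; \min_{\|x\| \le 1} \sum_{t=1}^{T} \langle c_t, x\rangle .
\]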