Prediction with experts' advice is one of the most fundamental problems in online learning and captures many of its technical challenges. A recent line of work studies online learning through the lens of differential equations and continuous-time analysis, a viewpoint that has yielded optimal results for several problems in online learning. In this paper, we employ continuous-time stochastic calculus to study the discrete-time experts' problem. We use these tools to design a continuous-time, parameter-free algorithm with improved quantile regret guarantees. We then develop an analogous discrete-time algorithm with a very similar analysis and identical quantile regret bounds. Finally, we design an anytime continuous-time algorithm whose regret matches the optimal fixed-time rate when the gains are independent Brownian motions; in many settings, this is the most difficult case. This gives some evidence that, even with adversarial gains, the optimal anytime and fixed-time regrets may coincide.
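As background, one standard formalization of the experts problem and of quantile regret is the following (the notation here is illustrative and need not match the paper's). At each round $t = 1, \dots, T$ the learner plays a distribution $p_t$ over $N$ experts, the adversary reveals gains $g_t \in [0,1]^N$, and the learner earns $\langle p_t, g_t \rangle$. Writing $i_k$ for the expert with the $k$-th largest cumulative gain, the regret to expert $i$ and the $\varepsilon$-quantile regret are
\[
  R_T(i) \;=\; \sum_{t=1}^{T} g_{t,i} \;-\; \sum_{t=1}^{T} \langle p_t, g_t \rangle,
  \qquad
  R_T^{\varepsilon} \;=\; R_T\bigl(i_{\lceil \varepsilon N \rceil}\bigr),
\]
so the algorithm is compared against the top $\varepsilon$-fraction of experts rather than only the single best one.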