We consider the classical problem of prediction with expert advice. In the fixed-time setting, where the time horizon is known in advance, algorithms that achieve the optimal regret are known when there are two, three, or four experts or when the number of experts is large. Much less is known about the problem in the anytime setting, where the time horizon is not known in advance. No minimax optimal algorithm was previously known in the anytime setting, regardless of the number of experts. Even for the case of two experts, Luo and Schapire have left open the problem of determining the optimal algorithm. We design the first minimax optimal algorithm for minimizing regret in the anytime setting. We consider the case of two experts, and prove that the optimal regret is $\gamma \sqrt{t} / 2$ at all time steps $t$, where $\gamma$ is a natural constant that arose 35 years ago in studying fundamental properties of Brownian motion. The algorithm is designed by considering a continuous analogue of the regret problem, which is solved using ideas from stochastic calculus.
翻译:我们从专家咨询的角度来考虑典型的预测问题。 在固定时间的环境下,时间范围是预先知道的,当有2、3或4名专家或专家人数众多时,就会知道实现最佳遗憾的算法。在时间跨度不为人知、时间跨度不为人知的时段里,对问题知之甚少。无论专家人数多多,在时间跨度上,以前从未知道任何微型最大最佳算法。即使有两位专家,Luo和Schapire,也留下了确定最佳算法的问题。我们设计了第一个在时间跨度上最大限度地减少遗憾的微小算法。我们考虑了2名专家的情况,并证明最好的遗憾是$\gamma\ sqrt{t} / 2$tt, 美元是35年前在研究布朗运动的基本特性时产生的自然常数。算法的设计是考虑一个连续的遗憾问题的类比,这个问题是利用从微量的计算中解决的。