In the online learning with experts problem, an algorithm must predict an outcome on each of $T$ days (or time steps), given a set of $n$ experts who each make a prediction on every day. After each day, the algorithm receives feedback on the outcome, including the cost of its own prediction and the costs of the experts' predictions, and the goal is to minimize its total cost relative to that of the best expert in the set. Recent work by Srinivas, Woodruff, Xu, and Zhou (STOC 2022) introduced the study of the online learning with experts problem under memory constraints. However, the predictions made by experts or algorithms at some time often influence future outcomes, so that the input is adaptively chosen. Whereas deterministic algorithms would be robust to adaptive inputs, existing algorithms all crucially use randomization to sample a small number of experts. In this paper, we study deterministic and robust algorithms for the experts problem. We first show a space lower bound of $\widetilde{\Omega}\left(\frac{nM}{RT}\right)$ for any deterministic algorithm that achieves regret $R$ when the best expert makes $M$ mistakes. Our result shows that the natural deterministic algorithm, which iterates through pools of experts until each expert in the pool has erred, is optimal up to polylogarithmic factors. On the positive side, we give a randomized algorithm that is robust to adaptive inputs and uses $\widetilde{O}\left(\frac{n}{R\sqrt{T}}\right)$ space for $M=O\left(\frac{R^2 T}{\log^2 n}\right)$, thereby showing a smooth space-regret trade-off.
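The natural deterministic algorithm mentioned above can be illustrated with a minimal sketch. The abstract specifies only that the algorithm iterates through pools of experts until each expert in the pool has erred; everything else here is an assumption made for illustration: binary predictions, a majority vote over the pool members that have not yet erred, ties broken toward 1, wrap-around to the first pool after the last, and the hypothetical names `pool_rotation_predict` and `pool_size`. The function takes the full prediction matrix only for convenience; it reads one day at a time and keeps just the current pool's indices as state.

```python
from typing import List, Sequence


def pool_rotation_predict(
    expert_preds: Sequence[Sequence[int]],
    outcomes: Sequence[int],
    pool_size: int,
) -> List[int]:
    """Return the 0/1 prediction made on each day.

    expert_preds[t][i] is expert i's 0/1 prediction on day t;
    outcomes[t] is the true 0/1 outcome of day t.
    """
    if pool_size < 1:
        raise ValueError("pool_size must be at least 1")
    if not expert_preds:
        return []
    n = len(expert_preds[0])
    if n < 1:
        raise ValueError("need at least one expert")
    k = min(pool_size, n)

    start = 0                      # index of the first expert in the current pool
    alive = list(range(k))         # pool members that have not erred yet
    predictions: List[int] = []

    for preds, outcome in zip(expert_preds, outcomes):
        # Assumed rule: majority vote of surviving pool members, ties go to 1.
        votes_for_one = sum(preds[i] for i in alive)
        predictions.append(1 if 2 * votes_for_one >= len(alive) else 0)

        # Drop every pool member that erred on this day.
        alive = [i for i in alive if preds[i] == outcome]

        # Once every expert in the pool has erred, load the next pool.
        if not alive:
            start = (start + k) % n
            alive = [(start + j) % n for j in range(k)]

    return predictions
```

For example, with four experts and `pool_size=2`, the sketch stays on experts 0 and 1 until both have erred and then moves on to experts 2 and 3. The working state is the list `alive` of at most `pool_size` indices, which is the knob that trades memory against regret in this illustration.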