Inspired by the demands of real-time climate and weather forecasting, we develop optimistic online learning algorithms that require no parameter tuning and have optimal regret guarantees under delayed feedback. Our algorithms (DORM, DORM+, and AdaHedgeD) arise from a novel reduction of delayed online learning to optimistic online learning that reveals how optimistic hints can mitigate the regret penalty caused by delay. We pair this delay-as-optimism perspective with a new analysis of optimistic learning that exposes its robustness to hinting errors and a new meta-algorithm for learning effective hinting strategies in the presence of delay. We conclude by benchmarking our algorithms on four subseasonal climate forecasting tasks, demonstrating low regret relative to state-of-the-art forecasting models.
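To make the delay-as-optimism idea concrete, the following is a minimal sketch of a regret-matching-style learner over a finite set of experts, in which a hint stands in for the regrets whose feedback has not yet arrived. It is an illustration under stated assumptions, not the DORM, DORM+, or AdaHedgeD algorithms themselves: the function names (`run_delayed_optimistic_rm`, `zero_hint`, `recent_hint`), the hinter interface, and the convention that the loss of round t is revealed `delay` rounds late are all hypothetical choices made for this example.

```python
def zero_hint(observed, n_missing, d):
    """No optimism: guess that the missing regrets sum to zero."""
    return [0.0] * d


def recent_hint(observed, n_missing, d):
    """Guess that every missing round repeats the latest observed regret."""
    if not observed:
        return [0.0] * d
    return [n_missing * r for r in observed[-1]]


def run_delayed_optimistic_rm(losses, delay, hinter):
    """Play one weight vector per round when feedback arrives `delay` rounds late.

    losses: list of per-round loss vectors, one entry per expert.
    delay:  at round t (0-indexed) only rounds 0 .. t-delay-1 have been observed.
    hinter: hypothetical callback returning a guess for the summed missing regrets.
    """
    d = len(losses[0])
    plays, regrets, observed = [], [], []
    cum = [0.0] * d  # sum of the regret vectors observed so far
    for t, loss in enumerate(losses):
        s = t - delay - 1  # round whose feedback becomes visible now
        if s >= 0:
            observed.append(regrets[s])
            cum = [c + r for c, r in zip(cum, regrets[s])]
        # Rounds still lacking feedback, plus the round about to be played.
        n_missing = t - len(observed) + 1
        hint = hinter(observed, n_missing, d)
        # Regret matching: weight experts by positive (observed + hinted) regret.
        positive = [max(c + h, 0.0) for c, h in zip(cum, hint)]
        total = sum(positive)
        w = [p / total for p in positive] if total > 0 else [1.0 / d] * d
        plays.append(w)
        learner_loss = sum(wi * li for wi, li in zip(w, loss))
        # Instantaneous regret of the learner against each expert.
        regrets.append([learner_loss - li for li in loss])
    return plays, regrets


if __name__ == "__main__":
    toy_losses = [[0.0, 1.0, 1.0] for _ in range(10)]
    for hinter in (zero_hint, recent_hint):
        plays, regrets = run_delayed_optimistic_rm(toy_losses, delay=2, hinter=hinter)
        worst = max(sum(r[i] for r in regrets) for i in range(3))
        print(hinter.__name__, "regret vs. best expert:", worst)
```

With `zero_hint` the learner simply ignores the missing feedback, whereas `recent_hint` fills the gap with a guess. In this sketch a hint that is close to the true missing regret recovers roughly what an undelayed learner would have played, which is the intuition behind treating delay as a hinting problem.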