Optimistic Online Learning aims to exploit experts conveying reliable information to predict the future. However, this implicit optimism may be challenged when it comes to crafting such experts in practice. A fundamental example consists in approximating a minimiser of the current problem and using it as an expert. In dynamic environments, such an expert conveys only partially relevant information, as it may lead to overfitting. To tackle this issue, we introduce the \emph{optimistically tempered} (OT) online learning framework, designed to handle such imperfect experts. As a first contribution, we show that tempered optimism is a fruitful paradigm for Online Non-Convex Learning by proposing simple yet powerful modifications of Online Gradient Descent and Online Mirror Descent. Second, we derive an OT algorithm for convex losses, and third, we evaluate the practical efficiency of tempered optimism on real-life datasets and a toy experiment.
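For background, the optimistic variant of Online Gradient Descent that the abstract builds on can be sketched as follows. This is a minimal sketch of the \emph{standard} optimistic update, in which the learner fully trusts a hint $m_t$ (the expert's prediction of the next gradient) before observing the true gradient $g_t$; it does not implement the paper's tempering mechanism, and all function and variable names are illustrative assumptions.

```python
import numpy as np

def optimistic_ogd(grad_fn, hint_fn, x0, eta, T):
    """Standard optimistic Online Gradient Descent (unconstrained sketch).

    Maintains a base iterate x_hat updated with observed gradients, and
    plays x_hat shifted by the expert hint m_t before g_t is revealed.
    Names (grad_fn, hint_fn) are illustrative, not from the paper.
    """
    x_hat = np.asarray(x0, dtype=float)
    played = []
    for t in range(T):
        m = hint_fn(t)           # expert's hint: predicted next gradient
        x = x_hat - eta * m      # optimistic step, trusting the hint fully
        played.append(x)
        g = grad_fn(x, t)        # true gradient revealed after playing x
        x_hat = x_hat - eta * g  # standard OGD update on the base iterate
    return played
```

When the hints are accurate ($m_t \approx g_t$), this update enjoys smaller regret than plain OGD; the paper's point is precisely that an imperfect expert can make full trust in $m_t$ harmful, motivating a tempered use of the hint.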