Online learning methods, such as the online gradient algorithm (OGA) and exponentially weighted aggregation (EWA), often depend on tuning parameters that are difficult to set in practice. We consider an online meta-learning scenario and propose a meta-strategy that learns these parameters from past tasks. Our strategy is based on the minimization of a regret bound. It allows us to learn, with guarantees, the initialization and the step size in OGA, as well as the prior or the learning rate in EWA. We provide a regret analysis of the strategy, which identifies settings where meta-learning indeed improves on learning each task in isolation.
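As a rough illustration of the idea, the sketch below runs OGA on a sequence of toy quadratic tasks and updates the initialization across tasks. The meta-update shown (moving the initialization toward each task's final iterate) is a simple surrogate, not the paper's regret-bound minimization; the task construction and all parameter values are hypothetical.

```python
import numpy as np

def run_oga(w0, eta, loss_grad, T):
    # Online gradient algorithm: T rounds of gradient steps from init w0.
    w = w0.copy()
    cum_loss = 0.0
    for t in range(T):
        loss, g = loss_grad(w, t)
        cum_loss += loss
        w -= eta * g
    return cum_loss, w

def make_task(center):
    # Toy task: quadratic losses centered at `center` (hypothetical setup).
    def loss_grad(w, t):
        diff = w - center
        return 0.5 * np.dot(diff, diff), diff
    return loss_grad

rng = np.random.default_rng(0)
# Related tasks: optima clustered around a common point.
centers = [np.array([2.0, -1.0]) + 0.1 * rng.standard_normal(2)
           for _ in range(20)]

w0 = np.zeros(2)              # meta-learned initialization
eta, meta_lr, T = 0.1, 0.5, 50
losses = []
for c in centers:
    cum, w_final = run_oga(w0, eta, make_task(c), T)
    losses.append(cum)
    # Meta-step: pull the initialization toward the task's final iterate
    # (a crude stand-in for minimizing a regret bound over w0).
    w0 += meta_lr * (w_final - w0)

# Because the tasks are related, later tasks start near their optimum
# and incur a much smaller cumulative loss than the first one.
print(losses[0] > losses[-1])
```

When the tasks share structure, the learned initialization shrinks the per-task regret; when they do not, the meta-step brings no benefit, which matches the abstract's point that meta-learning helps only in certain settings.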