The online meta-learning framework has emerged as a powerful tool for the continual lifelong learning setting. The goal is for an agent to quickly learn new tasks by drawing on prior experience while facing tasks one after another. This formulation involves two levels: an outer level that learns meta-learners and an inner level that learns task-specific models using only a small amount of data from the current task. While existing methods provide static regret analysis for the online meta-learning framework, we establish performance guarantees in terms of dynamic regret, which handles changing environments from a global perspective. We also build on a generalized version of the adaptive gradient methods, covering both ADAM and ADAGRAD, to learn meta-learners at the outer level. We carry out our analyses in a stochastic setting and prove, in expectation, a logarithmic local dynamic regret bound that depends explicitly on the total number of iterations T and the parameters of the learner. In addition, we establish high-probability bounds on the convergence rates of the proposed algorithm under an appropriate selection of parameters, which have not been studied before.
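For reference, the distinction between the two regret notions can be made precise. The following is a standard definition sketch, not quoted from this paper; the notation $f_t$, $w_t$, $w_t^*$ is ours and may differ from the paper's. Static regret compares the learner against the single best fixed decision in hindsight, whereas dynamic regret compares it against a time-varying comparator that tracks the changing environment:
\[
\mathrm{Reg}^{\mathrm{static}}_T = \sum_{t=1}^{T} f_t(w_t) - \min_{w} \sum_{t=1}^{T} f_t(w),
\qquad
\mathrm{Reg}^{\mathrm{dynamic}}_T = \sum_{t=1}^{T} f_t(w_t) - \sum_{t=1}^{T} f_t(w_t^*),
\quad w_t^* \in \arg\min_{w} f_t(w).
\]
Since the comparator sequence $\{w_t^*\}$ can move with the environment, dynamic regret is a strictly stronger benchmark than static regret, which is what makes it suitable for the changing-task setting considered here.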
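As an illustration of the kind of outer-level update the abstract refers to, the sketch below shows a generalized adaptive gradient step whose parameter settings recover both ADAM and ADAGRAD. This is a minimal, self-contained sketch of the general technique, not the paper's algorithm; the function name, the drifting-quadratic demo, and all parameter values are our own illustrative assumptions.

import numpy as np

def adaptive_outer_step(w, grad, state, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One outer-level (meta) update of a generalized adaptive gradient method.

    Hypothetical sketch: beta1 > 0 with beta2 < 1 recovers ADAM, while
    beta1 = 0 with beta2 = 1 (cumulative sum) recovers diagonal ADAGRAD.
    """
    m, v, t = state
    t += 1
    m = beta1 * m + (1.0 - beta1) * grad           # first-moment estimate
    if beta2 < 1.0:
        v = beta2 * v + (1.0 - beta2) * grad**2    # EMA of squared grads (ADAM)
        m_hat = m / (1.0 - beta1**t)               # bias correction
        v_hat = v / (1.0 - beta2**t)
    else:
        v = v + grad**2                            # cumulative sum (ADAGRAD)
        m_hat, v_hat = m, v
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)    # preconditioned step
    return w, (m, v, t)

# Toy demo: a stream of slowly drifting quadratic "tasks"
# f_t(w) = ||w - c_t||^2 / 2, standing in for the changing environment.
rng = np.random.default_rng(0)
d = 5
w = np.zeros(d)
state = (np.zeros(d), np.zeros(d), 0)
for t in range(200):
    c_t = np.sin(t / 50.0) * np.ones(d)            # drifting per-round optimum
    grad = w - c_t + 0.01 * rng.normal(size=d)     # stochastic outer gradient
    w, state = adaptive_outer_step(w, grad, state, lr=0.1)

In the full online meta-learning loop, the stochastic gradient fed to this step would come from the inner level, i.e., from a task-specific model adapted on the small amount of data available for the current task.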