Despite the remarkable empirical success of deep meta-learning, theoretical understanding of overparameterized meta-learning remains limited. This paper studies the generalization of a widely used meta-learning approach, Model-Agnostic Meta-Learning (MAML), which aims to find a good initialization for fast adaptation to new tasks. Under a mixed linear regression model, we analyze the generalization properties of MAML trained with SGD in the overparameterized regime. We provide both upper and lower bounds on the excess risk of MAML, which capture how the SGD dynamics affect generalization. With these sharp characterizations, we further explore how various learning parameters impact the generalization capability of overparameterized MAML: we explicitly identify typical data and task distributions under which the generalization error diminishes with overparameterization, and we characterize the impact of the adaptation learning rate on both the excess risk and the early stopping time. Our theoretical findings are further validated by experiments.
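For concreteness, the sketch below illustrates the setting the abstract describes: one-step MAML trained with SGD on a mixed linear regression model, run in an overparameterized regime where the dimension exceeds the per-task sample size. All concrete values (the dimension `d`, sample size `n`, adaptation rate `alpha`, outer step size `eta`, and the three-component task mixture) are illustrative assumptions, not the paper's exact configuration.

```python
# A minimal sketch (assumed setup, not the paper's exact one) of one-step
# MAML trained with SGD on a mixed linear regression model. The
# overparameterized regime corresponds to d >> n.
import numpy as np

rng = np.random.default_rng(0)
d, n = 200, 20          # ambient dimension d exceeds per-task sample size n
alpha, eta = 0.1, 0.01  # inner (adaptation) rate and outer SGD step size
centers = rng.normal(size=(3, d)) / np.sqrt(d)  # mixture of task signals

def sample_task():
    """Mixed linear regression: y = <w_t, x> + noise, w_t from a mixture."""
    w_t = centers[rng.integers(3)] + 0.1 * rng.normal(size=d) / np.sqrt(d)
    X = rng.normal(size=(2 * n, d))
    y = X @ w_t + 0.1 * rng.normal(size=2 * n)
    return (X[:n], y[:n]), (X[n:], y[n:])  # support / query split

w0 = np.zeros(d)  # meta-initialization that MAML seeks
for step in range(2000):
    (Xs, ys), (Xq, yq) = sample_task()
    # Inner step: one gradient step on the support loss, starting from w0.
    w_adapt = w0 - alpha * Xs.T @ (Xs @ w0 - ys) / n
    # Outer step: SGD on the query loss at the adapted parameters. For
    # squared loss the inner update is linear in w0, so the meta-gradient
    # is (I - alpha * Xs^T Xs / n) applied to the query gradient.
    g_q = Xq.T @ (Xq @ w_adapt - yq) / n
    w0 -= eta * (g_q - alpha * Xs.T @ (Xs @ g_q) / n)

# Evaluate: adapt from the learned w0 on a fresh task, measure query MSE.
(Xs, ys), (Xq, yq) = sample_task()
w_adapt = w0 - alpha * Xs.T @ (Xs @ w0 - ys) / n
print("query MSE after one-step adaptation:", np.mean((Xq @ w_adapt - yq) ** 2))
```

Note how the adaptation rate `alpha` enters the outer SGD update through the preconditioning factor `(I - alpha * Xs^T Xs / n)`; it is exactly this coupling between the adaptation rate and the SGD dynamics that the paper's excess-risk bounds quantify.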