In this paper, we study the generalization properties of Model-Agnostic Meta-Learning (MAML) algorithms for supervised learning problems. We focus on the setting in which we train the MAML model over $m$ tasks, each with $n$ data points, and characterize its generalization error from two points of view: First, we assume the new task at test time is one of the training tasks, and we show that, for strongly convex objective functions, the expected excess population loss is bounded by $\mathcal{O}(1/mn)$. Second, we consider the MAML algorithm's generalization to an unseen task and show that the resulting generalization error depends on the total variation distance between the underlying distribution of the new task and the distributions of the tasks observed during training. Our proof techniques rely on the connection between algorithmic stability and the generalization bounds of learning algorithms. In particular, we propose a new definition of stability for meta-learning algorithms, which allows us to capture the effect of both the number of tasks $m$ and the number of samples per task $n$ on the generalization error of MAML.
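For concreteness, a minimal sketch of the MAML training objective referred to above, assuming a single inner gradient step with step size $\alpha$ (the symbols $\alpha$ and $\mathcal{L}_i$ are our notation for illustration, not taken from the abstract):
$$
\min_{w} \; \frac{1}{m} \sum_{i=1}^{m} \mathcal{L}_i\big( w - \alpha \nabla \mathcal{L}_i(w) \big),
$$
where $\mathcal{L}_i$ denotes the loss of task $i$; in training, each $\mathcal{L}_i$ is replaced by its empirical estimate over the $n$ samples of that task.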