Model-Agnostic Meta-Learning (MAML) and its variants have achieved success in meta-learning tasks across many datasets and settings. Yet, we have only begun to understand and analyze how they are able to adapt quickly to new tasks. For example, one popular hypothesis is that the algorithms learn good representations for transfer, as in multi-task learning. In this work, we contribute a series of empirical and theoretical studies, and discover several interesting yet previously unknown properties of the algorithm. We find that MAML adapts better with a deep architecture even if the tasks need only a shallow one (and thus, no representation learning is needed). While echoing previous findings by others that the bottom layers in deep architectures enable representation learning, we also find that the upper layers enable fast adaptation: they are meta-learned to perform adaptive gradient updates when generalizing to new tasks. Motivated by these findings, we study several meta-optimization approaches and propose a new one for learning to optimize adaptively. These approaches attain stronger performance than MAML when meta-learning both shallower and deeper architectures.
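Since the abstract refers to MAML's meta-learned gradient updates, the following is a minimal sketch of the standard bi-level (inner/outer loop) MAML procedure on toy sine-wave regression. The 2-layer MLP, task sampler, and hyperparameters are illustrative assumptions, not the paper's experimental setup.

```python
# Minimal MAML sketch (illustrative; not the paper's code or architecture).
import torch
import torch.nn.functional as F

def init_params(hidden=40):
    # A tiny 2-layer MLP, parameterized explicitly so inner-loop updates stay differentiable.
    return [torch.randn(hidden, 1) * 0.1, torch.zeros(hidden),
            torch.randn(1, hidden) * 0.1, torch.zeros(1)]

def forward(params, x):
    w1, b1, w2, b2 = params
    return F.linear(torch.relu(F.linear(x, w1, b1)), w2, b2)

def sample_task(n_support=10, n_query=10):
    # One task = one sine wave; support and query points come from the same wave.
    amp, phase = torch.rand(1) * 4 + 0.1, torch.rand(1) * 3.1416
    x = torch.rand(n_support + n_query, 1) * 10 - 5
    y = amp * torch.sin(x + phase)
    return (x[:n_support], y[:n_support]), (x[n_support:], y[n_support:])

params = [p.requires_grad_() for p in init_params()]
meta_opt = torch.optim.Adam(params, lr=1e-3)
inner_lr, inner_steps = 0.01, 1

for it in range(1000):
    meta_loss = 0.0
    for _ in range(4):  # meta-batch of tasks
        (xs, ys), (xq, yq) = sample_task()
        fast = params
        for _ in range(inner_steps):  # inner loop: task-specific adaptation
            loss = F.mse_loss(forward(fast, xs), ys)
            grads = torch.autograd.grad(loss, fast, create_graph=True)
            fast = [p - inner_lr * g for p, g in zip(fast, grads)]
        meta_loss = meta_loss + F.mse_loss(forward(fast, xq), yq)  # post-adaptation loss
    meta_opt.zero_grad()
    meta_loss.backward()  # outer loop: backprop through the inner updates to the initialization
    meta_opt.step()
```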