An effective approach in meta-learning is to utilize multiple "train tasks" to learn a good initialization for model parameters that can help solve unseen "test tasks" with very few samples by fine-tuning from this initialization. Although successful in practice, theoretical understanding of such methods is limited. This work studies an important aspect of these methods: splitting the data from each task into train (support) and validation (query) sets during meta-training. Inspired by recent work (Raghu et al., 2020), we view such meta-learning methods through the lens of representation learning and argue that the train-validation split encourages the learned representation to be low-rank without compromising on expressivity, as opposed to the non-splitting variant that encourages high-rank representations. Since sample efficiency benefits from low-rankness, the splitting strategy will require very few samples to solve unseen test tasks. We present theoretical results that formalize this idea for linear representation learning on a subspace meta-learning instance, and experimentally verify this practical benefit of splitting in simulations and on standard meta-learning benchmarks.
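To make the splitting versus non-splitting distinction concrete, below is a minimal numpy sketch of the two meta-objectives for a linear representation on a shared-subspace task distribution. This is not the paper's exact construction: the dimensions, noise level, sample counts, and the least-squares task head are illustrative assumptions.

```python
# Minimal sketch (illustrative, not the paper's exact setup): contrasting the
# train-validation split meta-objective with the non-splitting variant for a
# linear representation x -> B^T x on a shared-subspace task distribution.
import numpy as np

rng = np.random.default_rng(0)
d, k, n_tasks, n_per_task = 20, 2, 50, 10   # ambient dim, subspace dim, #tasks, samples/task

# Ground-truth k-dimensional subspace shared across tasks (subspace meta-learning instance).
B_star, _ = np.linalg.qr(rng.normal(size=(d, k)))

def sample_task():
    w = rng.normal(size=k)                              # task-specific head
    X = rng.normal(size=(n_per_task, d))
    y = X @ B_star @ w + 0.1 * rng.normal(size=n_per_task)
    return X, y

tasks = [sample_task() for _ in range(n_tasks)]

def fit_head(B, X, y):
    """Least-squares head on top of the representation x -> B^T x."""
    return np.linalg.lstsq(X @ B, y, rcond=None)[0]

def split_objective(B, tasks, n_support=5):
    """Train-validation split: fit the head on the support half,
    score it on the held-out query half."""
    losses = []
    for X, y in tasks:
        Xs, ys = X[:n_support], y[:n_support]           # support (train) set
        Xq, yq = X[n_support:], y[n_support:]           # query (validation) set
        w = fit_head(B, Xs, ys)
        losses.append(np.mean((Xq @ B @ w - yq) ** 2))
    return np.mean(losses)

def no_split_objective(B, tasks):
    """Non-splitting variant: fit and evaluate the head on the same data,
    which rewards representations that can interpolate each task."""
    losses = []
    for X, y in tasks:
        w = fit_head(B, X, y)
        losses.append(np.mean((X @ B @ w - y) ** 2))
    return np.mean(losses)

# A full-rank representation interpolates each task, so the non-split objective
# cannot distinguish it from the correct low-rank one; the split objective can.
B_full = np.eye(d)                                      # high-rank representation
B_low = B_star                                          # correct low-rank representation
print("no-split:", no_split_objective(B_full, tasks), no_split_objective(B_low, tasks))
print("split:   ", split_objective(B_full, tasks), split_objective(B_low, tasks))
```

Under this toy setup, the non-split objective is near zero for both the full-rank and the low-rank representation (the head can interpolate the d-dimensional data it was fit on), while the split objective penalizes the full-rank representation on the held-out query points and favors the low-rank one, mirroring the abstract's claim.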