A Representation Learning Perspective on the Importance of Train-Validation Splitting in Meta-Learning

An effective approach in meta-learning is to utilize multiple "train tasks" to learn a good initialization for model parameters that can help solve unseen "test tasks" with very few samples by fine-tuning from this initialization. Although successful in practice, theoretical understanding of such methods is limited. This work studies an important aspect of these methods: splitting the data from each task into train (support) and validation (query) sets during meta-training. Inspired by recent work (Raghu et al., 2020), we view such meta-learning methods through the lens of representation learning and argue that the train-validation split encourages the learned representation to be low-rank without compromising on expressivity, as opposed to the non-splitting variant that encourages high-rank representations. Since sample efficiency benefits from low-rankness, the splitting strategy will require very few samples to solve unseen test tasks. We present theoretical results that formalize this idea for linear representation learning on a subspace meta-learning instance, and experimentally verify this practical benefit of splitting in simulations and on standard meta-learning benchmarks.
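The contrast between the splitting and non-splitting objectives can be sketched on a synthetic subspace instance. This is a minimal illustration, not the paper's implementation: the ridge-regression inner solver, the Gaussian data, the function names `inner_solve` and `meta_loss`, and all hyperparameters (`d`, `r`, `n`, `lam`, `noise`) are assumptions chosen for the example.

```python
import numpy as np

def inner_solve(feats, y, lam):
    """Fit a ridge-regression head on fixed features (assumed inner solver)."""
    k = feats.shape[1]
    return np.linalg.solve(feats.T @ feats + lam * np.eye(k), feats.T @ y)

def meta_loss(A, tasks, lam=1e-3, split=True):
    """Average per-task squared error for the linear representation x -> A.T @ x.

    split=True : fit the head on the first half (support), score on the second (query).
    split=False: fit and score on the same samples.
    """
    losses = []
    for X, y in tasks:
        if split:
            half = len(y) // 2
            X_fit, y_fit, X_eval, y_eval = X[:half], y[:half], X[half:], y[half:]
        else:
            X_fit, y_fit, X_eval, y_eval = X, y, X, y
        w = inner_solve(X_fit @ A, y_fit, lam)
        losses.append(np.mean((X_eval @ A @ w - y_eval) ** 2))
    return float(np.mean(losses))

# Synthetic subspace instance (all values are illustrative assumptions):
# every task's regressor lies in a shared r-dimensional subspace of R^d.
rng = np.random.default_rng(0)
d, r, n, num_tasks, noise = 30, 2, 10, 200, 0.1
B, _ = np.linalg.qr(rng.standard_normal((d, r)))  # orthonormal basis, shape (d, r)

tasks = []
for _ in range(num_tasks):
    theta = B @ rng.standard_normal(r)             # task vector inside the subspace
    X = rng.standard_normal((n, d))
    y = X @ theta + noise * rng.standard_normal(n)
    tasks.append((X, y))

full_rank = np.eye(d)  # expressive, high-rank representation
low_rank = B           # representation restricted to the true subspace

split_full = meta_loss(full_rank, tasks, split=True)
split_low = meta_loss(low_rank, tasks, split=True)
nosplit_full = meta_loss(full_rank, tasks, split=False)
nosplit_low = meta_loss(low_rank, tasks, split=False)

print("split    : full-rank", split_full, " low-rank", split_low)
print("no split : full-rank", nosplit_full, " low-rank", nosplit_low)
```

In this setup each task has fewer samples than dimensions, so the full-rank representation can nearly interpolate the data it was fitted on. The non-splitting objective therefore scores it at least as well as the low-rank one, while the splitting objective favors the low-rank representation because it is evaluated on held-out query samples.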


Related content

Representation learning


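Representation learning uses training data to learn vector representations, which overcomes the limitations of hand-crafted features. It is usually divided into two broad categories: unsupervised and supervised representation learning. Most unsupervised methods use the latent variables of an autoencoder (such as a denoising or sparse autoencoder) as the representation. The more recent variational autoencoders tolerate noise and outliers better. However, exactly inferring the latent structure of given data is almost impossible, so a number of approximate inference strategies are used instead. In addition, some unsupervised methods aim to approximate a specific similarity measure. One proposed framework for unsupervised similarity-preserving representation learning uses matrix factorization to preserve pairwise DTW similarities. Another approach learns DTW-preserving shapelets, so that Euclidean distance in the transformed space approximates the true DTW distance on the original data. Supervised representation learning methods can exploit label information to better capture the semantic structure of the data. Siamese networks and triplet networks are two popular models; their goal is to maximize the distance between classes while minimizing the distance within each class.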
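The objective of a triplet network can be illustrated with a standard triplet margin loss. This is a generic sketch rather than a method taken from this page; the function name `triplet_loss`, the squared Euclidean distance, and the default margin are assumptions.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge loss that pulls same-class pairs together and pushes other classes apart."""
    d_pos = np.sum((anchor - positive) ** 2, axis=-1)  # squared distance to same-class sample
    d_neg = np.sum((anchor - negative) ** 2, axis=-1)  # squared distance to other-class sample
    return np.maximum(0.0, d_pos - d_neg + margin)

a = np.array([0.0, 0.0])
p = np.array([0.0, 1.0])   # close to the anchor
q = np.array([3.0, 0.0])   # far from the anchor

well_separated = triplet_loss(a, p, q)  # positive is closer than negative by more than the margin
violated = triplet_loss(a, q, p)        # roles swapped, so the constraint is violated
print(well_separated, violated)
```

The loss is zero once the negative sample is farther from the anchor than the positive sample by at least the margin, and grows with the size of the violation otherwise.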
