Machine learning models have traditionally been developed under the assumption that the training and test distributions match exactly. However, recent successes in few-shot learning and related problems are encouraging signs that these models can be adapted to more realistic settings where the training and test distributions differ. Unfortunately, theoretical support for these algorithms is severely limited, and little is known about the difficulty of these problems. In this work, we provide novel information-theoretic lower bounds on minimax rates of convergence for algorithms that are trained on data from multiple sources and tested on novel data. Our bounds depend intuitively on the information shared between sources of data, and they characterize the difficulty of learning in this setting for arbitrary algorithms. We demonstrate these bounds on a hierarchical Bayesian model of meta-learning, computing both upper and lower bounds on parameter estimation via maximum a posteriori inference.
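To make the hierarchical Bayesian setting mentioned in the abstract concrete, here is a minimal sketch of maximum a posteriori estimation for a novel task in a two-level Gaussian model. The model itself (a flat prior on a shared mean `mu`, task means drawn as N(mu, tau2), observations drawn as N(theta_i, sigma2), with both variances known) is an assumption chosen for illustration, not necessarily the model analyzed in the paper. The function name `map_new_task_mean` and its parameters are likewise hypothetical.

```python
import numpy as np


def map_new_task_mean(train_means, n_train, new_samples, sigma2, tau2):
    """MAP estimate of a novel task's mean in a two-level Gaussian model.

    Assumed model (illustrative only, not taken from the paper):
        mu              ~ flat prior (shared across all tasks)
        theta_i | mu    ~ N(mu, tau2)       for every task, including the novel one
        x_ij | theta_i  ~ N(theta_i, sigma2)
    Because the joint posterior is Gaussian, its mode equals its mean,
    so the MAP estimate has the closed form computed below.
    """
    new_samples = np.asarray(new_samples, dtype=float)
    k = new_samples.size
    new_mean = new_samples.mean()

    # Each task's sample mean is marginally N(mu, tau2 + sigma2 / n_i),
    # so the shared mean is a precision-weighted average of the task means.
    means = np.append(np.asarray(train_means, dtype=float), new_mean)
    counts = np.append(np.asarray(n_train, dtype=float), k)
    weights = 1.0 / (tau2 + sigma2 / counts)
    mu_hat = np.sum(weights * means) / np.sum(weights)

    # Shrink the novel task's sample mean toward the shared mean.
    data_precision = k / sigma2
    prior_precision = 1.0 / tau2
    theta_hat = (data_precision * new_mean + prior_precision * mu_hat) / (
        data_precision + prior_precision
    )
    return theta_hat, mu_hat


# Usage: simulate several training tasks and one few-shot novel task.
rng = np.random.default_rng(0)
mu_true, tau2, sigma2 = 0.5, 1.0, 1.0
n_tasks, n_per_task, k_shots = 20, 50, 3

thetas = rng.normal(mu_true, np.sqrt(tau2), size=n_tasks)
train_means = [rng.normal(t, np.sqrt(sigma2), size=n_per_task).mean() for t in thetas]
theta_new = rng.normal(mu_true, np.sqrt(tau2))
new_samples = rng.normal(theta_new, np.sqrt(sigma2), size=k_shots)

theta_hat, mu_hat = map_new_task_mean(
    train_means, [n_per_task] * n_tasks, new_samples, sigma2, tau2
)
print("sample mean of novel task:", new_samples.mean())
print("MAP estimate of novel task mean:", theta_hat)
print("estimate of shared mean:", mu_hat)
```

In this sketch, the estimate for the novel task always lies between its own sample mean and the estimated shared mean. The training tasks help only through the shared mean, which gives an informal picture of why bounds in this setting would depend on the information shared between sources.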