What is the best way to exploit extra data -- be it unlabeled data from the same task, or labeled data from a related task -- to learn a given task? This paper formalizes the question using the theory of reference priors. Reference priors are objective, uninformative Bayesian priors that maximize the mutual information between the task and the weights of the model. Such priors enable the task to maximally affect the Bayesian posterior; for example, a reference prior depends upon the number of samples available for learning the task, and for very small sample sizes it puts more probability mass on low-complexity models in the hypothesis space. This paper presents the first demonstration of reference priors for medium-scale deep networks and image-based data. We develop generalizations of reference priors and demonstrate applications to two problems. First, by using unlabeled data to compute the reference prior, we develop new Bayesian semi-supervised learning methods that remain effective even with very few samples per class. Second, by using labeled data from the source task to compute the reference prior, we develop a new pretraining method for transfer learning that allows data from the target task to maximally affect the Bayesian posterior. We validate both methods empirically on image classification datasets. Code is available at https://github.com/grasp-lyrl/deep_reference_priors.
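To make the objective concrete, the classical reference-prior criterion (Bernardo, 1979) can be written as below; here w denotes the model weights and X^n a dataset of n samples from the task -- notation we introduce for illustration, not necessarily the paper's own:

\[
\pi^{*} \;=\; \arg\max_{\pi}\, I(w;\, X^{n})
\;=\; \arg\max_{\pi}\, \mathbb{E}_{X^{n}}\!\left[\, \mathrm{KL}\!\left(p(w \mid X^{n}) \,\big\|\, \pi(w)\right)\right].
\]

The second equality is the standard identity writing mutual information as the expected KL divergence from the prior to the posterior: the reference prior is the prior that the data, on average, moves the farthest. This is why such priors let the task "maximally affect" the posterior, and why they depend explicitly on the sample size n.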
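Maximizing I(w; X^n) over the prior with the likelihood held fixed is formally a channel-capacity problem, so on a finite hypothesis space it admits Blahut-Arimoto-style fixed-point updates. The sketch below is a minimal illustration under that assumption: the likelihood table is a hypothetical toy channel, and this is not the paper's particle-based construction for deep networks.

```python
import numpy as np

def reference_prior(likelihood, n_iters=500, tol=1e-10):
    """Maximize I(theta; X) over the prior pi via Blahut-Arimoto updates.

    likelihood: (K, M) array with likelihood[k, x] = p(x | theta_k) for K
    candidate models and M possible observations (a hypothetical toy setup).
    """
    K, _ = likelihood.shape
    pi = np.full(K, 1.0 / K)                  # start from the uniform prior
    for _ in range(n_iters):
        marginal = pi @ likelihood            # p(x) under the current prior
        # D_KL(p(. | theta_k) || p(.)) for each hypothesis k
        kl = np.sum(likelihood * np.log(likelihood / marginal), axis=1)
        new_pi = pi * np.exp(kl)              # Blahut-Arimoto fixed point
        new_pi /= new_pi.sum()
        if np.max(np.abs(new_pi - pi)) < tol:
            break
        pi = new_pi
    return pi

# Toy example: three hypotheses over four outcomes. The third model predicts
# uniformly, so its outputs carry no information about which data were seen,
# and the reference prior gives it vanishing mass.
likelihood = np.array([
    [0.70, 0.10, 0.10, 0.10],
    [0.10, 0.70, 0.10, 0.10],
    [0.25, 0.25, 0.25, 0.25],
])
print(reference_prior(likelihood))
```

The sketch only shows what the mutual-information objective computes; for deep networks the hypothesis and data spaces are far too large to enumerate in a table like this, which is what makes the paper's demonstration at that scale nontrivial.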