Deep kernel learning and related techniques promise to combine the representational power of neural networks with the reliable uncertainty estimates of Gaussian processes. One crucial aspect of these models is an expectation that, because they are treated as Gaussian process models optimized using the marginal likelihood, they are protected from overfitting. However, we identify pathological behavior, including overfitting, on a simple toy example. We explore this pathology, explaining its origins and considering how it applies to real datasets. Through careful experimentation on UCI datasets, CIFAR-10, and the UTKFace dataset, we find that the overfitting from overparameterized deep kernel learning, in which the model is "somewhat Bayesian", can in certain scenarios be worse than that from not being Bayesian at all. However, we find that a fully Bayesian treatment of deep kernel learning can rectify this overfitting and obtain the desired performance improvements over standard neural networks and Gaussian processes.
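For concreteness, below is a minimal sketch of the kind of model the abstract refers to, written with GPyTorch: a small neural network warps the inputs before a standard GP kernel, and the network weights are optimized jointly with the kernel hyperparameters by maximizing the exact marginal likelihood. The architecture, synthetic data, and optimizer settings are illustrative assumptions only, not the experimental setup used in the paper.

```python
import torch
import gpytorch


class FeatureExtractor(torch.nn.Sequential):
    """Small neural network that maps inputs to a low-dimensional feature space."""

    def __init__(self, in_dim, out_dim=2):
        super().__init__(
            torch.nn.Linear(in_dim, 64),
            torch.nn.ReLU(),
            torch.nn.Linear(64, out_dim),
        )


class DKLModel(gpytorch.models.ExactGP):
    """Exact GP whose kernel is applied to neural-network features (deep kernel learning)."""

    def __init__(self, train_x, train_y, likelihood, feature_extractor):
        super().__init__(train_x, train_y, likelihood)
        self.feature_extractor = feature_extractor
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())

    def forward(self, x):
        z = self.feature_extractor(x)  # warp inputs through the network
        return gpytorch.distributions.MultivariateNormal(
            self.mean_module(z), self.covar_module(z)
        )


# Toy data (illustrative assumption, not from the paper).
train_x = torch.randn(50, 3)
train_y = torch.sin(train_x.sum(-1)) + 0.1 * torch.randn(50)

likelihood = gpytorch.likelihoods.GaussianLikelihood()
model = DKLModel(train_x, train_y, likelihood, FeatureExtractor(in_dim=3))

# Joint optimization of network weights and GP hyperparameters by
# maximizing the exact GP marginal likelihood.
model.train()
likelihood.train()
mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

for _ in range(100):
    optimizer.zero_grad()
    loss = -mll(model(train_x), train_y)  # negative marginal log likelihood
    loss.backward()
    optimizer.step()
```

Because the many network weights are treated as kernel hyperparameters and fitted by (type-II) maximum likelihood rather than marginalized, this "somewhat Bayesian" treatment is the setting in which the abstract's overfitting concern arises.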