Data augmentation is often used to incorporate inductive biases into models. Traditionally, these are hand-crafted and tuned with cross-validation. The Bayesian paradigm for model selection provides a path towards end-to-end learning of invariances using only the training data, by optimising the marginal likelihood. We work towards bringing this approach to neural networks by using an architecture with a Gaussian process in the last layer, a model for which the marginal likelihood can be computed. Experimentally, we improve performance by learning appropriate invariances on standard benchmarks, in the low-data regime, and on a medical imaging task. Optimisation challenges for invariant deep kernel Gaussian processes are identified, and a systematic analysis is presented to arrive at a robust training scheme. We introduce a new lower bound to the marginal likelihood, which allows us to perform inference for a larger class of likelihood functions than was previously possible, thereby overcoming some of the training challenges that existed with previous approaches.
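To make the core idea concrete, below is a minimal sketch of marginal-likelihood-based invariance learning with a Gaussian process last layer, on toy 2D inputs. All names here (feature_net, inv_width, etc.) are illustrative, and several simplifying assumptions are made that go beyond the abstract: an RBF kernel, planar rotations as the invariance family, Monte Carlo averaging over sampled transformations, and exact gradient ascent on the Gaussian-likelihood marginal likelihood rather than the paper's new lower bound (which targets a larger class of likelihoods).

```
# Sketch: jointly learn network weights, kernel hyperparameters, and the
# extent of a rotational invariance by maximising the GP marginal likelihood.
import torch

torch.manual_seed(0)

# Toy data with a rotationally symmetric target, so invariance should help.
X = torch.randn(100, 2)
y = torch.sin(X.norm(dim=1)) + 0.05 * torch.randn(100)

feature_net = torch.nn.Sequential(       # deep kernel: features feed the GP
    torch.nn.Linear(2, 32), torch.nn.Tanh(), torch.nn.Linear(32, 2)
)
log_lengthscale = torch.zeros((), requires_grad=True)
log_noise = torch.tensor(-2.0, requires_grad=True)
inv_width = torch.tensor(0.1, requires_grad=True)  # learned rotation range

def rotate(x, angles):
    # Rotate all points by each sampled angle; differentiable in the angles.
    c, s = torch.cos(angles), torch.sin(angles)
    R = torch.stack([torch.stack([c, -s], -1), torch.stack([s, c], -1)], -2)
    return torch.einsum('sij,nj->sni', R, x)

def invariant_gram(x, n_samples=8):
    # k_inv(x, x') ~= mean over sampled rotations g, g' of k(f(g x), f(g' x')):
    # a Monte Carlo estimate of the doubly-averaged invariant kernel. The
    # reparameterised angles keep gradients flowing to inv_width.
    angles = inv_width * (2 * torch.rand(n_samples) - 1)
    feats = feature_net(rotate(x, angles).reshape(-1, 2))
    feats = feats.reshape(n_samples, x.shape[0], -1)
    d2 = (feats[:, None, :, None, :] - feats[None, :, None, :, :]).pow(2).sum(-1)
    return torch.exp(-0.5 * d2 / log_lengthscale.exp() ** 2).mean(dim=(0, 1))

params = list(feature_net.parameters()) + [log_lengthscale, log_noise, inv_width]
opt = torch.optim.Adam(params, lr=1e-2)
for step in range(500):
    K = invariant_gram(X) + log_noise.exp() * torch.eye(len(X))
    dist = torch.distributions.MultivariateNormal(torch.zeros(len(X)), K)
    loss = -dist.log_prob(y)  # negative log marginal likelihood
    opt.zero_grad(); loss.backward(); opt.step()
print(f'learned rotation range: {inv_width.item():.3f} rad')
```

Because the averaged kernel is the inner product of averaged feature-space embeddings, the estimated Gram matrix stays positive semi-definite, and the single marginal-likelihood objective trades data fit against complexity, which is what lets the invariance extent be selected from training data alone.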