Deep neural networks (DNNs) with the flexibility to learn good top-layer representations have eclipsed shallow kernel methods, which lack that flexibility. Here, we take inspiration from DNNs to develop the deep kernel machine. Optimizing the deep kernel machine objective is equivalent to exact Bayesian inference (or noisy gradient descent) in an infinitely wide Bayesian neural network or deep Gaussian process that has been scaled carefully to retain representation learning. Our work thus has important implications for the theoretical understanding of neural networks. In addition, we show that the deep kernel machine objective has more desirable properties and better performance than other choices of objective. Finally, we conjecture that the deep kernel machine objective is unimodal. We prove unimodality for linear kernels, and present a number of experiments in the nonlinear case in which all deep kernel machine initializations we tried converged to the same solution.