The successes of modern deep neural networks (DNNs) are founded on their ability to transform inputs across multiple layers to build good high-level representations. It is therefore critical to understand this process of representation learning. However, we cannot use standard theoretical approaches involving infinite-width limits, as they eliminate representation learning. We therefore develop a new infinite-width limit, the representation learning limit, that exhibits representation learning mirroring that in finite-width networks yet, at the same time, remains extremely tractable. For instance, the representation learning limit gives exactly multivariate Gaussian posteriors in deep Gaussian processes with a wide range of kernels, including all isotropic (distance-dependent) kernels. We derive an elegant objective that describes how each network layer learns representations that interpolate between input and output. Finally, we use this limit and objective to develop a flexible, deep generalisation of kernel methods, which we call deep kernel machines (DKMs). We show that DKMs can be scaled to large datasets using methods inspired by inducing point methods from the Gaussian process literature, and that DKMs exhibit superior performance to other kernel-based approaches.
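To illustrate the kind of layerwise objective described above, a sketch of one possible form is given below; the notation (Gram matrices $\mathbf{G}_\ell$ as the optimisation variables, a kernel map $\mathbf{K}(\cdot)$, targets $\mathbf{Y}$, and per-layer scaling coefficients $\nu_\ell$) is assumed here for illustration rather than taken from the abstract:

\[
\mathcal{L}(\mathbf{G}_1, \ldots, \mathbf{G}_L) \;=\; \log \mathrm{P}\!\left(\mathbf{Y} \mid \mathbf{G}_L\right) \;-\; \sum_{\ell=1}^{L} \nu_\ell\, \mathrm{D}_{\mathrm{KL}}\!\left(\mathcal{N}\!\left(\mathbf{0}, \mathbf{G}_\ell\right) \,\middle\|\, \mathcal{N}\!\left(\mathbf{0}, \mathbf{K}(\mathbf{G}_{\ell-1})\right)\right).
\]

In an objective of this shape, the log-likelihood term pulls the top-layer representation $\mathbf{G}_L$ towards the outputs $\mathbf{Y}$, while each KL term pulls $\mathbf{G}_\ell$ back towards the kernel applied to the previous layer's representation, so the learned representations interpolate between input and output. Optimising directly over Gram matrices rather than weights is what makes the resulting method a deep generalisation of kernel methods.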