Machine learning problems have an intrinsic geometric structure: central objects, including a neural network's weight space and the loss function associated with a particular task, can be viewed as encoding the geometry of a given problem. Geometric concepts can therefore be applied to analyze and understand the theoretical properties of machine learning strategies, as well as to develop new algorithms. In this paper, we address three seemingly unrelated open questions in machine learning by viewing them through a unified framework grounded in differential geometry. Specifically, we view the weight space of a neural network as a manifold endowed with a Riemannian metric that encodes performance on specific tasks. By defining a metric, we can construct geodesic (minimum-length) paths in weight space that represent sets of networks with equivalent or near-equivalent functional performance on a specific task. We then traverse these geodesic paths while identifying networks that satisfy a second objective. Guided by this geometric insight, we apply our geodesic framework to three major applications: (i) network sparsification; (ii) mitigating catastrophic forgetting by constructing networks with high performance on a series of objectives; and (iii) finding high-accuracy paths connecting distinct local optima of deep networks in the non-convex loss landscape. Our results are obtained on a wide range of network architectures (MLP, VGG11/16) trained on MNIST and CIFAR-10/100. Broadly, we introduce a geometric framework that unifies a range of machine learning objectives and can be applied to multiple classes of neural network architectures.
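The idea of moving through weight space along directions that are "short" under a task-dependent metric, while making progress on a second objective, can be illustrated with a toy sketch. This is not the paper's algorithm. It assumes a linear model whose outputs are `X @ w`, an output-based metric `g = J^T J` (with `J = X` the Jacobian of the outputs with respect to the weights), a small damping term `eps`, and a hypothetical sparse target as the second objective. All function and variable names are illustrative.

```python
import numpy as np

def metric_tensor(jacobian):
    # Assumed metric: pullback of the Euclidean metric on model outputs.
    # A weight perturbation d has squared length d^T g d = ||J d||^2,
    # so directions that barely change the outputs are "short".
    return jacobian.T @ jacobian

def metric_step(w, w_target, jacobian, lr=0.1, eps=1e-6):
    # Move toward w_target, preferring directions that are short under g.
    # Solving (g + eps*I) d = v damps components of v that change the outputs;
    # rescaling by eps makes output-preserving components move by lr * v.
    g = metric_tensor(jacobian)
    v = w_target - w
    d = np.linalg.solve(g + eps * np.eye(w.size), v)
    return w + lr * eps * d

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 20))      # 5 data points, 20 weights (toy sizes)
w0 = rng.normal(size=20)          # stands in for a trained network
w_target = w0.copy()
w_target[10:] = 0.0               # hypothetical second objective: zero half the weights

n_steps, lr = 50, 0.1
w = w0.copy()
for _ in range(n_steps):
    w = metric_step(w, w_target, X, lr=lr)

# Output drift along the metric-aware path.
drift_path = np.linalg.norm(X @ w - X @ w0)

# Baseline: straight-line interpolation covering the same fraction of the way.
t = 1.0 - (1.0 - lr) ** n_steps
w_line = w0 + t * (w_target - w0)
drift_line = np.linalg.norm(X @ w_line - X @ w0)

print("output drift, metric-aware path:", drift_path)
print("output drift, straight line:    ", drift_line)
print("distance to target before/after:",
      np.linalg.norm(w0 - w_target), np.linalg.norm(w - w_target))
```

In this toy setting the metric is constant, so the sketch only shows how a metric can separate function-preserving directions from function-changing ones. For a nonlinear network the Jacobian, and therefore the metric, would need to be recomputed at each point along the path.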