Deep neural networks implement a sequence of layer-by-layer operations that are each relatively easy to understand, but the resulting overall computation is generally difficult to understand. We develop a simple idea for interpreting the layer-by-layer construction of useful representations: the role of each layer is to reformat information to reduce the "distance" to the target outputs. We formalize this intuitive notion of "distance" by leveraging recent work on metric representational similarity, and show how it leads to a rich space of geometric concepts. Within this framework, the layer-wise computation implemented by a deep neural network can be viewed as a path in a high-dimensional representation space. We develop tools to characterize the geometry of these paths in terms of distances, angles, and geodesics. We then ask three sets of questions of residual networks trained on CIFAR-10: (1) how straight are the paths, and how much does each layer contribute towards the target? (2) how do these properties emerge over training? and (3) how similar are the paths taken by wider versus deeper networks? We conclude by sketching additional ways that this kind of representational geometry can be used to understand and interpret network training, or to prescriptively improve network architectures to suit a task.
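To make the "path" picture concrete, the sketch below illustrates one way such layer-wise geometry could be computed. It uses an angular distance derived from linear CKA as a stand-in representational metric (the paper's framework admits several metrics; this particular choice and all activations here are illustrative assumptions, not the authors' exact method), measuring each layer's distance to a one-hot target representation and the straightness of the layer-to-layer path.

```python
import numpy as np

def centered_gram(X):
    """Centered linear-kernel Gram matrix of an (n_samples, n_features) array."""
    X = X - X.mean(axis=0, keepdims=True)
    return X @ X.T

def angular_cka_distance(X, Y):
    """Angular distance derived from linear CKA: arccos of the normalized
    Frobenius inner product between centered Gram matrices."""
    Kx, Ky = centered_gram(X), centered_gram(Y)
    hsic = np.sum(Kx * Ky)
    norm = np.linalg.norm(Kx) * np.linalg.norm(Ky)
    # Clip to guard against floating-point overshoot before arccos.
    return np.arccos(np.clip(hsic / norm, -1.0, 1.0))

# Toy "layers": random placeholder activations for n samples, plus a target
# representation built from one-hot class labels (both purely illustrative).
rng = np.random.default_rng(0)
n, n_classes = 200, 10
labels = rng.integers(0, n_classes, size=n)
target = np.eye(n_classes)[labels]                      # one-hot target representation
layers = [rng.normal(size=(n, 64)) for _ in range(6)]   # stand-ins for layer activations

# (1) Distance from each layer to the target: does it shrink layer by layer?
to_target = [angular_cka_distance(Z, target) for Z in layers]

# (2) Path "straightness": end-to-end distance divided by the summed step
# lengths; equals 1.0 only if the layers trace a geodesic (triangle inequality).
steps = [angular_cka_distance(layers[i], layers[i + 1]) for i in range(len(layers) - 1)]
straightness = angular_cka_distance(layers[0], layers[-1]) / sum(steps)

print("distance to target per layer:", np.round(to_target, 3))
print("straightness (1.0 = perfectly straight path):", round(straightness, 3))
```

For trained networks, the random arrays would be replaced by each layer's activations on a held-out batch; the same two quantities then summarize how directly the network "walks" from inputs toward the target representation.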