In this work, we analyze the role of the network architecture in shaping the inductive bias of deep classifiers. To that end, we start from a very simple problem, namely classifying a family of linearly separable distributions, and show that, depending on the direction of the distribution's discriminative feature, many state-of-the-art deep convolutional neural networks (CNNs) have a surprisingly hard time solving this simple task. We then define neural anisotropy directions (NADs) as the vectors that encapsulate the directional inductive bias of an architecture. These vectors are specific to each architecture and hence act as its signature, encoding the preference of a network to separate the input data along particular features. We provide an efficient method to identify the NADs of several CNN architectures and thus reveal their directional inductive biases. Furthermore, we show that, on the CIFAR-10 dataset, NADs characterize the features used by CNNs to discriminate between different classes.
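To make the linearly separable setup concrete, the sketch below generates a dataset whose only discriminative feature lies along a chosen input-space direction; training the same CNN on datasets built from different directions and comparing test accuracies probes the directional inductive bias described above. The helper names, the Fourier-mode choice of direction, and parameters such as `eps` and `noise_std` are illustrative assumptions, not the paper's exact protocol.

```python
import numpy as np

def fourier_direction(h, w, u, v_freq):
    """Unit-norm direction given by one 2D Fourier mode (illustrative choice;
    any unit vector in input space can serve as a direction)."""
    yy, xx = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    basis = np.cos(2 * np.pi * (u * yy / h + v_freq * xx / w))
    return (basis / np.linalg.norm(basis)).ravel()

def make_linearly_separable(n, direction, eps=1.0, noise_std=1.0, seed=0):
    """Samples x = eps * y * direction + noise, removing the noise component
    along `direction` so that `direction` carries the only discriminative
    feature (a sketch of the kind of distribution studied in the paper)."""
    rng = np.random.default_rng(seed)
    d = direction.size
    y = rng.choice([-1.0, 1.0], size=n)
    noise = rng.normal(scale=noise_std, size=(n, d))
    noise -= np.outer(noise @ direction, direction)  # project noise off the direction
    x = eps * y[:, None] * direction[None, :] + noise
    return x.astype(np.float32), y

# Example: a 32x32 single-channel task whose label depends only on one Fourier mode.
v = fourier_direction(32, 32, u=0, v_freq=3)
X, y = make_linearly_separable(n=1024, direction=v, eps=2.0)
# Reshape X to (-1, 1, 32, 32), train a CNN, and repeat for different `v`
# to compare how easily the architecture picks up each direction.
```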