A recent numerical study observed that neural network classifiers enjoy a large degree of symmetry in the penultimate layer. Namely, if $h(x) = Af(x) + b$, where $A$ is a linear map and $f(x)$ is the output of the penultimate layer of the network (after the activation), then all data points $x_{i, 1}, \dots, x_{i, N_i}$ in a class $C_i$ are mapped to a single point $y_i$ by $f$, and the points $y_i$ are located at the vertices of a regular $(k-1)$-dimensional tetrahedron (simplex) in a high-dimensional Euclidean space, where $k$ is the number of classes. We explain this observation analytically in toy models for highly expressive deep neural networks. In complementary examples, we demonstrate rigorously that even the final output of the classifier $h$ is not uniform over data samples from a class $C_i$ if $h$ is a shallow network (or if the deeper layers do not bring the data samples into a convenient geometric configuration).
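For concreteness, one explicit realization of this geometry (an illustrative configuration, not taken from the cited study) places the collapsed class representatives in $\mathbb{R}^k$ at
\[
y_i \;=\; e_i - \frac{1}{k}\sum_{j=1}^{k} e_j, \qquad i = 1, \dots, k,
\]
where $e_1, \dots, e_k$ denotes the standard basis. Then $\|y_i - y_j\| = \sqrt{2}$ and $\langle y_i, y_j\rangle = -\tfrac{1}{k}$ for all $i \neq j$, so the $k$ points are equidistant and pairwise equiangular, i.e. they span a regular $(k-1)$-dimensional simplex centered at the origin.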