Symmetry has been a fundamental tool in the exploration of a broad range of complex systems. In machine learning, symmetry has been explored in both models and data. In this paper we seek to connect the symmetries arising from the architecture of a family of models with the symmetries of that family's internal representation of data. We do this by calculating a set of fundamental symmetry groups, which we call the \emph{intertwiner groups} of the model. Each of these arises from a particular nonlinear layer of the model, and different nonlinearities result in different symmetry groups. These groups change the weights of a model in such a way that the underlying function the model represents remains constant, but the internal representations of data inside the model may change. We connect intertwiner groups to a model's internal representations of data through a range of experiments that probe similarities between hidden states across models with the same architecture. Our work suggests that the symmetries of a network are reflected in the symmetries of that network's representation of data, providing us with a better understanding of how architecture affects the learning and prediction process. Finally, we speculate that for ReLU networks, the intertwiner groups may provide a justification for the common practice of concentrating model interpretability exploration on the activation basis in hidden layers rather than on arbitrary linear combinations thereof.
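To make the invariance concrete, here is a minimal sketch in the ReLU case; the one-hidden-layer form and the symbols $W_1, b_1, W_2, b_2, P, D$ are illustrative notation, not drawn from the paper's formal development. For $f(x) = W_2\,\sigma(W_1 x + b_1) + b_2$ with $\sigma = \mathrm{ReLU}$ applied entrywise, any permutation matrix $P$ and any diagonal matrix $D$ with strictly positive entries satisfy
\[
\sigma(DPz) = DP\,\sigma(z) \quad \text{for all } z,
\]
since $\sigma$ commutes with coordinate permutations and is positively homogeneous. Hence the reparameterization
\[
(W_1,\; b_1,\; W_2) \;\longmapsto\; (DPW_1,\; DPb_1,\; W_2 P^{\top} D^{-1})
\]
leaves $f$ unchanged, while the hidden representation becomes $DP\,\sigma(W_1 x + b_1)$: the same features, permuted and rescaled. The function is invariant, but the internal representation is not.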