Linear neural network layers that are either equivariant or invariant to permutations of their inputs form core building blocks of modern deep learning architectures. Examples include the layers of DeepSets, as well as linear layers occurring in attention blocks of transformers and some graph neural networks. The space of permutation equivariant linear layers can be identified as the invariant subspace of a certain symmetric group representation, and recent work parameterized this space by exhibiting a basis whose vectors are sums over orbits of standard basis elements with respect to the symmetric group action. A parameterization opens up the possibility of learning the weights of permutation equivariant linear layers via gradient descent. The space of permutation equivariant linear layers is a generalization of the partition algebra, an object first discovered in statistical physics with deep connections to the representation theory of the symmetric group, and the basis described above generalizes the so-called orbit basis of the partition algebra. We exhibit an alternative basis, generalizing the diagram basis of the partition algebra, with computational benefits stemming from the fact that the tensors making up the basis are low rank in the sense that they naturally factorize into Kronecker products. Just as multiplication by a rank one matrix is far less expensive than multiplication by an arbitrary matrix, multiplication with these low rank tensors is far less expensive than multiplication with elements of the orbit basis. Finally, we describe an algorithm implementing multiplication with these basis elements.
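The rank-one comparison above can be made concrete in the simplest case, a DeepSets-style layer acting on a set of n elements. The following is a minimal sketch under stated assumptions, not the algorithm described in this work: it uses NumPy, and the function names (`orbit_basis`, `diagram_basis`, `apply_diagram_layer`, `kron_apply`) are hypothetical and chosen only for illustration. In this case the orbit basis consists of the identity I and the off-diagonal matrix J - I, while the diagram-style basis consists of I and the all-ones matrix J, which is rank one (J is the outer product of the all-ones vector with itself), so multiplication by J reduces to a single sum.

```python
import numpy as np


def orbit_basis(n):
    """Orbit-style basis for n x n permutation equivariant maps: I and J - I."""
    identity = np.eye(n)
    off_diagonal = np.ones((n, n)) - np.eye(n)
    return identity, off_diagonal


def diagram_basis(n):
    """Diagram-style basis for the same space: I and the rank-one matrix J."""
    return np.eye(n), np.ones((n, n))


def apply_diagram_layer(x, a, b):
    """Compute (a*I + b*J) @ x without materializing J.

    x has shape (n,) or (n, d). Since J @ x repeats the column sums of x
    in every row, the cost is O(n) per feature column instead of O(n^2).
    """
    return a * x + b * np.sum(x, axis=0, keepdims=True)


def kron_apply(A, B, X):
    """Compute np.kron(A, B) @ X.reshape(-1) without forming the Kronecker product.

    Assumes A has shape (m, p), B has shape (n, q), X has shape (p, q),
    and NumPy's default row-major flattening.
    """
    return (A @ X @ B.T).reshape(-1)
```

Both bases span the same two-dimensional space, since J = I + (J - I); only the cost of applying the individual basis elements differs. The `kron_apply` helper illustrates the same idea for a tensor that factorizes as a Kronecker product: the factors are applied one at a time rather than as one large dense matrix.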