Encoding symmetries is a powerful inductive bias for improving the generalization of deep neural networks. However, most existing equivariant models are limited to simple symmetries, such as rotations, and fail to address the broader class of general linear transformations, GL(n), that arise in many scientific domains. We introduce Reductive Lie Neurons (ReLNs), a novel neural network architecture that is exactly equivariant to these general linear symmetries. ReLNs operate directly on a wide range of structured inputs, including general n-by-n matrices. They incorporate an adjoint-invariant bilinear layer that achieves stable equivariance for both Lie-algebraic features and matrix-valued inputs, without requiring a redesign for each subgroup. This architecture overcomes the limitations of prior equivariant networks, which apply only to compact groups or simple vector data. We validate the versatility of ReLNs across a spectrum of tasks: they outperform existing methods on algebraic benchmarks with sl(3) and sp(4) symmetries and achieve competitive results on a Lorentz-equivariant particle-physics task. In 3D drone state estimation with geometric uncertainty, ReLNs jointly process velocities and covariances, yielding significant improvements in trajectory accuracy. ReLNs thus provide a practical and general framework for learning with broad linear-group symmetries on Lie algebras and matrix-valued data. Project page: https://reductive-lie-neuron.github.io/
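The key invariance the abstract appeals to can be illustrated concretely. The following is a minimal sketch (not the paper's actual layer) showing why a trace-based bilinear form B(X, Y) = tr(XY) on n-by-n matrix features is invariant under the adjoint action Ad_g(X) = g X g^{-1} of GL(n); the function names `adjoint` and `bilinear` are illustrative, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3

# Lie-algebra-valued features: general n-by-n matrices.
X = rng.standard_normal((n, n))
Y = rng.standard_normal((n, n))

# A group element g in GL(n), shifted by n*I so it is safely invertible.
g = rng.standard_normal((n, n)) + n * np.eye(n)
g_inv = np.linalg.inv(g)

def adjoint(g, g_inv, X):
    """Adjoint action of GL(n): Ad_g(X) = g X g^{-1}."""
    return g @ X @ g_inv

def bilinear(X, Y):
    """Trace form B(X, Y) = tr(XY). Adjoint-invariant by cyclicity of
    the trace: tr(g X g^{-1} g Y g^{-1}) = tr(g X Y g^{-1}) = tr(XY)."""
    return np.trace(X @ Y)

b_before = bilinear(X, Y)
b_after = bilinear(adjoint(g, g_inv, X), adjoint(g, g_inv, Y))
assert np.allclose(b_before, b_after)  # B is unchanged by the adjoint action
```

Because the form's value is unchanged when both inputs are transformed, layers built from such forms can produce equivariant features without per-subgroup redesign, which is the property the abstract claims for ReLNs' bilinear layer.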