Many datasets in scientific and engineering applications are composed of objects with specific geometric structure. A common example is data that inhabits a representation of the group SO$(3)$ of 3D rotations: scalars, vectors, tensors, \textit{etc}. One way for a neural network to exploit prior knowledge of this structure is to enforce SO$(3)$-equivariance throughout its layers, and several such architectures have been proposed. While general methods for handling arbitrary SO$(3)$ representations exist, they are computationally intensive and complicated to implement. We show that by judicious symmetry breaking, we can efficiently increase the expressiveness of a network operating only on vector and order-2 tensor representations of SO$(2)$. We demonstrate the method on an important problem from High Energy Physics known as \textit{b-tagging}, where particle jets originating from b-meson decays must be discriminated from an overwhelming QCD background. On this task, we find that augmenting a standard architecture with our method yields a \ensuremath{2.3\times} improvement in rejection score.
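As an aside for readers unfamiliar with the central concept: a layer $f$ acting on a vector representation is equivariant under a rotation group when $f(R x) = R\, f(x)$ for every rotation $R$. The sketch below is a hypothetical toy illustration (not the paper's architecture): a map on 2D vector features that scales the input by a function of its rotation-invariant norm, which therefore commutes with any SO$(2)$ rotation.

```python
import numpy as np

def rot(theta):
    # 2x2 rotation matrix: the defining (vector) representation of SO(2).
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def equivariant_layer(x, w=1.7):
    # Toy SO(2)-equivariant map on a 2D vector feature: a gate that
    # depends only on the invariant norm |x|, so it commutes with rotations.
    # (Illustrative only; 'w' is an arbitrary learnable-weight stand-in.)
    return np.tanh(np.linalg.norm(x)) * w * x

x = np.array([0.3, -1.2])
theta = 0.8
lhs = equivariant_layer(rot(theta) @ x)   # rotate, then apply layer
rhs = rot(theta) @ equivariant_layer(x)   # apply layer, then rotate
assert np.allclose(lhs, rhs)              # equivariance holds numerically
```

By contrast, any operation that mixes vector components through fixed, non-rotation-covariant weights (e.g., an unconstrained dense layer) would break this identity, which is what equivariant architectures are designed to avoid.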