SPANet: Generalized Permutationless Set Assignment for Particle Physics using Symmetry Preserving Attention

The creation of unstable heavy particles at the Large Hadron Collider is the most direct way to address some of the deepest open questions in physics. Collisions typically produce variable-size sets of observed particles, and inherent ambiguities complicate the assignment of these observed particles to the decay products of the heavy particles. Current strategies for tackling these challenges in the physics community ignore the physical symmetries of the decay products, consider all possible assignment permutations, and do not scale to complex configurations. Attention-based deep learning methods for sequence modelling have achieved state-of-the-art performance in natural language processing, but they lack built-in mechanisms to deal with the unique symmetries found in physical set-assignment problems. We introduce a novel method for constructing symmetry-preserving attention networks which reflect the problem's natural invariances to efficiently find assignments without evaluating all permutations. This general approach is applicable to arbitrarily complex configurations and significantly outperforms current methods, improving reconstruction efficiency by 19% to 35% on typical benchmark problems while decreasing inference time by two to five orders of magnitude on the most complex events, making many important and previously intractable cases tractable. A full code repository containing a general library, the specific configuration used, and a complete dataset release is available at https://github.com/Alexanders101/SPANet
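To give a sense of why evaluating every assignment permutation does not scale, the following minimal sketch counts the candidate assignments an exhaustive method would have to score. It is an illustration under stated assumptions, not the authors' code: the function name `count_assignments`, the choice of six target decay products, and the symmetry factor of 8 (typical of an all-hadronic top-quark pair decay, where the two daughters of each W boson and the two top candidates can be swapped) are assumptions that do not come from the abstract.

```python
from math import perm  # available in Python 3.8+


def count_assignments(n_jets, n_targets=6, symmetry_factor=8):
    """Count distinct assignments of observed jets to target decay products.

    n_targets and symmetry_factor are illustrative assumptions: six decay
    products, with 2 * 2 * 2 = 8 orderings that are physically equivalent.
    """
    if n_jets < n_targets:
        return 0  # not enough observed jets to fill every target slot
    ordered = perm(n_jets, n_targets)  # ordered selections of jets
    return ordered // symmetry_factor  # collapse equivalent orderings


if __name__ == "__main__":
    # The count grows rapidly with the number of observed jets.
    for n_jets in range(6, 13):
        print(n_jets, count_assignments(n_jets))
```

Under these assumptions the count already exceeds eighty thousand candidates at twelve jets, which is the kind of growth a method that avoids enumerating permutations sidesteps.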


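Related content

Attention mechanism

The attention mechanism was first proposed in the field of computer vision, but it arguably took off with the Google DeepMind paper "Recurrent Models of Visual Attention" [14], which applied attention to an RNN model for image classification. Bahdanau et al. then used an attention-like mechanism in "Neural Machine Translation by Jointly Learning to Align and Translate" [1] to perform translation and alignment jointly on machine translation tasks; their work is regarded as the first to apply attention to NLP. Attention-based extensions of RNN models were subsequently applied to a wide range of NLP tasks. More recently, how to use attention in CNNs has also become an active research topic. The figure below shows the general trend of attention research.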
