The Lipschitz constant of the map between the input and output space represented by a neural network is a natural metric for assessing the robustness of the model. We present a new method to constrain the Lipschitz constant of dense deep learning models that can also be generalized to other architectures. The method relies on a simple weight normalization scheme during training that ensures the Lipschitz constant of every layer is below an upper limit specified by the analyst. A simple monotonic residual connection can then be used to make the model monotonic in any subset of its inputs, which is useful in scenarios where domain knowledge dictates such dependence. Examples can be found in algorithmic fairness requirements or, as presented here, in the classification of the decays of subatomic particles produced at the CERN Large Hadron Collider. Our normalization is minimally constraining and allows the underlying architecture to maintain higher expressiveness compared to other techniques which aim to either control the Lipschitz constant of the model or ensure its monotonicity. We show how the algorithm was used to train a powerful, robust, and interpretable discriminator for heavy-flavor-quark decays, which has been adopted for use as the primary data-selection algorithm in the LHCb real-time data-processing system in the current LHC data-taking period known as Run 3. In addition, our algorithm has also achieved state-of-the-art performance on benchmarks in medicine, finance, and other applications.
翻译:以神经网络为代表的输入空间和输出空间之间的Lipschitz 地图的Lipschitz常数是评估模型稳健性的一种自然衡量标准。 我们提出了一个新方法来限制Lipschitz常数的密集深层学习模型的常数,这些模型也可以推广到其他结构。 这种方法在培训期间依赖于简单的重力正常化计划,确保每个层的Lipschitz常数低于分析师规定的上限。 然后, 一个简单的单调残余连接可以用来在其输入的任何子集中制造模型单调, 这在域知识决定这种依赖性的情况下是有用的。 实例可以在算法公正要求中找到,或者如这里所介绍的那样,在CERN 大哈德伦对生成的亚原子粒子腐蚀的分类中找到。 我们的正常化只是最低限度的制约,使基本架构保持更高的表达度,而其他技术的目的是控制模型的Lipschitzitz常数的常数或确保其单调性。 我们的算法是如何用来训练一个强大、稳健和可解释的医学辨别的辨别器, 重压-qual 腐烂的腐烂的理, 也用于我们当前运行- HC- hassal- sal- sal- sal- sal- salking- sligal- sal- sal- sal- laction- salvical- sal be laction- laction- salviolviolviolviolviolviolviolvicaldaldaldalvicaldaldal 数据系统, 数据系统, 和制制成为我们使用其他数据使用其他数据系统。</s>