Machine learning offers an exciting opportunity to improve the calibration of nearly all reconstructed objects in high-energy physics detectors. However, machine learning approaches often depend on the spectra of examples used during training, an issue known as prior dependence. This is an undesirable property of a calibration, which needs to be applicable in a variety of environments. The purpose of this paper is to explicitly highlight the prior dependence of some machine learning-based calibration strategies. We demonstrate how some recent proposals for both simulation-based and data-based calibrations inherit properties of the sample used for training, which can result in biases for downstream analyses. In the case of simulation-based calibration, we argue that our recently proposed Gaussian Ansatz approach can avoid some of the pitfalls of prior dependence, whereas prior-independent data-based calibration remains an open problem.
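As a minimal illustration of prior dependence (a toy sketch, not the setup or code used in the paper), consider a one-dimensional calibration where the true value z is drawn from a Gaussian prior and the measured value is x = z + Gaussian noise. A regressor trained by minimizing the mean squared error converges to the posterior mean E[z | x], which in this Gaussian case is exactly linear in x and explicitly contains the prior mean. The function names, the sample sizes, and all numerical values below are hypothetical choices made for illustration; a least-squares line stands in for a trained neural network.

```python
import random


def fit_mse_calibration(prior_mean, prior_std, noise_std, n=100_000, seed=0):
    """Fit z ~ slope * x + intercept by least squares on a simulated sample.

    Assumed toy model: z ~ N(prior_mean, prior_std^2), x = z + N(0, noise_std^2).
    For this model the MSE-optimal calibration E[z | x] is linear in x, so a
    straight-line fit plays the role of a fully trained MSE regressor.
    """
    rng = random.Random(seed)
    zs = [rng.gauss(prior_mean, prior_std) for _ in range(n)]  # true values
    xs = [z + rng.gauss(0.0, noise_std) for z in zs]           # measured values

    mean_x = sum(xs) / n
    mean_z = sum(zs) / n
    cov_xz = sum((x - mean_x) * (z - mean_z) for x, z in zip(xs, zs)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n

    slope = cov_xz / var_x
    intercept = mean_z - slope * mean_x
    return slope, intercept


def calibrate(x, slope, intercept):
    """Apply the learned calibration to a measured value."""
    return slope * x + intercept


# Two training samples with the same detector response (noise_std)
# but different true spectra (prior_mean).
slope_a, intercept_a = fit_mse_calibration(50.0, 10.0, 10.0, seed=1)
slope_b, intercept_b = fit_mse_calibration(100.0, 10.0, 10.0, seed=2)

# The same measured value receives two different calibrated values.
x_measured = 80.0
pred_a = calibrate(x_measured, slope_a, intercept_a)
pred_b = calibrate(x_measured, slope_b, intercept_b)
print(f"trained on prior A: {pred_a:.2f}")
print(f"trained on prior B: {pred_b:.2f}")
```

In this toy model the closed-form posterior mean is E[z | x] = mu + sigma_z^2 / (sigma_z^2 + sigma_n^2) * (x - mu), so the two fits should land near 65 and 90 for the same measurement x = 80, even though the detector response is identical in both samples. The difference comes entirely from the training spectrum, which is the sense in which such a calibration is prior dependent.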