In collider-based particle and nuclear physics experiments, data are produced at such extreme rates that only a subset can be recorded for later analysis. Typically, algorithms select individual collision events for preservation and store the complete experimental response for each selected event. A relatively new alternative strategy is to additionally save a partial record for a larger subset of events, which allows targeted later analysis of a larger fraction of events. We propose a strategy that bridges these paradigms by compressing entire events for generic offline analysis, but at a lower fidelity. An optimal-transport-based $\beta$ Variational Autoencoder (VAE) is used to automate the compression, and the hyperparameter $\beta$ controls the compression fidelity. We introduce a new approach for multi-objective learning functions by simultaneously learning a VAE appropriate for all values of $\beta$ through parameterization. We present an example use case, a di-muon resonance search at the Large Hadron Collider (LHC), where we show that simulated data compressed by our $\beta$-VAE have enough fidelity to distinguish distinct signal morphologies.
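To make the role of $\beta$ concrete, the following is a minimal sketch of a $\beta$-weighted VAE objective together with one possible way to train for many values of $\beta$ at once. It is not taken from the paper: it assumes the common convention in which the loss is a reconstruction error plus $\beta$ times the Kullback-Leibler term (so a larger $\beta$ means stronger compression and lower fidelity), it treats the optimal-transport reconstruction error as an already computed number, and it assumes $\beta$ is drawn log-uniformly at each training step. The function names and the range of $\beta$ are hypothetical.

```python
import math
import random


def gaussian_kl(mu, log_var):
    """KL divergence from N(mu, diag(exp(log_var))) to N(0, I), in nats."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))


def beta_vae_loss(recon_error, mu, log_var, beta):
    """Reconstruction error plus beta-weighted KL term (assumed convention).

    recon_error stands in for an optimal-transport distance between the
    input event and its reconstruction; it is computed elsewhere.
    """
    return recon_error + beta * gaussian_kl(mu, log_var)


def sample_beta(rng, beta_min=1e-3, beta_max=1e1):
    """Draw beta log-uniformly from [beta_min, beta_max] (hypothetical range)."""
    return math.exp(rng.uniform(math.log(beta_min), math.log(beta_max)))


# One illustrative training step: draw beta, then use it in the loss.
# In a beta-parameterized model the same value would also be passed to the
# encoder and decoder as an extra input, so one set of weights covers all beta.
rng = random.Random(0)
beta = sample_beta(rng)
loss = beta_vae_loss(recon_error=0.5, mu=[0.0, 0.0], log_var=[0.0, 0.0], beta=beta)
```

In this toy call the latent posterior equals the prior, so the KL term vanishes and the loss reduces to the reconstruction error regardless of the sampled $\beta$.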