In this work, we propose a novel generative model for mapping inputs to structured, high-dimensional outputs using structured conditional normalizing flows and Gaussian process regression. The model is motivated by the need to characterize uncertainty in the input/output relationship when making inferences on new data. In particular, in the physical sciences, limited training data may not adequately characterize future observed data; it is critical that models adequately indicate uncertainty, particularly when they may be asked to extrapolate. In our proposed model, structured conditional normalizing flows provide parsimonious latent representations that relate to the inputs through a Gaussian process, providing exact likelihood calculations and uncertainty that naturally increases away from the training data inputs. We demonstrate the methodology on laser-induced breakdown spectroscopy data from the ChemCam instrument onboard the Mars rover Curiosity. ChemCam was designed to recover the chemical composition of rock and soil samples by measuring the spectral properties of plasma atomic emissions induced by a laser pulse. We show that our model can generate realistic spectra conditional on a given chemical composition and that we can use the model to perform uncertainty quantification of chemical compositions for new observed spectra. Based on our results, we anticipate that our proposed modeling approach may be useful in other scientific domains with high-dimensional, complex structure where it is important to quantify predictive uncertainty.
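To illustrate the claim that Gaussian process uncertainty grows away from the training inputs, the following is a minimal sketch of ordinary GP regression from a one-dimensional input to a single scalar latent value. It is not the proposed model: the squared-exponential kernel, the noise level, the toy data, and the names `rbf_kernel` and `gp_posterior` are all assumptions made for illustration, and the normalizing-flow component is omitted entirely.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b (assumed kernel)."""
    diff = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_posterior(x_train, z_train, x_test, noise=1e-2):
    """Posterior mean and variance of a zero-mean GP evaluated at x_test."""
    # Training covariance with observation noise on the diagonal.
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test)            # train/test cross-covariance
    k_ss = np.diag(rbf_kernel(x_test, x_test))   # prior variance at test points
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, z_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = k_ss - np.sum(v ** 2, axis=0)
    return mean, var

# Toy data: inputs in [-1, 1]; z stands in for one latent coordinate.
x_train = np.array([-1.0, 0.0, 1.0])
z_train = np.sin(x_train)

# Test points at increasing distance from the training inputs.
x_test = np.array([0.5, 2.0, 5.0])
mean, var = gp_posterior(x_train, z_train, x_test)
```

Under these assumptions, the posterior variance is smallest at the interpolation point `0.5`, larger at `2.0`, and close to the prior variance of 1 at `5.0`, which is the qualitative behavior the abstract relies on when the model is asked to extrapolate.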