In complex large-scale systems such as climate, important effects are caused by a combination of confounding processes that are not fully observable. Identifying sources from observations of the system state is vital for attribution and prediction, which inform critical policy decisions. These inverse problems are difficult because sources cannot be isolated and computational models are costly to simulate. Surrogate models may enable the many-query algorithms required for source identification, but data challenges arise from the high dimensionality of the state and source, the limited ensembles of costly model simulations available to train a surrogate, and the few, potentially noisy state observations available for inversion due to measurement limitations. The influence of auxiliary processes adds a further layer of uncertainty that confounds source identification. We introduce a framework based on (1) calibrating deep neural network surrogates to the flow maps provided by an ensemble of simulations obtained by varying sources, and (2) using these surrogates in a Bayesian framework to identify sources from observations via optimization. Focusing on an atmospheric dispersion exemplar, we find that the expressive and computationally efficient nature of the deep neural network operator surrogates in appropriately reduced dimension allows for source identification with uncertainty quantification using limited data. Introducing a variable wind field as an auxiliary process, we find that a Bayesian approximation error approach is essential for reliable source inversion when uncertainty due to wind stresses the algorithm.
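The inversion step can be illustrated with a minimal sketch. This is not the authors' implementation. It makes several loud assumptions: a linear toy map stands in for the deep neural network surrogate, so the maximum a posteriori (MAP) point has a closed form and no iterative optimizer is needed; the wind is reduced to one scalar `w` that perturbs the map; and every name (`forward`, `bae_statistics`, `map_estimate`, `G0`, `G1`) is hypothetical. The Bayesian approximation error idea shown here is to fix the wind at a nominal value, estimate the mean and covariance of the resulting model error from samples, and add them to the noise model.

```python
import numpy as np


def bae_statistics(forward, sources, winds, nominal_wind):
    """Sample mean and covariance of the error made by fixing the wind.

    `forward(source, wind)` is a hypothetical surrogate returning observations.
    """
    errors = np.array([
        forward(m, w) - forward(m, nominal_wind)
        for m, w in zip(sources, winds)
    ])
    return errors.mean(axis=0), np.cov(errors, rowvar=False)


def map_estimate(G, y, noise_cov, prior_mean, prior_cov,
                 err_mean=None, err_cov=None):
    """Closed-form MAP point and posterior covariance for a linear map G.

    Assumes Gaussian prior and noise. If approximation-error statistics are
    given, they shift the data and inflate the noise covariance.
    """
    n_obs = G.shape[0]
    if err_mean is None:
        err_mean = np.zeros(n_obs)
    if err_cov is None:
        err_cov = np.zeros((n_obs, n_obs))
    total_prec = np.linalg.inv(noise_cov + err_cov)
    prior_prec = np.linalg.inv(prior_cov)
    hessian = G.T @ total_prec @ G + prior_prec
    post_cov = np.linalg.inv(hessian)
    rhs = G.T @ total_prec @ (y - err_mean) + prior_prec @ prior_mean
    return post_cov @ rhs, post_cov


# Toy problem (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n_src, n_obs = 3, 8
G0 = rng.normal(size=(n_obs, n_src))  # surrogate at nominal wind
G1 = rng.normal(size=(n_obs, n_src))  # sensitivity of the map to wind


def forward(m, w):
    # Hypothetical wind-dependent surrogate: linear in the source m.
    return (G0 + w * G1) @ m


prior_mean = np.zeros(n_src)
prior_cov = np.eye(n_src)
noise_cov = 0.01 * np.eye(n_obs)

# Offline stage: sample sources and winds to characterize the model error.
sources = rng.multivariate_normal(prior_mean, prior_cov, size=500)
winds = rng.normal(scale=0.3, size=500)
err_mean, err_cov = bae_statistics(forward, sources, winds, nominal_wind=0.0)

# Synthetic observation generated with a wind the inversion does not know.
m_true = np.array([1.0, -0.5, 0.25])
y = forward(m_true, 0.4) + rng.normal(scale=0.1, size=n_obs)

# Inversion with the nominal-wind surrogate, without and with the correction.
m_plain, cov_plain = map_estimate(G0, y, noise_cov, prior_mean, prior_cov)
m_bae, cov_bae = map_estimate(G0, y, noise_cov, prior_mean, prior_cov,
                              err_mean, err_cov)
```

Comparing `np.trace(cov_plain)` with `np.trace(cov_bae)` shows the effect of the correction in this toy setting: the inflated noise covariance yields a wider posterior, so the reported uncertainty accounts for the unknown wind instead of ignoring it. With a nonlinear surrogate the closed-form solve would be replaced by numerical optimization of the same objective.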