In this paper, we propose a novel sequential data-driven method for equilibrium-based chemical simulations, which can be viewed as an instance of the machine learning approach known as active learning. The underlying idea is to treat the function to be estimated as a sample of a Gaussian process, which allows us to quantify the global uncertainty of the function estimate. Using this uncertainty estimate, and with almost no parameters to tune, the method sequentially selects the most relevant input points at which to evaluate the function in order to build a surrogate model. As a result, the number of function evaluations needed is greatly reduced. Our active learning method is validated through numerical experiments and applied to a complex chemical system commonly used in geoscience.
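To make the sequential selection idea concrete, the following is a minimal sketch of a Gaussian-process-based active learning loop. It is not the method of the paper: it assumes a one-dimensional input, a squared-exponential kernel with a fixed length scale, and a simple pointwise maximum-variance selection rule standing in for the global uncertainty criterion mentioned above. The names `rbf_kernel`, `gp_posterior`, `active_learning`, and `expensive_model` are hypothetical, and `expensive_model` is a toy function used in place of a real chemical equilibrium solver.

```python
import numpy as np


def rbf_kernel(a, b, length_scale):
    """Squared-exponential kernel between two 1-D point sets (assumed kernel)."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)


def gp_posterior(x_train, y_train, x_query, length_scale=0.2, jitter=1e-6):
    """Posterior mean and variance of a zero-mean, unit-variance GP."""
    k_tt = rbf_kernel(x_train, x_train, length_scale)
    k_tt += jitter * np.eye(len(x_train))  # small diagonal term for stability
    k_qt = rbf_kernel(x_query, x_train, length_scale)
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    v = np.linalg.solve(chol, k_qt.T)
    mean = k_qt @ alpha
    var = np.maximum(1.0 - np.sum(v**2, axis=0), 0.0)
    return mean, var


def active_learning(f, candidates, n_steps, length_scale=0.2):
    """Greedily evaluate f where the GP posterior variance is largest."""
    selected = [0, len(candidates) - 1]  # start from the two end points
    values = [f(candidates[i]) for i in selected]
    for _ in range(n_steps):
        _, var = gp_posterior(
            candidates[selected], np.array(values), candidates, length_scale
        )
        var[selected] = -np.inf  # never re-evaluate an already chosen point
        nxt = int(np.argmax(var))
        selected.append(nxt)
        values.append(f(candidates[nxt]))
    return candidates[selected], np.array(values)


def expensive_model(x):
    """Toy stand-in for a costly simulation (hypothetical)."""
    return np.sin(2.0 * np.pi * x)


candidates = np.linspace(0.0, 1.0, 101)
x_sel, y_sel = active_learning(expensive_model, candidates, n_steps=8)
surrogate, variance = gp_posterior(x_sel, y_sel, candidates)
print(f"{len(x_sel)} evaluations, max posterior variance {variance.max():.2e}")
```

In this sketch the surrogate is the posterior mean, and the only quantities to set are the kernel length scale and the evaluation budget. A practical implementation would likely estimate the kernel hyperparameters from data and handle multi-dimensional inputs.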