A statistical emulator can serve as a surrogate for complex physics-based calculations, drastically reducing the computational cost. Its successful implementation hinges on an accurate representation of the nonlinear response surface over a high-dimensional input space. Conventional ``space-filling'' designs, including random sampling and Latin hypercube sampling, become inefficient as the dimensionality of the input variables increases, and the predictive accuracy of the emulator can degrade substantially for test inputs far from the training inputs. To address this fundamental challenge, we develop a reliable emulator for predicting complex functionals by active learning with error control (ALEC). The algorithm is applicable to infinite-dimensional mappings and provides high-fidelity predictions with a controlled predictive error. Its computational efficiency is demonstrated by emulating classical density functional theory (cDFT) calculations, a statistical-mechanical method widely used in modeling the equilibrium properties of complex molecular systems. We show that ALEC is much more accurate than conventional emulators based on Gaussian processes with ``space-filling'' designs and than alternative active learning methods. It is also computationally more efficient than direct cDFT calculations. ALEC can be a reliable building block for emulating expensive functionals owing to its minimal computational cost, controllable predictive error, and fully automatic operation.