This position paper summarizes a recently developed research program focused on inference in data-centric science and engineering applications, and forecasts its trajectory over the next decade. In this context one often endeavours to learn complex systems in order to make better-informed predictions and high-stakes decisions under uncertainty. Three key challenges that must be met are robustness, generalizability, and interpretability. The Bayesian framework addresses all three, but brings with it a fourth, undesirable feature: it is typically far more expensive than its deterministic counterparts. In the 21st century, and increasingly over the past decade, a growing number of methods have emerged that leverage cheap low-fidelity models to precondition algorithms for inference with more expensive models, making Bayesian inference tractable for high-dimensional and expensive models. Notable examples are multilevel Monte Carlo (MLMC), multi-index Monte Carlo (MIMC), and their randomized counterparts (rMLMC), which provably achieve the dimension-independent canonical complexity rate, including in the infinite-dimensional setting: the cost required to reach a given mean squared error (MSE) scales as $1/\mathrm{MSE}$. Some parallelizability is typically lost in an inference context, but recently this has been largely recovered via novel double randomization approaches. Such an approach delivers i.i.d. samples of quantities of interest which are unbiased with respect to the infinite-resolution target distribution. Over the coming decade, this family of algorithms has the potential to transform data-centric science and engineering, as well as classical machine learning applications such as deep learning, by scaling up and scaling out fully Bayesian inference.
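To make the multilevel idea concrete, the following is a minimal sketch of a plain (forward, non-Bayesian) MLMC estimator on a toy problem. Everything in it is an illustrative assumption rather than the method of this paper: the hypothetical level-$l$ approximation $P_l(u) = (1 + u/2^l)^{2^l}$ of $\exp(u)$, the uniform input $U \sim \mathrm{Uniform}(0,1)$, and the per-level sample sizes. It shows only the telescoping structure $\mathbb{E}[P_L] = \mathbb{E}[P_0] + \sum_{l=1}^{L} \mathbb{E}[P_l - P_{l-1}]$, with most samples spent on the cheap coarse levels.

```python
import math
import random


def level_approx(u, level):
    """Hypothetical level-`level` approximation of exp(u): (1 + u/n)^n, n = 2^level."""
    n = 2 ** level
    return (1.0 + u / n) ** n


def mlmc_estimate(max_level, samples_per_level, rng):
    """Sum independent Monte Carlo estimates of E[P_0] and E[P_l - P_{l-1}], l = 1..L."""
    total = 0.0
    for level in range(max_level + 1):
        n_samples = samples_per_level[level]
        acc = 0.0
        for _ in range(n_samples):
            u = rng.random()  # the same draw feeds both levels (coupling)
            fine = level_approx(u, level)
            coarse = level_approx(u, level - 1) if level > 0 else 0.0
            acc += fine - coarse
        total += acc / n_samples
    return total


rng = random.Random(0)  # fixed seed, for reproducibility of the sketch
L = 8
# Assumed allocation: halve the sample size per level, with a floor of 50.
samples = [max(4000 // 2 ** l, 50) for l in range(L + 1)]
estimate = mlmc_estimate(L, samples, rng)
exact_limit = math.e - 1.0  # E[exp(U)] for U ~ Uniform(0, 1)
print(estimate, exact_limit)
```

Because the coupled differences $P_l - P_{l-1}$ shrink as $l$ grows, few samples are needed on the fine levels. The estimator remains biased with respect to the limit $\mathbb{E}[\exp(U)]$ by the truncation at level $L$; the randomized variants mentioned above are designed to remove that bias, which this sketch does not attempt.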