This position paper summarizes a recently developed research program focused on inference in the context of data-centric science and engineering applications, and forecasts its trajectory over the next decade. In this context, one often endeavours to learn complex systems in order to make more informed predictions and high-stakes decisions under uncertainty. Three key challenges which must be met are robustness, generalizability, and interpretability. The Bayesian framework addresses these three challenges elegantly, while bringing with it a fourth, undesirable feature: it is typically far more expensive than its deterministic counterparts. In the 21st century, and increasingly over the past decade, a growing number of methods have emerged which allow one to leverage cheap low-fidelity models in order to precondition algorithms for performing inference with more expensive models, and thereby make Bayesian inference tractable in the context of high-dimensional and expensive models. Notable examples are multilevel Monte Carlo (MLMC), multi-index Monte Carlo (MIMC), and their randomized counterparts (rMLMC), which are able to provably achieve a dimension-independent (including $\infty$-dimensional) canonical complexity rate with respect to mean squared error (MSE), i.e., a cost proportional to $1/\mathrm{MSE}$. Some parallelizability is typically lost in an inference context, but recently this has been largely recovered via novel double randomization approaches. Such an approach delivers i.i.d. samples of quantities of interest which are unbiased with respect to the infinite-resolution target distribution. Over the coming decade, this family of algorithms has the potential to transform data-centric science and engineering, as well as classical machine learning applications such as deep learning, by scaling up and scaling out fully Bayesian inference.
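To make the multilevel idea concrete, the following is a minimal sketch of a plain (non-randomized) MLMC estimator. It is an illustration under stated assumptions and not the method of this paper: the toy model (the mean of a geometric Brownian motion at time $T$, discretized by Euler steps of size $2^{-l}T$ at level $l$), the parameter values, and the function names `euler_pair` and `mlmc_estimate` are all hypothetical choices made for this example. The cheap coarse levels absorb most of the samples, while the expensive fine levels only estimate small corrections.

```python
import math
import random


def euler_pair(level, rng, x0=1.0, mu=0.05, sigma=0.2, T=1.0):
    """Return coupled (fine, coarse) Euler endpoints of a toy GBM.

    The fine path uses 2**level steps and the coarse path 2**(level - 1)
    steps, both driven by the same Brownian increments. At level 0 there
    is no coarser path, so the coarse value is 0.0 by convention.
    All parameters are illustrative assumptions.
    """
    n_fine = 2 ** level
    h = T / n_fine
    x_fine = x0
    x_coarse = x0
    dw_sum = 0.0  # accumulates two fine increments for one coarse step
    for step in range(n_fine):
        dw = rng.gauss(0.0, math.sqrt(h))
        x_fine += mu * x_fine * h + sigma * x_fine * dw
        dw_sum += dw
        if level > 0 and step % 2 == 1:
            x_coarse += mu * x_coarse * 2.0 * h + sigma * x_coarse * dw_sum
            dw_sum = 0.0
    return x_fine, (x_coarse if level > 0 else 0.0)


def mlmc_estimate(max_level, samples_per_level, rng):
    """Telescoping-sum estimate of E[P_L] = sum_l E[P_l - P_{l-1}]."""
    total = 0.0
    for level in range(max_level + 1):
        n = samples_per_level[level]
        acc = 0.0
        for _ in range(n):
            fine, coarse = euler_pair(level, rng)
            acc += fine - coarse
        total += acc / n
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    # Halve the sample count as the per-sample cost doubles.
    counts = [20000, 10000, 5000, 2500, 1250, 625]
    print("MLMC estimate:", mlmc_estimate(5, counts, rng))
    print("Exact mean   :", math.exp(0.05))
```

Because the sum is truncated at a finest level, this estimator retains the discretization bias of that level; the randomized variants mentioned above replace the fixed truncation with a random choice of level in order to remove that bias.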