Surrogate-modelling techniques, including Polynomial Chaos Expansion (PCE), are commonly used for statistical estimation (also known as Uncertainty Quantification) of quantities of interest obtained from expensive computational models. PCE is a data-driven, regression-based technique that relies on spectral polynomials as basis functions. In this technique, the outputs of a few numerical simulations are used to estimate the PCE coefficients within a regression framework combined with regularization techniques, where the regularization parameters are estimated using standard cross-validation, as applied in supervised machine learning methods. In the present work, we introduce an efficient method for estimating the PCE coefficients that combines Elastic Net regularization with a data-driven feature ranking approach. Our goal is to increase the probability of identifying the most significant PCE components by assigning each PCE coefficient a numerical value reflecting the magnitude of the coefficient and its stability with respect to perturbations in the input data. In our evaluations, the proposed approach has shown a high convergence rate for high-dimensional problems, where standard feature ranking might be challenging due to the curse of dimensionality. The presented method is implemented within a standard machine learning library (Scikit-learn), allowing for easy experimentation with various solvers and regularization techniques (e.g. Tikhonov, LASSO, LARS, Elastic Net) and enabling automatic cross-validation using a widely used and well-tested implementation. We present a set of numerical tests on standard analytical functions, a two-phase subsurface flow model, and a simulation dataset for CO2 sequestration in a saline aquifer. For all test cases, the proposed approach resulted in a significant increase in PCE convergence rates.
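To make the idea concrete, the following is a minimal sketch of regression-based PCE with Elastic Net and a stability-aware ranking of basis terms. It is not the implementation described above: the Legendre basis (which assumes inputs scaled to [-1, 1]), the subsampling scheme, the score defined as mean coefficient magnitude times selection frequency, and the helper names `multi_indices`, `legendre_design` and `rank_terms` are all illustrative assumptions. Only `ElasticNetCV` from Scikit-learn and `legval` from NumPy are existing library calls.

```python
import itertools

import numpy as np
from numpy.polynomial.legendre import legval
from sklearn.linear_model import ElasticNetCV


def multi_indices(dim, degree):
    """Non-constant multi-indices with total degree <= degree.

    The constant term is left to the regressor's intercept.
    """
    return [a for a in itertools.product(range(degree + 1), repeat=dim)
            if 0 < sum(a) <= degree]


def legendre_design(X, indices):
    """Evaluate tensor-product Legendre polynomials at X in [-1, 1]^dim."""
    dim = X.shape[1]
    max_deg = max(max(a) for a in indices)
    unit = np.eye(max_deg + 1)
    # P[k, :, j] holds the degree-k Legendre polynomial evaluated at X[:, j].
    P = np.stack([legval(X, unit[k]) for k in range(max_deg + 1)])
    cols = [np.prod([P[a[j], :, j] for j in range(dim)], axis=0)
            for a in indices]
    return np.column_stack(cols)


def rank_terms(X, y, indices, n_rounds=10, frac=0.8, seed=0):
    """Rank PCE terms by coefficient magnitude and stability.

    Assumed scoring rule (illustrative only): refit Elastic Net on random
    subsamples of the training data, then multiply the mean absolute
    coefficient by the fraction of fits in which the term was non-zero.
    """
    rng = np.random.default_rng(seed)
    Phi = legendre_design(X, indices)
    n = len(y)
    m = int(frac * n)
    coefs = np.empty((n_rounds, Phi.shape[1]))
    for r in range(n_rounds):
        idx = rng.choice(n, size=m, replace=False)
        # Regularization strength is chosen by 5-fold cross-validation.
        model = ElasticNetCV(l1_ratio=0.9, cv=5, max_iter=10000)
        model.fit(Phi[idx], y[idx])
        coefs[r] = model.coef_
    magnitude = np.abs(coefs).mean(axis=0)
    frequency = (coefs != 0).mean(axis=0)
    score = magnitude * frequency
    order = np.argsort(score)[::-1]  # most significant term first
    return order, score
```

Calling `rank_terms(X, y, multi_indices(dim, degree))` returns the term indices sorted by score, so a sparse PCE could then be refit on only the top-ranked terms. Swapping `ElasticNetCV` for `LassoCV` or `LassoLarsCV` would be one way to experiment with the other regularizers mentioned in the text.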