Learning data representations under uncertainty is an important task that arises in numerous machine learning applications. However, uncertainty quantification (UQ) techniques are computationally intensive and become prohibitively expensive for high-dimensional data. In this paper, we present a novel surrogate model for representation learning and uncertainty quantification, designed for data of moderate to high dimension. The proposed model combines a neural network approach for dimensionality reduction of the (potentially high-dimensional) data with a surrogate modeling method for learning the data distribution. We first employ a variational autoencoder (VAE) to learn a low-dimensional representation of the data distribution. We then propose to harness a polynomial chaos expansion (PCE) formulation to map this distribution to the output target. The PCE coefficients are learned from the distribution representation of the training data using a maximum mean discrepancy (MMD) approach. Our model enables us to (a) learn a representation of the data, (b) estimate uncertainty in the high-dimensional data system, and (c) match high-order moments of the output distribution, all without any prior statistical assumptions on the data. Numerical experiments are presented to illustrate the performance of the proposed method.
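To make the PCE and MMD ingredients concrete, the following is a minimal sketch and not the implementation used in this work. It assumes a one-dimensional latent variable with a standard normal distribution (standing in for VAE latent samples), a probabilists' Hermite basis for the PCE, and a Gaussian (RBF) kernel with a fixed bandwidth for the MMD. None of these choices are specified in the abstract, and the names `pce_eval`, `rbf_kernel`, and `mmd2` are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermevander


def pce_eval(z, coeffs):
    """Evaluate a 1-D PCE, sum_k c_k He_k(z), at latent samples z.

    Assumption: probabilists' Hermite polynomials He_k as the basis.
    """
    coeffs = np.asarray(coeffs, dtype=float)
    degree = len(coeffs) - 1
    basis = hermevander(z, degree)  # shape (n, degree + 1): He_0 .. He_degree
    return basis @ coeffs


def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel matrix between two 1-D sample arrays (assumed kernel)."""
    d2 = (x[:, None] - y[None, :]) ** 2
    return np.exp(-d2 / (2.0 * bandwidth ** 2))


def mmd2(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between samples x and y."""
    return (rbf_kernel(x, x, bandwidth).mean()
            + rbf_kernel(y, y, bandwidth).mean()
            - 2.0 * rbf_kernel(x, y, bandwidth).mean())


rng = np.random.default_rng(0)
z = rng.standard_normal(500)                    # stand-in for VAE latent samples
target = 1.0 + 2.0 * rng.standard_normal(500)   # toy output samples, N(1, 4)

# Two candidate coefficient vectors: one matching the toy target, one not.
good = pce_eval(z, [1.0, 2.0])   # 1 + 2 z
bad = pce_eval(z, [0.0, 0.5])    # 0.5 z

print("MMD^2 (matching coefficients):   ", mmd2(good, target))
print("MMD^2 (mismatched coefficients): ", mmd2(bad, target))
```

In this toy setting, the coefficients that reproduce the target distribution give the smaller MMD estimate, which is the quantity a training loop would minimize with respect to the PCE coefficients. Because the kernel compares whole sample sets rather than only means and variances, a small MMD also constrains higher-order moments of the output distribution.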