Many problems in computational science and engineering can be described in terms of approximating a smooth function of $d$ variables, defined over an unknown domain of interest $\Omega\subset \mathbb{R}^d$, from sample data. Here both the curse of dimensionality ($d\gg 1$) and the lack of domain knowledge, with $\Omega$ potentially irregular and/or disconnected, are confounding factors for sampling-based methods. Na\"{i}ve approaches often lead to wasted samples and inefficient approximation schemes. For example, uniform sampling can result in upwards of 20\% wasted samples in some problems. In surrogate model construction for computational uncertainty quantification (UQ), the high cost of computing each sample demands a more efficient sampling procedure. In recent years, methods for computing such approximations from sample data have been studied in the case of irregular domains, and the benefits of sampling measures that depend on an approximation space $P$ of dimension $\dim(P)=N$ have been demonstrated. In particular, such methods confer stability and well-conditioning with a sample complexity of $\mathcal{O}(N\log(N))$. The recently proposed adaptive sampling for general domains (ASGD) strategy is one method for constructing these sampling measures. The main contribution of this paper is to improve ASGD by adaptively updating the sampling measures over unknown domains. We achieve this by first introducing a general domain adaptivity strategy (GDAS), which approximates both the function and the domain of interest from sample points. Second, we propose adaptive sampling for unknown domains (ASUD), which generates sampling measures over a domain that may not be known in advance. Third, we derive least-squares techniques for polynomial approximation on unknown domains. Numerical results show that the ASUD approach can reduce the computational cost by as much as 50\% when compared with uniform sampling.