Many problems in computational science and engineering can be described in terms of approximating a smooth function of $d$ variables, defined over an unknown domain of interest $\Omega\subset \mathbb{R}^d$, from sample data. Here both the curse of dimensionality ($d\gg 1$) and the lack of domain knowledge, with $\Omega$ potentially irregular and/or disconnected, are confounding factors for sampling-based methods. Na\"{i}ve approaches often lead to wasted samples and inefficient approximation schemes. For example, uniform sampling can result in upwards of 20\% wasted samples in some problems. In surrogate model construction in computational uncertainty quantification (UQ), the high cost of computing each sample demands a more efficient sampling procedure. In recent years, methods for computing such approximations from sample data have been studied in the case of irregular domains, and the advantages of computing sampling measures that depend on an approximation space $P$ with $\dim(P)=N$ have been shown. In particular, such methods confer stability and well-conditioning, with a sample complexity of $\mathcal{O}(N\log(N))$. The recently proposed adaptive sampling for general domains (ASGD) strategy is one method for constructing these sampling measures. The main contribution of this paper is to improve ASGD by adaptively updating the sampling measures over unknown domains. We achieve this by first introducing a general domain adaptivity strategy (GDAS), which approximates the function and domain of interest from sample points. Second, we propose adaptive sampling for unknown domains (ASUD), which generates sampling measures over a domain that may not be known in advance. Our results show that ASUD consistently achieves errors equal to or smaller than those of uniform sampling, while using fewer, and often significantly fewer, evaluations.
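As a rough illustration of what a sampling measure depending on an approximation space $P$ can look like, the following Python sketch builds a discrete measure on a candidate grid with probabilities proportional to the sum of squares of an orthonormal basis of $P$, then draws about $N\log(N)$ points and performs a weighted least-squares fit. This is one common construction from the literature and is not claimed to be the exact ASGD, GDAS, or ASUD procedure of the paper. The two-disc domain, the total-degree polynomial space, the grid size, the test function and all names (`in_domain`, `grid`, `probs`) are assumptions made purely for illustration, and the domain is taken as known here.

```python
import numpy as np

rng = np.random.default_rng(0)

def in_domain(pts):
    # Hypothetical irregular, disconnected domain: two disjoint discs in [-1, 1]^2.
    d1 = np.sum((pts - np.array([-0.5, -0.5])) ** 2, axis=1) < 0.16
    d2 = np.sum((pts - np.array([0.5, 0.5])) ** 2, axis=1) < 0.16
    return d1 | d2

# Candidate grid: uniform points in the bounding box, filtered to the domain.
box = rng.uniform(-1.0, 1.0, size=(20000, 2))
grid = box[in_domain(box)]
K = len(grid)

# Assumed approximation space P: bivariate polynomials of total degree <= 3.
degree = 3
x, y = grid[:, 0], grid[:, 1]
cols = [x**i * y**j for i in range(degree + 1) for j in range(degree + 1 - i)]
A = np.column_stack(cols) / np.sqrt(K)

# Reduced QR yields a basis orthonormal w.r.t. the discrete uniform measure on the grid.
Q, _ = np.linalg.qr(A)
N = Q.shape[1]

# Sampling probabilities proportional to the sum of squared basis values at each point.
probs = np.sum(Q**2, axis=1) / N
probs = probs / probs.sum()  # guard against floating-point round-off

# Draw roughly N log N sample points from the resulting discrete measure.
M = int(np.ceil(N * np.log(N)))
idx = rng.choice(K, size=M, p=probs)
samples = grid[idx]

# Weighted least-squares fit of an assumed smooth test function f.
f = lambda p: np.exp(p[:, 0] * p[:, 1])
scale = 1.0 / np.sqrt(M * probs[idx])           # square roots of the weights
coef, *_ = np.linalg.lstsq(scale[:, None] * Q[idx], scale * f(samples) / np.sqrt(K), rcond=None)

# Relative error of the fit, measured over the whole candidate grid.
approx = np.sqrt(K) * (Q @ coef)
rel_err = np.linalg.norm(approx - f(grid)) / np.linalg.norm(f(grid))
print(f"N = {N}, M = {M}, relative grid error = {rel_err:.2e}")
```

Because every candidate point lies in the domain by construction, no function evaluations are wasted in this sketch. When the domain is unknown, that filtering step is unavailable in advance, which is the situation the abstract's GDAS and ASUD strategies are aimed at.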