The problem of approximating smooth, multivariate functions from sample points arises in many applications in scientific computing, e.g., in computational Uncertainty Quantification (UQ) for science and engineering. In these applications, the target function may represent a desired quantity of interest of a parameterized Partial Differential Equation (PDE). Because each sample is computed by solving a PDE, which is costly, sample efficiency is a key concern in these applications. Recently, there has been increasing focus on the use of Deep Neural Networks (DNNs) and Deep Learning (DL) for learning such functions from data. In this work, we propose an adaptive sampling strategy, CAS4DL (Christoffel Adaptive Sampling for Deep Learning), to increase the sample efficiency of DL for multivariate function approximation. Our novel approach is based on interpreting the second-to-last layer of a DNN as a dictionary of functions defined by the nodes on that layer. With this viewpoint, we then define an adaptive sampling strategy motivated by adaptive sampling schemes recently proposed for linear approximation schemes, wherein samples are drawn randomly with respect to the Christoffel function of the subspace spanned by this dictionary. We present numerical experiments comparing CAS4DL with standard Monte Carlo (MC) sampling. Our results demonstrate that CAS4DL often yields substantial savings in the number of samples required to achieve a given accuracy, particularly in the case of smooth activation functions, and that it shows better stability than MC. These results are therefore a promising step towards fully adapting DL to scientific computing applications.
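The sampling idea described above can be illustrated with a minimal sketch. It is not the authors' implementation and makes several assumptions: the domain is replaced by a finite grid of candidate points, the dictionary is given as a matrix `Phi` of its values on that grid, and orthonormalization is done with an SVD with respect to the uniform discrete measure on the grid. The names `toy_dictionary`, `christoffel_sampling_probs`, and `draw_samples` are hypothetical, and the random tanh features merely stand in for the second-to-last layer of a trained DNN. Samples are drawn with probability proportional to the sum of squares of the orthonormalized dictionary, which is the reciprocal of the Christoffel function in the usual convention.

```python
import numpy as np


def toy_dictionary(points, n_nodes, rng):
    """Hypothetical stand-in for a DNN's second-to-last layer:
    n_nodes random tanh features evaluated at the given points."""
    d = points.shape[1]
    W = rng.normal(size=(d, n_nodes))
    b = rng.normal(size=n_nodes)
    return np.tanh(points @ W + b)  # shape (K, n_nodes)


def christoffel_sampling_probs(Phi, tol=1e-10):
    """Discrete sampling probabilities on a candidate grid.

    Phi: (K, N) array whose columns are the dictionary functions
    evaluated at K candidate points (an assumed discretization).
    """
    K = Phi.shape[0]
    # Orthonormalize the span of the columns w.r.t. the uniform
    # discrete measure on the grid, discarding near-null directions.
    U, s, _ = np.linalg.svd(Phi / np.sqrt(K), full_matrices=False)
    if s[0] == 0.0:
        raise ValueError("dictionary is identically zero on the grid")
    rank = int(np.sum(s > tol * s[0]))
    Q = np.sqrt(K) * U[:, :rank]  # satisfies (1/K) * Q.T @ Q = I
    # Sum of squares of the orthonormal basis at each grid point
    # (reciprocal of the Christoffel function of the subspace).
    kernel = np.sum(Q**2, axis=1)
    return kernel / kernel.sum(), rank


def draw_samples(candidates, Phi, m, rng):
    """Draw m candidate points and the matching importance weights."""
    probs, _ = christoffel_sampling_probs(Phi)
    K = len(candidates)
    idx = rng.choice(K, size=m, replace=True, p=probs)
    # Weights that correct for non-uniform sampling in a weighted
    # least-squares or weighted training loss.
    weights = 1.0 / (K * probs[idx])
    return candidates[idx], weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    grid = rng.uniform(-1.0, 1.0, size=(2000, 2))  # candidate points
    Phi = toy_dictionary(grid, n_nodes=10, rng=rng)
    new_points, w = draw_samples(grid, Phi, m=50, rng=rng)
    print(new_points.shape, w.shape)
```

In an adaptive loop of the kind the abstract describes, one would presumably retrain the network on the enlarged sample set, re-evaluate the second-to-last layer on the grid, and repeat; those steps are omitted here because the abstract does not specify them.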