We propose derivative-informed neural operators (DINOs), a general family of neural networks for approximating operators, understood as infinite-dimensional mappings from input function spaces to output function spaces or quantities of interest. After discretization, both inputs and outputs are high-dimensional. We aim to approximate not only the operators with improved accuracy but also their derivatives (Jacobians) with respect to the function-valued input parameter, in order to empower derivative-based algorithms in many applications, e.g., Bayesian inverse problems, optimization under parameter uncertainty, and optimal experimental design. The major difficulties are the computational cost of generating derivative training data and the high dimensionality of the problem, which leads to a large training cost. To address these challenges, we exploit the intrinsic low-dimensionality of the derivatives and develop algorithms for compressing derivative information and efficiently imposing it in neural operator training, yielding derivative-informed neural operators. We demonstrate that these advances can significantly reduce the costs of both data generation and training for large classes of problems (e.g., nonlinear steady-state parametric PDE maps), making these costs marginal or comparable to the costs without derivatives and, in particular, independent of the discretization dimension of the input and output functions. Moreover, we show that the proposed DINO achieves significantly higher accuracy than neural operators trained without derivative information, for both function approximation and derivative approximation (e.g., Gauss-Newton Hessian), especially when the training data are limited.
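To make the idea of imposing compressed derivative information concrete, the following is a minimal NumPy sketch of one plausible derivative-informed loss. It is not the authors' implementation. The one-hidden-layer `tanh` network, the random orthonormal bases `U` and `V`, the "teacher" network standing in for the true operator, and all function names are assumptions made for illustration; the abstract does not specify the architecture, the bases, or the exact form of the loss.

```python
import numpy as np

def mlp_forward(params, x):
    """One-hidden-layer tanh network; returns the output and hidden activations."""
    W1, b1, W2, b2 = params
    h = np.tanh(W1 @ x + b1)
    return W2 @ h + b2, h

def reduced_jacobian(params, x, U, V):
    """Compute U^T J(x) V without forming the full d_out x d_in Jacobian."""
    W1, b1, W2, b2 = params
    _, h = mlp_forward(params, x)
    # J = W2 diag(1 - h^2) W1, so U^T J V = (U^T W2) diag(1 - h^2) (W1 V).
    return (U.T @ W2) @ ((1.0 - h**2)[:, None] * (W1 @ V))

def dino_loss(params, x, y, J_red_true, U, V, weight=1.0):
    """Output misfit plus misfit of the reduced Jacobian (assumed loss form)."""
    y_pred, _ = mlp_forward(params, x)
    out_term = np.sum((y_pred - y) ** 2)
    jac_term = np.sum((reduced_jacobian(params, x, U, V) - J_red_true) ** 2)
    return out_term + weight * jac_term

def init(rng, d_in, d_out, width):
    """Random network parameters (hypothetical initialization)."""
    return (rng.standard_normal((width, d_in)) / np.sqrt(d_in),
            np.zeros(width),
            rng.standard_normal((d_out, width)) / np.sqrt(width),
            np.zeros(d_out))

rng = np.random.default_rng(0)
d_in, d_out, width, r = 200, 150, 32, 8  # discretization dims and reduced rank

# Hypothetical orthonormal reduced bases. Random here; in practice they would
# be chosen to capture the dominant input and output directions.
V, _ = np.linalg.qr(rng.standard_normal((d_in, r)))
U, _ = np.linalg.qr(rng.standard_normal((d_out, r)))

teacher = init(rng, d_in, d_out, width)  # stand-in for the true operator
student = init(rng, d_in, d_out, width)  # network being trained

x = rng.standard_normal(d_in)
y, _ = mlp_forward(teacher, x)
J_red_true = reduced_jacobian(teacher, x, U, V)  # r x r derivative training datum

loss_student = dino_loss(student, x, y, J_red_true, U, V)
loss_teacher = dino_loss(teacher, x, y, J_red_true, U, V)
```

In this sketch the derivative datum stored per sample is an `r x r` matrix rather than a `d_out x d_in` one, so its size does not grow with the discretization dimension. That is the sense in which compression can keep the added cost comparable to training without derivatives.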