Neural networks can be used to learn the solution of partial differential equations (PDEs) on arbitrary domains without requiring a computational mesh. Common approaches embed the differential operators of the PDE in a structured loss function used to train the network. The most common training algorithm for neural networks is backpropagation, which relies on the gradient of the loss function with respect to the parameters of the network. In this work, we characterize the difficulty of training neural networks on physics by investigating how differential operators corrupt the back-propagated gradients. In particular, we show that perturbations present in the output of a neural network during the early stages of training lead to higher levels of noise in a structured loss function composed of high-order differential operators. These perturbations consequently corrupt the back-propagated gradients and impede convergence. We mitigate this issue by introducing auxiliary flux parameters that recast the PDE as a system of first-order differential equations. We then formulate a non-linear unconstrained optimization problem using the augmented Lagrangian method, which properly constrains the boundary conditions and adaptively focuses on regions of high gradients that are difficult to learn. We apply our approach to learn the solution of various benchmark PDE problems and demonstrate orders-of-magnitude improvement over existing approaches.
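As a minimal sketch of the reformulation described above (not the authors' exact formulation; the Poisson model problem and the symbols $u_\theta$, $\boldsymbol{q}_\theta$, $c(\theta)$, $\lambda$, $\mu$ are assumptions introduced here for illustration), consider $\nabla \cdot (\nabla u) = f$ in a domain $\Omega$ with boundary condition $u = g$ on $\partial\Omega$. Introducing the auxiliary flux $\boldsymbol{q} = \nabla u$ replaces the second-order operator with a first-order system, and the boundary condition enters as a constraint handled by augmented Lagrangian terms:
\begin{align}
\boldsymbol{q} - \nabla u &= 0, \qquad \nabla \cdot \boldsymbol{q} = f \quad \text{in } \Omega,\\
\mathcal{L}(\theta, \lambda; \mu) &= \big\|\nabla \cdot \boldsymbol{q}_\theta - f\big\|_{\Omega}^{2} + \big\|\boldsymbol{q}_\theta - \nabla u_\theta\big\|_{\Omega}^{2} + \lambda^{\top} c(\theta) + \tfrac{\mu}{2}\,\big\|c(\theta)\big\|^{2},
\end{align}
where $u_\theta$ and $\boldsymbol{q}_\theta$ are network outputs, $c(\theta)$ collects the boundary residuals $u_\theta - g$ on $\partial\Omega$, $\lambda$ is the vector of Lagrange multipliers updated as $\lambda \leftarrow \lambda + \mu\, c(\theta)$, and $\mu$ is a penalty coefficient. In this form only first derivatives of the network outputs are back-propagated, which is the mechanism by which the reformulation is meant to reduce gradient corruption.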