Physics-informed machine learning and inverse modeling require the solution of ill-conditioned non-convex optimization problems. First-order methods, such as SGD and ADAM, and quasi-Newton methods, such as BFGS and L-BFGS, have been applied with some success to optimization problems involving deep neural networks in computational engineering inverse problems. However, empirical evidence shows that convergence and accuracy remain a challenge for these methods. Our study revealed at least two intrinsic defects of these methods when applied to coupled systems of partial differential equations (PDEs) and deep neural networks (DNNs): (1) convergence is often slow, with long plateaus that make it difficult to determine whether the method has converged; (2) quasi-Newton methods do not provide a sufficiently accurate approximation of the Hessian matrix, which typically leads to early termination (one of the stopping criteria of the optimizer is satisfied although the achieved error is far from minimal). Based on these observations, we propose to use trust region methods for optimizing coupled systems of PDEs and DNNs. Specifically, we developed an algorithm for second-order physics-constrained learning, an efficient technique to calculate Hessian matrices based on computational graphs. We show that trust region methods overcome many of these defects and exhibit remarkably fast convergence and superior accuracy compared to ADAM, BFGS, and L-BFGS.
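To make the contrast between a quasi-Newton method and a Hessian-based trust region method concrete, the following minimal sketch runs both from the same starting point. It is only an illustration under stated assumptions: the two-dimensional Rosenbrock function stands in for an ill-conditioned non-convex objective (it is not a coupled PDE-DNN system), the exact Hessian comes from SciPy's `rosen_hess` rather than from the computational-graph technique described above, and the helper name `compare_optimizers` is hypothetical.

```python
import numpy as np
from scipy.optimize import minimize, rosen, rosen_der, rosen_hess


def compare_optimizers(x0):
    """Run BFGS and a trust region Newton method from the same start.

    Assumption: the Rosenbrock function is a toy stand-in for an
    ill-conditioned non-convex objective, not a PDE-DNN problem.
    """
    results = {}
    # Quasi-Newton: approximates the Hessian from successive gradients.
    results["BFGS"] = minimize(rosen, x0, method="BFGS", jac=rosen_der)
    # Trust region: uses the exact Hessian and adapts the step radius
    # depending on how well the quadratic model predicts the decrease.
    results["trust-exact"] = minimize(
        rosen, x0, method="trust-exact", jac=rosen_der, hess=rosen_hess
    )
    return results


x0 = np.array([-1.2, 1.0])  # classic starting point for this test function
results = compare_optimizers(x0)
for name, res in results.items():
    print(f"{name}: f = {res.fun:.3e}, iterations = {res.nit}")
```

On a toy problem this small, both methods are expected to reach the minimizer at (1, 1), so the sketch shows how the two optimizers are invoked rather than reproducing the accuracy gap reported for coupled PDE-DNN systems.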