Gradient-based local optimization has been shown to improve the results of genetic programming (GP) for symbolic regression. Several state-of-the-art GP implementations use iterative nonlinear least squares (NLS) algorithms, such as the Levenberg-Marquardt algorithm, for local optimization. The effectiveness of NLS algorithms depends on appropriate scaling and conditioning of the optimization problem, an aspect that has so far been ignored in the symbolic regression and GP literature. In this study, we use the singular value decomposition of NLS Jacobian matrices to determine their numerical rank and condition number. We perform experiments with a GP implementation and six different benchmark datasets. Our results show that rank-deficient and ill-conditioned Jacobian matrices occur frequently for all datasets. The issue is less extreme when GP tree size is restricted and when many nonlinear functions are included in the function set.
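The rank and conditioning check described above can be illustrated with a minimal NumPy sketch. It is not the implementation used in the study: the model expression, the parameter values, the helper names (`jacobian`, `rank_and_condition`), and the rank tolerance are all assumptions chosen for illustration. The tolerance follows the default of `numpy.linalg.matrix_rank` (largest singular value times the larger matrix dimension times machine epsilon). The hypothetical expression contains the product of two parameters, a typical over-parameterization that makes two Jacobian columns linearly dependent.

```python
import numpy as np


def jacobian(theta, x):
    """Analytic Jacobian of the hypothetical model
    f(x) = theta0 * exp(theta1 * x) + theta2 * theta3 * x
    with respect to theta, evaluated at all points in x."""
    e = np.exp(theta[1] * x)
    return np.column_stack([
        e,                 # df/dtheta0
        theta[0] * x * e,  # df/dtheta1
        theta[3] * x,      # df/dtheta2
        theta[2] * x,      # df/dtheta3 (a multiple of the previous column)
    ])


def rank_and_condition(J):
    """Return (numerical rank, condition number) from the singular values of J."""
    s = np.linalg.svd(J, compute_uv=False)  # singular values, descending
    # Assumed tolerance: same rule as numpy.linalg.matrix_rank's default.
    tol = s[0] * max(J.shape) * np.finfo(J.dtype).eps
    rank = int(np.sum(s > tol))
    # Ratio of largest to smallest singular value; infinite if the smallest is zero.
    cond = s[0] / s[-1] if s[-1] > 0 else np.inf
    return rank, cond


# Illustrative data and parameter values (assumptions, not from the study).
x = np.linspace(0.0, 1.0, 50)
theta = np.array([1.0, 0.5, 2.0, 4.0])
J = jacobian(theta, x)

rank, cond = rank_and_condition(J)
print(f"columns: {J.shape[1]}, numerical rank: {rank}, condition number: {cond:.3e}")
```

In this sketch the Jacobian has four columns but numerical rank three, because `theta2` and `theta3` only enter the model through their product. Merging the two into a single parameter would remove the redundant column and restore full column rank.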