Symbolic regression is a nonlinear regression method that is commonly performed with evolutionary computation methods such as genetic programming. Quantifying the uncertainty of regression models is important for interpreting the models and for decision making. The linear approximation and so-called likelihood profiles are well-known approaches for calculating confidence and prediction intervals for nonlinear regression models. These simple and effective techniques have so far been completely ignored in the genetic programming literature. In this work we describe the calculation of likelihood profiles in detail and provide illustrative examples with models created by three different symbolic regression algorithms on two different datasets. The examples highlight the importance of likelihood profiles for understanding the limitations of symbolic regression models and for helping the user make an informed post-prediction decision.
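To make the idea of a likelihood profile concrete, the following is a minimal sketch and not the implementation used in this work. It assumes a hypothetical two-parameter model y = a * exp(b * x), a small made-up dataset, Gaussian noise with unknown variance, and an asymptotic chi-square threshold for the likelihood ratio; all function names are illustrative. The parameter b is held fixed at trial values, the remaining parameter a is re-optimized (here in closed form), and the confidence interval for b is the set of values whose profiled likelihood ratio statistic stays below the threshold.

```python
import math
from statistics import NormalDist


def profile_sse(b, xs, ys):
    """Sum of squared errors with b held fixed and a re-optimized.

    For y = a * exp(b * x) the optimal a at fixed b is a linear
    least-squares solution, so no iterative optimizer is needed.
    """
    basis = [math.exp(b * x) for x in xs]
    a = sum(y * g for y, g in zip(ys, basis)) / sum(g * g for g in basis)
    sse = sum((y - a * g) ** 2 for y, g in zip(ys, basis))
    return sse, a


def golden_min(f, lo, hi, tol=1e-10):
    """Golden-section search for the minimizer of a unimodal f on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    while hi - lo > tol:
        if f(c) < f(d):
            hi, d = d, c
            c = hi - inv_phi * (hi - lo)
        else:
            lo, c = c, d
            d = lo + inv_phi * (hi - lo)
    return (lo + hi) / 2


def bisect(g, lo, hi, tol=1e-10):
    """Root of g on [lo, hi], assuming g(lo) and g(hi) differ in sign."""
    g_lo_positive = g(lo) > 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (g(mid) > 0) == g_lo_positive:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def profile_interval(xs, ys, b_lo, b_hi, alpha=0.05):
    """Profile-likelihood confidence interval for b, searched in [b_lo, b_hi]."""
    n = len(xs)
    b_hat = golden_min(lambda b: profile_sse(b, xs, ys)[0], b_lo, b_hi)
    sse_hat, a_hat = profile_sse(b_hat, xs, ys)
    # Chi-square(1) quantile, obtained as the squared normal quantile.
    crit = NormalDist().inv_cdf(1 - alpha / 2) ** 2

    def excess(b):
        # Likelihood ratio statistic (noise variance profiled out) minus threshold.
        return n * math.log(profile_sse(b, xs, ys)[0] / sse_hat) - crit

    if excess(b_lo) <= 0 or excess(b_hi) <= 0:
        raise ValueError("search range does not bracket the interval")
    lower = bisect(excess, b_lo, b_hat)
    upper = bisect(excess, b_hat, b_hi)
    return a_hat, b_hat, (lower, upper)


# Made-up data, roughly following y = 2 * exp(-0.7 * x) with small perturbations.
xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
ys = [2.03, 1.38, 1.01, 0.68, 0.51, 0.33, 0.26, 0.16]

a_hat, b_hat, (lower, upper) = profile_interval(xs, ys, b_lo=-2.0, b_hi=0.0)
print(f"a_hat={a_hat:.4f}, b_hat={b_hat:.4f}, 95% interval for b: [{lower:.4f}, {upper:.4f}]")
```

Unlike an interval from the linear approximation, the interval obtained this way need not be symmetric around the estimate, because it follows the actual shape of the likelihood. For small samples, a threshold based on the t or F distribution is often used in place of the asymptotic chi-square value assumed here.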