Interpretable regression models are important in many application domains because they allow experts to understand relations between variables from sparse data. Symbolic regression meets this need by searching the space of all free-form equations that can be constructed from elementary algebraic functions. While explicit mathematical functions can be rediscovered this way, determining unknown numerical constants during the search has often been neglected. We propose a new multi-objective memetic algorithm that exploits a differentiable Cartesian Genetic Programming encoding to learn constants during evolutionary loops. We show that this approach is competitive with, or outperforms, machine-learned black-box regression models and hand-engineered fits for two applications from space: thermal power estimation for Mars Express and the determination of the age of stars by gyrochronology.
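To make the idea of learning constants inside an evolutionary loop concrete, the following is a minimal sketch of the local constant-tuning step only. It is not the algorithm proposed here: the fixed expression `f(x) = c0 * x**2 + c1`, the synthetic data, the plain gradient-descent update, and all names (`model`, `mse_and_grad`, `fit_constants`) are assumptions chosen for illustration. In a memetic setting, an evolutionary search would propose the expression structure, and a gradient-based step of this kind would refine its constants.

```python
# Illustrative sketch (assumptions throughout): tune the constants of one
# fixed, hypothetical expression f(x) = c0 * x**2 + c1 by gradient descent.
# An evolutionary loop would supply the expression structure; only the
# local constant-learning step is shown here.

def model(x, c):
    """Evaluate the hypothetical expression for constants c = (c0, c1)."""
    return c[0] * x * x + c[1]

def mse_and_grad(xs, ys, c):
    """Return the mean squared error and its gradient w.r.t. (c0, c1)."""
    n = len(xs)
    loss, g0, g1 = 0.0, 0.0, 0.0
    for x, y in zip(xs, ys):
        r = model(x, c) - y          # residual for this sample
        loss += r * r / n
        g0 += 2.0 * r * x * x / n    # d(loss)/d(c0)
        g1 += 2.0 * r / n            # d(loss)/d(c1)
    return loss, (g0, g1)

def fit_constants(xs, ys, c=(1.0, 0.0), lr=0.1, steps=2000):
    """Plain gradient descent on the constants (assumed step size and budget)."""
    c = list(c)
    for _ in range(steps):
        _, (g0, g1) = mse_and_grad(xs, ys, c)
        c[0] -= lr * g0
        c[1] -= lr * g1
    return c

# Synthetic data generated from known constants (2.5 and -0.7).
xs = [i / 10.0 for i in range(-10, 11)]
ys = [2.5 * x * x - 0.7 for x in xs]

c_fit = fit_constants(xs, ys)
final_loss, _ = mse_and_grad(xs, ys, c_fit)
print(c_fit, final_loss)
```

Because the expression is linear in its constants, the loss is convex and simple gradient descent suffices in this toy case; expressions that are nonlinear in their constants would generally need a more careful local optimizer and initialization.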