Multi-objective symbolic regression has the advantage that, while the accuracy of the learned models is maximized, their complexity is adapted automatically and need not be specified a priori. The result of the optimization is no longer a single solution but a whole Pareto front describing the trade-off between accuracy and complexity. In this contribution we study which complexity measures are most appropriate for symbolic regression when performing multi-objective optimization with NSGA-II. Furthermore, we present a novel complexity measure that includes semantic information based on the function symbols occurring in the models, and we test its effects on several benchmark datasets. Results comparing multiple complexity measures are presented in terms of the achieved accuracy and model length to illustrate how the search direction of the algorithm is affected.
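To make the notion of a Pareto front over accuracy and complexity concrete, the following is a minimal sketch of a non-dominated filter in Python. It is an illustration only, not the implementation studied here: the candidate models, their error values, and their complexity values are invented, and the names `Model`, `dominates`, and `pareto_front` are hypothetical. Accuracy is expressed as an error so that both objectives are minimized. NSGA-II itself additionally ranks the population into successive fronts and applies crowding-distance selection, which this sketch omits.

```python
from typing import List, Tuple

# (expression, error, complexity); both objectives are minimized.
# All values below are made up for illustration.
Model = Tuple[str, float, int]


def dominates(a: Model, b: Model) -> bool:
    """Return True if a is no worse than b in both objectives and strictly better in one."""
    no_worse = a[1] <= b[1] and a[2] <= b[2]
    strictly_better = a[1] < b[1] or a[2] < b[2]
    return no_worse and strictly_better


def pareto_front(models: List[Model]) -> List[Model]:
    """Keep only the models that no other model dominates, ordered by complexity."""
    front = [m for m in models if not any(dominates(o, m) for o in models)]
    return sorted(front, key=lambda m: m[2])


candidates: List[Model] = [
    ("x", 0.40, 1),
    ("x + c", 0.25, 3),
    ("sin(x) + c", 0.25, 4),           # same error as "x + c" but more complex
    ("x * x + c", 0.10, 5),
    ("exp(x * x) + c * x", 0.12, 9),   # worse than "x * x + c" in both objectives
]

front = pareto_front(candidates)
for expression, error, complexity in front:
    print(f"{expression:12s} error={error:.2f} complexity={complexity}")
```

In this toy example the complexity value is simply a stand-in for model length; replacing it with a different complexity measure changes which models survive the filter, which is the effect on search direction that the comparison of measures is meant to expose.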