Symbolic regression is a powerful system identification technique for industrial scenarios where no prior knowledge about the model structure is available. Such scenarios often require specific model properties, such as interpretability, robustness, trustworthiness, and plausibility, that are not easily achieved with standard approaches like genetic programming for symbolic regression. In this chapter we introduce a deterministic symbolic regression algorithm specifically designed to address these issues. The algorithm uses a context-free grammar to produce models whose parameters are fitted by a non-linear least squares local optimization procedure. Structural restrictions, together with a caching mechanism that detects semantically equivalent solutions, guarantee that the enumeration of all possible models is finite. The enumeration order is established by heuristics designed to improve search efficiency. Empirical tests on a comprehensive benchmark suite show that our approach is competitive with genetic programming on many noiseless problems while maintaining desirable properties such as simple, reliable models and reproducibility.
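To make the enumeration idea in the abstract concrete, here is a minimal sketch in Python. It is not the chapter's implementation. The toy grammar, the length limit `max_len`, and the function names `canonical` and `enumerate_models` are all assumptions made for illustration. The sketch derives sentences from a small context-free grammar breadth-first, bounds the search with a structural restriction on sentence length, and uses a cache of canonical forms to skip semantically equivalent models. Parameter fitting by non-linear least squares is deliberately left out.

```python
from collections import deque

# Hypothetical toy grammar (an assumption, not taken from the chapter).
# Keys are nonterminals; every other symbol is a terminal.
GRAMMAR = {
    "E": [["T"], ["T", "+", "E"]],   # expression: sum of terms
    "T": [["F"], ["F", "*", "T"]],   # term: product of factors
    "F": [["x"], ["y"]],             # factor: an input variable
}


def canonical(sentence):
    """Return a key shared by semantically equivalent sentences.

    Factors are sorted within each term and terms are sorted within the
    sum, so commutative reorderings map to the same key. Duplicate terms
    are merged, on the assumption that each term would later receive its
    own fitted coefficient (so c1*x + c2*x is the same model as c*x).
    """
    terms, term = [], []
    for token in sentence:
        if token == "+":
            terms.append(tuple(sorted(term)))
            term = []
        elif token != "*":
            term.append(token)
    terms.append(tuple(sorted(term)))
    return tuple(sorted(set(terms)))


def enumerate_models(max_len):
    """Yield one sentence per equivalence class, breadth-first.

    Forms longer than max_len symbols are discarded (the structural
    restriction), which keeps the enumeration finite.
    """
    seen = set()                     # cache of canonical forms
    queue = deque([("E",)])
    while queue:
        form = queue.popleft()
        # Position of the leftmost nonterminal, if any.
        idx = next((i for i, s in enumerate(form) if s in GRAMMAR), None)
        if idx is None:              # complete sentence
            key = canonical(form)
            if key not in seen:
                seen.add(key)
                yield form
            continue
        for rhs in GRAMMAR[form[idx]]:
            new_form = form[:idx] + tuple(rhs) + form[idx + 1:]
            if len(new_form) <= max_len:
                queue.append(new_form)


if __name__ == "__main__":
    for model in enumerate_models(3):
        print(" ".join(model))
```

With `max_len=3` this toy grammar derives ten sentences, and the cache keeps six of them. For example, `y * x` is skipped because `x * y` was already seen. Plain breadth-first order stands in here for the search heuristics mentioned in the abstract, which the abstract does not specify.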