Discovering the underlying mathematical expressions describing a dataset is a core challenge for artificial intelligence. This is the problem of $\textit{symbolic regression}$. Despite recent advances in training neural networks to solve complex tasks, deep learning approaches to symbolic regression are underexplored. We propose a framework that leverages deep learning for symbolic regression via a simple idea: use a large model to search the space of small models. Specifically, we use a recurrent neural network to emit a distribution over tractable mathematical expressions and employ a novel risk-seeking policy gradient to train the network to generate better-fitting expressions. Our algorithm outperforms several baseline methods (including Eureqa, the gold standard for symbolic regression) in its ability to exactly recover symbolic expressions on a series of benchmark problems, both with and without added noise. More broadly, our contributions include a framework that can be applied to optimize hierarchical, variable-length objects under a black-box performance metric, with the ability to incorporate constraints in situ, and a risk-seeking policy gradient formulation that optimizes for best-case performance instead of expected performance.
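The core of the risk-seeking policy gradient described above is to update the policy using only the top-performing fraction of sampled candidates, with the empirical reward quantile as a baseline, so that the gradient targets best-case rather than average performance. The following is a minimal sketch of that idea, assuming a toy categorical policy in place of the recurrent network and a fixed reward per candidate "expression"; the quantile level and learning rate here are illustrative choices, not values from the paper.

```python
import numpy as np

# Toy stand-in for the RNN policy: a categorical distribution over four
# candidate "expressions", each with a fixed fitness reward (hypothetical).
rewards = np.array([0.1, 0.5, 0.2, 0.9])
logits = np.zeros(4)
rng = np.random.default_rng(0)
eps = 0.5   # risk-seeking level: keep the top eps fraction (exaggerated for the toy)
lr = 0.5

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

for _ in range(200):
    probs = softmax(logits)
    batch = rng.choice(4, size=64, p=probs)   # sample a batch of expressions
    r = rewards[batch]
    r_eps = np.quantile(r, 1 - eps)           # empirical (1 - eps)-quantile of rewards
    keep = r >= r_eps                         # only the best-case samples contribute
    grad = np.zeros(4)
    for a, ri in zip(batch[keep], r[keep]):
        g = -probs                            # d log pi(a) / d logits for a softmax
        g[a] += 1.0
        grad += (ri - r_eps) * g              # quantile acts as the baseline
    logits += lr * grad / max(int(keep.sum()), 1)

best = int(np.argmax(softmax(logits)))        # policy concentrates on the best expression
```

Because samples below the batch quantile are discarded and kept samples are weighted by their advantage over the quantile, the update ignores the bulk of mediocre candidates and steers probability mass toward the highest-reward expression, which is the behavior the abstract contrasts with a standard expected-reward policy gradient.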