In recent years, deep learning techniques have been developed to improve the performance of program synthesis from input-output examples. Albeit its significant progress, the programs that can be synthesized by state-of-the-art approaches are still simple in terms of their complexity. In this work, we move a significant step forward along this direction by proposing a new class of challenging tasks in the domain of program synthesis from input-output examples: learning a context-free parser from pairs of input programs and their parse trees. We show that this class of tasks are much more challenging than previously studied tasks, and the test accuracy of existing approaches is almost 0%. We tackle the challenges by developing three novel techniques inspired by three novel observations, which reveal the key ingredients of using deep learning to synthesize a complex program. First, the use of a non-differentiable machine is the key to effectively restrict the search space. Thus our proposed approach learns a neural program operating a domain-specific non-differentiable machine. Second, recursion is the key to achieve generalizability. Thus, we bake-in the notion of recursion in the design of our non-differentiable machine. Third, reinforcement learning is the key to learn how to operate the non-differentiable machine, but it is also hard to train the model effectively with existing reinforcement learning algorithms from a cold boot. We develop a novel two-phase reinforcement learning-based search algorithm to overcome this issue. In our evaluation, we show that using our novel approach, neural parsing programs can be learned to achieve 100% test accuracy on test inputs that are 500x longer than the training samples.
翻译:近些年来,我们开发了深层次的学习技术来改进从投入输出实例中产生的程序合成的绩效。 尽管它取得了显著的进展, 可以用最先进的方法合成的程序在复杂性方面仍然很简单。 在这项工作中, 我们沿着这个方向迈出了一大步, 在输入输出实例中提出了在程序合成领域具有挑战性的新任务类别: 学习一对输入- 输出程序及其粗糙的树来进行无背景的剖析器。 我们显示, 与以前研究的任务相比, 这一类任务比以往研究的更具有挑战性, 并且现有方法的测试准确性几乎是0 % 。 我们通过三种新发现, 开发三种新颖的方法来应对挑战。 我们开发三种新颖的技术, 揭示了使用深层学习来合成复杂程序的关键要素。 首先, 使用非差异机器是有效限制搜索空间的关键。 因此, 我们的拟议方法学习了一个神经程序, 运行一个特定领域、 不可区别的机器。 第二, 重新思考是实现普遍性的关键。 因此, 我们用循环的观念来重新思考, 来重新思考, 在设计我们强化的升级的系统中, 学习一个不易变换的系统, 学习, 学习一个不易的系统测试, 是如何学习, 学习。