Library learning compresses a given corpus of programs by extracting common structure from the corpus into reusable library functions. Prior work on library learning suffers from two limitations that prevent it from scaling to larger, more complex inputs. First, it explores too many candidate library functions that are not useful for compression. Second, it is not robust to syntactic variation in the input. We propose library learning modulo theory (LLMT), a new library learning algorithm that additionally takes as input an equational theory for a given problem domain. LLMT uses e-graphs and equality saturation to compactly represent the space of programs equivalent modulo the theory, and uses a novel e-graph anti-unification technique to find common patterns in the corpus more directly and efficiently. We implemented LLMT in a tool named BABBLE. Our evaluation shows that BABBLE achieves better compression orders of magnitude faster than the state of the art. We also provide a qualitative evaluation showing that BABBLE learns reusable functions on inputs previously out of reach for library learning.
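To make the idea of anti-unification concrete, the following is a minimal sketch of plain syntactic anti-unification over two terms: it finds the most specific pattern that both terms are instances of. This is not BABBLE's e-graph anti-unification, which works over whole equivalence classes of programs. The term encoding (nested tuples), the function name `anti_unify`, and the `?xN` variable naming are assumptions made for illustration only.

```python
from itertools import count


def anti_unify(s, t, holes=None, fresh=None):
    """Return a most specific common pattern of terms s and t.

    Assumed encoding: a term is an atom (str or int) or a tuple
    (operator, arg1, ..., argn). Pattern variables are strings "?x0", "?x1", ...
    """
    if holes is None:
        holes = {}        # maps a (subterm of s, subterm of t) pair to its variable
    if fresh is None:
        fresh = count()   # supplies fresh variable indices

    # Identical subterms generalize to themselves.
    if s == t:
        return s

    # Same operator and arity: keep the operator and recurse into the arguments.
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and s[0] == t[0] and len(s) == len(t)):
        args = tuple(anti_unify(a, b, holes, fresh) for a, b in zip(s[1:], t[1:]))
        return (s[0],) + args

    # Mismatch: abstract it with a variable, reusing the same variable
    # whenever the same pair of differing subterms shows up again.
    if (s, t) not in holes:
        holes[(s, t)] = f"?x{next(fresh)}"
    return holes[(s, t)]


# Two hypothetical corpus programs that differ only in the scale factor.
p1 = ("scale", 2, ("circle", 1))
p2 = ("scale", 5, ("circle", 1))
print(anti_unify(p1, p2))  # ('scale', '?x0', ('circle', 1))
```

The resulting pattern can be read as a candidate library function whose parameters are the pattern variables. Because this sketch compares syntax only, it would miss a shared pattern between terms that are equal under the domain theory but written differently, which is the gap that representing the corpus in an e-graph is meant to close.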