We target the problem of synthesizing proofs of semantic equivalence between two programs composed of sequences of statements with complex symbolic expressions. We propose a neural network architecture based on the transformer to generate axiomatic proofs of equivalence between program pairs. We generate expressions that include scalars and vectors, and we support multi-typed rewrite rules to prove equivalence. To train the system, we develop an original training technique, which we call self-supervised sample selection. This incremental training improves the quality, generalizability, and extensibility of the learned model. We study the effectiveness of the system at generating proofs of increasing length, and we demonstrate how transformer models learn to represent complex and verifiable symbolic reasoning. Our system, S4Eq, achieves 97% proof success on 10,000 pairs of programs while ensuring zero false positives by design.
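To make the notion of an axiomatic equivalence proof concrete, the following minimal sketch shows one way such a proof can be replayed and checked: a proof is a sequence of rewrite-rule applications that transforms one expression into the other, and a proof is accepted only if every step matches, which is what rules out false positives by construction. The rule names, expression encoding, and example program pair here are invented for illustration; they are not S4Eq's actual rule set or implementation.

```python
# Toy proof checker for rewrite-rule equivalence proofs (illustrative only).
# Expressions are nested tuples: ("*", ("+", "a", "b"), 2) means (a + b) * 2.

REWRITE_RULES = {
    # distribute multiplication over addition: (x + y) * c -> x*c + y*c
    "distribute": lambda e: ("+", ("*", e[1][1], e[2]), ("*", e[1][2], e[2]))
        if isinstance(e, tuple) and e[0] == "*"
        and isinstance(e[1], tuple) and e[1][0] == "+" else None,
    # commute multiplication: x * y -> y * x
    "commute_mul": lambda e: ("*", e[2], e[1])
        if isinstance(e, tuple) and e[0] == "*" else None,
}

def apply_rule(expr, rule, path):
    """Apply `rule` at the subexpression addressed by `path` (a list of child
    indices). Return the rewritten expression, or None if the rule does not match."""
    if not path:
        return REWRITE_RULES[rule](expr)
    i = path[0]
    child = apply_rule(expr[i], rule, path[1:])
    if child is None:
        return None
    return expr[:i] + (child,) + expr[i + 1:]

def check_proof(lhs, rhs, proof):
    """Replay a proof (a list of (rule, path) steps) starting from `lhs`.
    The proof is valid only if every step applies and the result equals `rhs`,
    so an accepted proof cannot be a false positive."""
    current = lhs
    for rule, path in proof:
        current = apply_rule(current, rule, path)
        if current is None:
            return False
    return current == rhs

if __name__ == "__main__":
    lhs = ("*", ("+", "a", "b"), 2)            # (a + b) * 2
    rhs = ("+", ("*", 2, "a"), ("*", 2, "b"))  # 2*a + 2*b
    proof = [("distribute", []),               # -> a*2 + b*2
             ("commute_mul", [1]),             # -> 2*a + b*2
             ("commute_mul", [2])]             # -> 2*a + 2*b
    print(check_proof(lhs, rhs, proof))        # True
```

In this framing, the role of the neural model is to propose the step sequence (the `proof` list), while an independent checker of this kind validates every step, so model errors can only cause missed proofs, never incorrect ones.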