The field of software verification has produced a wide array of algorithmic techniques that can prove a variety of properties of a given program. The performance of these techniques has been shown to vary by up to 4 orders of magnitude on the same verification problem. Even for verification experts, it is difficult to decide which tool will perform best on a given problem. For general users, choosing the best tool for their verification problem is effectively impossible. In this work, we present Graves, a selection strategy based on graph neural networks (GNNs). Graves generates a graph representation of a program, from which a GNN predicts a score for a verifier that indicates its performance on the program. We evaluate Graves on a set of 10 verification tools and over 8000 verification problems and find that it improves the state of the art in verification algorithm selection by 11\%. We conjecture this is in part due to Graves' use of GNNs with attention mechanisms. Through a qualitative study on model interpretability, we find strong evidence that Graves' GNN-based model learns to base its predictions on factors that relate to the unique features of the algorithmic techniques.
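As a rough illustration of the pipeline described above (program graph, then GNN, then a per-verifier score, then selection), the following is a minimal sketch and not the Graves implementation. It assumes a toy graph with hand-made node features, substitutes plain mean-neighbor aggregation for the attention-based GNN layers mentioned in the abstract, and uses hypothetical names throughout (`message_pass`, `graph_embedding`, `score_verifiers`, `select_verifier`, `verifier_a`, `verifier_b`) with made-up weights rather than trained ones.

```python
from typing import Dict, List, Tuple

# Hypothetical toy program graph: per-node feature vectors plus directed edges
# (standing in for, e.g., syntax-tree or control-flow edges).
Graph = Tuple[List[List[float]], List[Tuple[int, int]]]


def message_pass(features: List[List[float]],
                 edges: List[Tuple[int, int]]) -> List[List[float]]:
    """One round of mean-neighbor aggregation.

    This is a simplified stand-in for a GNN layer; it is NOT an attention layer.
    """
    dim = len(features[0])
    sums = [[0.0] * dim for _ in features]
    counts = [0] * len(features)
    for src, dst in edges:
        for k in range(dim):
            sums[dst][k] += features[src][k]
        counts[dst] += 1
    updated = []
    for i, own in enumerate(features):
        if counts[i] == 0:
            # No incoming edges: keep the node's own features unchanged.
            updated.append(list(own))
        else:
            # Average the node's own features with the mean of its in-neighbors.
            updated.append([(own[k] + sums[i][k] / counts[i]) / 2
                            for k in range(dim)])
    return updated


def graph_embedding(features: List[List[float]]) -> List[float]:
    """Mean-pool node features into a single vector for the whole program."""
    n = len(features)
    return [sum(node[k] for node in features) / n
            for k in range(len(features[0]))]


def score_verifiers(embedding: List[float],
                    weights: Dict[str, List[float]]) -> Dict[str, float]:
    """Linear scoring head: one (hypothetical) weight vector per verifier."""
    return {name: sum(w * x for w, x in zip(ws, embedding))
            for name, ws in weights.items()}


def select_verifier(graph: Graph, weights: Dict[str, List[float]]) -> str:
    """Pick the verifier with the highest predicted score."""
    features, edges = graph
    embedding = graph_embedding(message_pass(features, edges))
    scores = score_verifiers(embedding, weights)
    return max(scores, key=scores.get)


# Illustrative data only: three nodes, two edges, two imaginary verifiers.
toy_graph: Graph = ([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [(0, 1), (1, 2)])
toy_weights = {"verifier_a": [1.0, 0.0], "verifier_b": [0.0, 1.0]}
chosen = select_verifier(toy_graph, toy_weights)
```

In a real selector, the scoring weights would be learned from measured verifier performance, and the aggregation step would be replaced by trained GNN layers; the sketch only shows how a graph-level embedding can be turned into a ranking over tools.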