Data-Driven Abductive Inference of Library Specifications

Programmers often leverage data structure libraries that provide useful and reusable abstractions. Modular verification of programs that use these libraries naturally relies on specifications that capture important properties of how the library expects its data structures to be accessed and manipulated. However, these specifications are often missing or incomplete, making it hard for clients to be confident that they are using the library safely. When library source code is also unavailable, as is often the case, inferring meaningful specifications becomes even harder. In this paper, we present a novel data-driven abductive inference mechanism that infers specifications for library methods sufficient to enable verification of the library's clients. Our technique combines a data-driven, learning-based framework that postulates candidate specifications with SMT-provided counterexamples that refine these candidates, taking special care to avoid generating specifications that overfit to sampled tests. The resulting specifications form a minimal set of requirements on the behavior of library implementations that ensures the safety of a particular client program. Our solution thus provides a new multi-abduction procedure for precise specification inference of data structure libraries, guided by client-side verification tasks. Experimental results on a wide range of realistic OCaml data structure programs demonstrate the effectiveness of the approach.
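
To make the loop described above concrete, here is a minimal sketch of counterexample-guided specification inference for a single black-box method. It is an illustration under stated assumptions, not the paper's algorithm: the library method `lib_abs`, the client condition `client_safe`, the feature set `FEATURES`, and the bounded `DOMAIN` are all hypothetical, and an exhaustive search over a small integer domain stands in for the SMT query. The greedy feature choice also does not attempt the paper's overfitting safeguards or its minimality guarantee.

```python
from itertools import product

# Hypothetical black-box library method: we may run it but not read it.
def lib_abs(x):
    return x if x >= 0 else -x

# Hypothetical client safety condition, stated over the input/output
# pair of one library call.
def client_safe(inp, out):
    return out >= 0 and out >= inp

# Hypothetical feature predicates from which candidate specs are built.
FEATURES = {
    "out >= 0": lambda i, o: o >= 0,
    "out >= inp": lambda i, o: o >= i,
    "out == inp": lambda i, o: o == i,
    "out is even": lambda i, o: o % 2 == 0,
}

# Small bounded domain; exhaustive search here replaces an SMT solver.
DOMAIN = range(-4, 5)

def holds(spec, inp, out):
    """A spec is a conjunction of feature names; the empty spec is True."""
    return all(FEATURES[name](inp, out) for name in spec)

def find_counterexample(spec):
    """Return a pair the spec allows but the client cannot tolerate."""
    for inp, out in product(DOMAIN, DOMAIN):
        if holds(spec, inp, out) and not client_safe(inp, out):
            return (inp, out)
    return None

def infer_spec(samples):
    """Strengthen the weakest spec until the client is provably safe."""
    spec = []
    while True:
        cex = find_counterexample(spec)
        if cex is None:
            return spec
        # Keep features that exclude the counterexample and still agree
        # with every observed behaviour of the library.
        usable = [
            name for name, f in FEATURES.items()
            if name not in spec
            and not f(*cex)
            and all(f(i, o) for i, o in samples)
        ]
        if not usable:
            return None  # the feature set cannot express a suitable spec
        spec.append(usable[0])

# Observed input/output pairs obtained by running the black-box method.
samples = [(x, lib_abs(x)) for x in (-3, -1, 0, 2)]
spec = infer_spec(samples)
print(spec)
```

In this sketch the sampled runs play the role of positive data and each counterexample plays the role of negative data. The inferred conjunction describes only what this client needs from the method, not the method's full behavior.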
