As research on automated bug detection grows and produces new techniques, suitable collections of programs with known bugs become crucial for comparing the effectiveness of these techniques reliably and meaningfully. Most existing approaches rely on benchmarks that collect manually curated real-world bugs or synthetic bugs seeded into real-world programs. Because these benchmarks are built on real-world programs, extending them or creating new ones remains a complex, time-consuming task. In this paper, we propose a complementary approach that automatically generates programs with seeded bugs. Our technique, called HyperPUT, builds C programs from a "seed" bug by incrementally applying program transformations (which introduce programming constructs such as conditionals and loops) until a program of the desired size is generated. In our experimental evaluation, we demonstrate that HyperPUT can generate buggy programs that challenge the capabilities of modern bug-finding tools in different ways, and some of whose characteristics are comparable to those of bugs in existing benchmarks. These results suggest that HyperPUT can be a useful tool to support further research in bug-finding techniques, in particular their empirical evaluation.
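To make the idea of growing a program around a "seed" bug more concrete, the following is a minimal hand-written sketch in C. It is not HyperPUT's output or implementation: the function names (`seed_bug`, `step1_conditional`, `step2_loop`), the specific input checks, and the use of a return value in place of a real crash are all assumptions made for illustration. Each step wraps the previous one in one more programming construct, so that reaching the bug requires an input satisfying every added condition.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Step 0: hypothetical "seed" bug. A real seed would typically crash
   (for example by calling abort()); this sketch returns 1 instead so
   that it is safe to run. */
static int seed_bug(void) {
    return 1;
}

/* Step 1: an assumed conditional transformation. The seed is reached
   only when the first input character is 'b'. */
static int step1_conditional(const char *input) {
    assert(input != NULL);
    if (input[0] == 'b') {
        return seed_bug();
    }
    return 0;
}

/* Step 2: an assumed loop transformation wrapped around step 1. The
   loop counts 'a' characters, and the previous program is reached
   only when the count is exactly 3. */
static int step2_loop(const char *input) {
    assert(input != NULL);
    size_t count = 0;
    size_t len = strlen(input);
    for (size_t i = 0; i < len; i++) {
        if (input[i] == 'a') {
            count++;
        }
    }
    if (count == 3) {
        return step1_conditional(input);
    }
    return 0;
}
```

In this sketch, `step2_loop("baaa")` reaches the seed, while `step2_loop("aaab")` and `step2_loop("baa")` do not. A bug-finding tool would therefore need to discover an input that satisfies both the loop condition and the conditional, which illustrates how stacking transformations could make a seeded bug progressively harder to trigger.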