This work focuses on effectively generating diverse solutions for satisfiability modulo theories (SMT) formulas, targeting the theories of bit-vectors, arrays, and uninterpreted functions, which is a critical task in software and hardware testing. Generating diverse SMT solutions helps uncover faults and detect safety violations during the verification and testing process, resulting in the SMT sampling problem, i.e., constructing a small number of solutions while achieving comprehensive coverage of the constraint space. While high coverage is crucial for exploring system behaviors, reducing the number of solutions is of great importance, as excessive solutions increase testing time and resource usage, undermining efficiency. In this work, we introduce PanSampler, a novel SMT sampler that achieves high coverage with a small number of solutions. It incorporates three novel techniques, i.e., diversity-aware SMT algorithm, abstract syntax tree (AST)-guided scoring function and post-sampling optimization technology, enhancing its practical performance. It iteratively samples solutions, evaluates candidates, and employs local search to refine solutions, ensuring high coverage with a small number of samples. Extensive experiments on practical benchmarks demonstrate that PanSampler exhibits a significantly stronger capability to reach high target coverage, while requiring fewer solutions than current samplers to achieve the same coverage level. Furthermore, our empirical evaluation on practical subjects, which are collected from real-world software systems, shows that PanSampler achieves higher fault detection capability and reduces the number of required test cases from 32.6\% to 76.4\% to reach the same fault detection effectiveness, leading to a substantial improvement in testing efficiency. PanSampler advances SMT sampling, reducing the cost of software testing and hardware verification.
翻译:本研究聚焦于为可满足性模理论(SMT)公式高效生成多样化解,主要针对位向量、数组及未解释函数理论,这是软件与硬件测试中的关键任务。生成多样化的SMT解有助于在验证与测试过程中发现故障并检测安全违规,从而形成SMT采样问题,即在实现约束空间全面覆盖的同时构建少量解。虽然高覆盖率对于探索系统行为至关重要,但减少解的数量同样具有重要意义,因为过多的解会增加测试时间与资源消耗,降低效率。本文提出PanSampler——一种新型SMT采样器,能够以少量解实现高覆盖率。它融合了三种创新技术:多样性感知的SMT算法、抽象语法树(AST)引导的评分函数以及后采样优化技术,从而提升了实际性能。该方法通过迭代采样解、评估候选解并采用局部搜索优化解,确保以少量样本实现高覆盖率。在实用基准测试上的大量实验表明,PanSampler在达到相同覆盖率水平时,相较于现有采样器,展现出显著更强的目标覆盖率达成能力,且所需解的数量更少。此外,基于从实际软件系统收集的实证评估显示,PanSampler具备更高的故障检测能力,在达到相同故障检测效果时,所需测试用例数量减少了32.6%至76.4%,从而大幅提升了测试效率。PanSampler推动了SMT采样技术的发展,有效降低了软件测试与硬件验证的成本。