Open source software (OSS) vulnerability management has become an open problem. Vulnerability databases provide valuable data needed to address OSS vulnerabilities. However, there is growing concern about the information quality of these databases. In particular, the quality of patches in existing vulnerability databases remains unclear. Further, existing manual or heuristic-based approaches for patch identification are either too expensive or too specific to be applied to all OSS vulnerabilities. To address these problems, we first conduct an empirical study to understand the quality and characteristics of patches for OSS vulnerabilities in two state-of-the-art vulnerability databases. Our study covers five dimensions, i.e., the coverage, consistency, type, cardinality and accuracy of patches. Then, inspired by our study, we propose the first automated approach, named TRACER, to find patches for an OSS vulnerability from multiple sources. Our key idea is that patch commits will be frequently referenced during the reporting, discussion and resolution of an OSS vulnerability. Our extensive evaluation indicates that i) TRACER finds patches for up to 273.8% more CVEs than existing heuristic-based approaches while achieving an F1-score that is higher by up to 116.8%; and ii) TRACER achieves a recall that is higher by up to 18.4% than that of state-of-the-art vulnerability databases, at the cost of finding patches for up to 12.0% fewer CVEs and 6.4% lower precision. Our evaluation also demonstrates the generality and usefulness of TRACER.
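To make the key idea concrete, the following is a minimal sketch, not TRACER's actual implementation, of ranking candidate patch commits by how many vulnerability-related documents reference them. The function name `rank_commit_references`, the GitHub-style URL pattern, and the rule of counting each commit once per document are all illustrative assumptions.

```python
import re
from collections import Counter

# Hypothetical pattern for GitHub-style commit URLs (an assumption for this
# sketch; real references come in many other forms).
COMMIT_URL = re.compile(
    r"https://github\.com/([\w.-]+/[\w.-]+)/commit/([0-9a-f]{7,40})"
)


def rank_commit_references(documents):
    """Rank (repository, commit hash) pairs by the number of documents citing them."""
    counts = Counter()
    for text in documents:
        # A set ensures a commit mentioned several times in one document
        # contributes only one reference for that document.
        seen = set(COMMIT_URL.findall(text))
        counts.update(seen)
    # Most frequently referenced commits come first.
    return counts.most_common()
```

Here `documents` could hold the text of an advisory, an issue thread and a mailing-list post for one CVE; the commit cited by most of them would be the strongest patch candidate under this simplified heuristic.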