Assessing the exploitability of software vulnerabilities at the time of disclosure is difficult and error-prone, as features extracted via technical analysis by existing metrics are poor predictors for exploit development. Moreover, exploitability assessments suffer from a class bias because "not exploitable" labels could be inaccurate. To overcome these challenges, we propose a new metric, called Expected Exploitability (EE), which reflects, over time, the likelihood that functional exploits will be developed. Key to our solution is a time-varying view of exploitability, a departure from existing metrics. This allows us to learn EE using data-driven techniques from artifacts published after disclosure, such as technical write-ups and proof-of-concept exploits, for which we design novel feature sets. This view also allows us to investigate the effect of the label biases on the classifiers. We characterize the noise-generating process for exploit prediction, showing that our problem is subject to the most challenging type of label noise, and propose techniques to learn EE in the presence of noise. On a dataset of 103,137 vulnerabilities, we show that EE increases precision from 49% to 86% over existing metrics, including two state-of-the-art exploit classifiers, while its precision substantially improves over time. We also highlight the practical utility of EE for predicting imminent exploits and prioritizing critical vulnerabilities. We develop EE into an online platform which is publicly available at https://exploitability.app/.
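To make the idea of a time-varying, data-driven exploitability score concrete, here is a minimal illustrative sketch in Python. It is not the paper's implementation: the toy artifacts, the TF-IDF features, and the sample reweighting used to hedge against unreliable "not exploitable" labels are all hypothetical stand-ins for the paper's novel feature sets and noise-robust learning techniques (scikit-learn is assumed).

```python
# Minimal sketch (assumptions noted above, not the paper's method) of learning
# a time-varying exploitability score from post-disclosure artifacts.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy corpus: each entry concatenates the artifacts (write-ups, PoC code)
# observed for one vulnerability up to some point after disclosure.
artifacts = [
    "stack buffer overflow poc crashes parser controlled return address",
    "heap overflow write-up with working rop chain and shellcode",
    "null pointer dereference causes denial of service only",
    "info leak in logging module low impact",
]
# Noisy labels: 1 = functional exploit observed, 0 = none observed *yet*.
# The 0 class is unreliable (an exploit may still appear later), which is
# the one-sided label-noise problem the abstract refers to.
labels = [1, 1, 0, 0]

# Down-weighting the unreliable negative class is one simple, illustrative
# way to hedge against this noise; the paper proposes its own techniques.
weights = [1.0 if y == 1 else 0.5 for y in labels]

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(artifacts, labels, logisticregression__sample_weight=weights)

# EE is time-varying: re-score a vulnerability as new artifacts appear.
history = [
    "advisory describes out-of-bounds read",                        # day 0
    "advisory describes out-of-bounds read poc crashes parser",     # day 7
    "advisory out-of-bounds read poc rop chain shellcode working",  # day 30
]
for t, doc in enumerate(history):
    ee = model.predict_proba([doc])[0, 1]
    print(f"snapshot {t}: EE = {ee:.2f}")
```

The sketch captures the two points the abstract emphasizes: the score is recomputed whenever new artifacts are published, so it can rise over a vulnerability's lifetime, and the learner must account for the fact that "not exploitable" labels may later be contradicted.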