The number of newly published vulnerabilities is constantly increasing. To date, the information available when a new vulnerability is published has been assessed manually by experts, who assign a Common Vulnerability Scoring System (CVSS) vector and score. This assessment is time-consuming and requires expertise. To enable faster assessment, various works already try to predict CVSS vectors or scores with machine learning based on the textual descriptions of the vulnerability. However, previous works use only the texts available in databases such as the National Vulnerability Database. In this work, the publicly available web pages referenced in the National Vulnerability Database are analyzed and made available as additional sources of text through web scraping. A deep-learning-based method for predicting the CVSS vector is implemented and evaluated. The present work also provides a classification of the National Vulnerability Database's reference texts based on the suitability and crawlability of their texts. Although we found that the overall influence of the additional texts is negligible, our deep learning prediction models outperform the state of the art.