With the increasing prevalence of encrypted network traffic, cyber security analysts have been turning to machine learning (ML) techniques to elucidate the traffic on their networks. However, ML models can become stale as new traffic emerges that lies outside the distribution of the training set. To adapt reliably in this dynamic environment, ML models must also provide contextualized uncertainty quantification for their predictions, a topic that has received little attention in the cyber security domain. Uncertainty quantification is necessary to signal both when the model is uncertain about which class to assign and when the traffic is unlikely to belong to any of the classes seen in training. We present a new, public dataset of network traffic that includes labeled, Virtual Private Network (VPN)-encrypted network traffic generated by 10 applications and corresponding to 5 application categories. We also present an ML framework that is designed to train rapidly with modest data requirements and to provide both calibrated predictive probabilities and an interpretable "out-of-distribution" (OOD) score that flags novel traffic samples. We describe how to calibrate OOD scores using p-values of the relative Mahalanobis distance. We demonstrate that our framework achieves an F1 score of 0.98 on our dataset and that it can extend to an enterprise network by testing the model: (1) on data from similar applications, (2) on dissimilar application traffic from an existing category, and (3) on application traffic from a new category. The model correctly flags uncertain traffic and, upon retraining, accurately incorporates the new data.
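The abstract does not spell out how the p-values are computed, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes the usual definition of the relative Mahalanobis distance (the class-conditional distance under a shared covariance minus the distance to a single background Gaussian fitted to all training features) and converts the resulting score into an empirical p-value against held-out in-distribution scores. The function names `fit_rmd`, `rmd_score`, and `empirical_p_value`, the small ridge term, and the synthetic features are all hypothetical.

```python
import numpy as np


def fit_rmd(X, y):
    """Fit class means, a shared covariance, and a background Gaussian."""
    classes = np.unique(y)
    means = np.stack([X[y == c].mean(axis=0) for c in classes])
    # Shared covariance: pool the class-centered features of all classes.
    centered = X - means[np.searchsorted(classes, y)]
    ridge = 1e-6 * np.eye(X.shape[1])  # assumption: small ridge for stability
    shared_prec = np.linalg.inv(centered.T @ centered / len(X) + ridge)
    # Background Gaussian: one mean and covariance for all data, labels ignored.
    bg_mean = X.mean(axis=0)
    bg_centered = X - bg_mean
    bg_prec = np.linalg.inv(bg_centered.T @ bg_centered / len(X) + ridge)
    return means, shared_prec, bg_mean, bg_prec


def _sq_mahalanobis(X, mean, prec):
    """Squared Mahalanobis distance of each row of X to `mean`."""
    diff = X - mean
    return np.einsum("ij,jk,ik->i", diff, prec, diff)


def rmd_score(X, params):
    """OOD score: smallest class distance relative to the background distance."""
    means, shared_prec, bg_mean, bg_prec = params
    md_bg = _sq_mahalanobis(X, bg_mean, bg_prec)
    md_cls = np.stack(
        [_sq_mahalanobis(X, m, shared_prec) for m in means], axis=1
    )
    return (md_cls - md_bg[:, None]).min(axis=1)


def empirical_p_value(scores, calib_scores):
    """Fraction of held-out in-distribution scores at least as large as each score."""
    calib = np.sort(calib_scores)
    n_ge = len(calib) - np.searchsorted(calib, scores, side="left")
    # The +1 terms keep every p-value strictly positive.
    return (1 + n_ge) / (len(calib) + 1)


# Synthetic stand-in for traffic features (hypothetical, not the paper's data).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (200, 4)), rng.normal(5, 1, (200, 4))])
y = np.repeat([0, 1], 200)

params = fit_rmd(X[::2], y[::2])          # fit on half of the data
calib = rmd_score(X[1::2], params)        # calibrate on the held-out half
novel = rng.normal(30, 1, (5, 4))         # samples far from both classes
p_values = empirical_p_value(rmd_score(novel, params), calib)
```

Under this reading, a small p-value means that few held-out in-distribution samples look as unusual as the new sample, so a threshold on the p-value (for example 0.05, again an assumption) could serve as the interpretable flag for novel traffic.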