Maintaining security and privacy in real-world enterprise networks is becoming ever more challenging. Cyber actors increasingly employ previously unreported, state-of-the-art techniques to break into corporate networks. To develop novel and effective methods to thwart these sophisticated cyberattacks, we need datasets that accurately reflect real-world enterprise scenarios. However, very few such datasets are publicly available. Researchers still predominantly use the decade-old KDD datasets, even though studies have shown that these datasets do not adequately reflect modern attacks such as Advanced Persistent Threats (APTs). In this work, we analyze the usefulness of the recently introduced DARPA Operationally Transparent Cyber (OpTC) dataset in this regard. We describe the content of the dataset in detail and present a qualitative analysis. We show that the OpTC dataset is an excellent candidate for advanced cyber threat detection research, while also highlighting its limitations. Additionally, we propose several research directions in which this dataset can be useful.