Following the current big data trend, the volume of real-time system call traces generated by Linux applications in a contemporary data center can grow very large. Traditional host-based intrusion detection systems, deployed separately on each host, scale poorly and therefore struggle to collect, maintain, and process these large accumulated traces. Building data mining models on a single physical host with fixed computing capability and limited storage capacity is inflexible. To address this issue, we propose SCADS, a solution built on Apache Spark in the Google cloud environment. A set of Spark algorithms is developed to achieve computational scalability. The experimental results demonstrate that the efficiency of intrusion detection can be improved, which indicates that the proposed method is applicable to the design of next-generation host-based intrusion detection systems that use system calls.