We address the problems of identifying malware in network telemetry logs and providing \emph{indicators of compromise} -- comprehensible explanations of behavioral patterns that identify the threat. In a first step, an array of specialized detectors abstracts network-flow data into comprehensible \emph{network events}. We develop a neural network that processes the resulting sequence of events and identifies specific threats, malware families, and broad categories of malware. We then use the \emph{integrated-gradients} method to highlight events that jointly constitute the characteristic behavioral pattern of the threat. We compare network architectures based on CNNs, LSTMs, and transformers, and experimentally explore the efficacy of unsupervised pre-training on large-scale telemetry data. We demonstrate how this system detects njRAT and other malware based on behavioral patterns.
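As an illustration of the integrated-gradients step, the following is a minimal sketch rather than the system described above. It assumes a toy stand-in classifier (a sigmoid over summed event embeddings with a hypothetical weight vector `w`), an all-zeros baseline, and a midpoint Riemann sum for the path integral. All function and variable names are illustrative assumptions.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def score(x, w):
    """Toy stand-in classifier (an assumption, not the paper's network).

    x: (T, D) array of event embeddings, w: (D,) weight vector.
    Returns a scalar threat score in (0, 1).
    """
    return sigmoid(np.sum(x @ w))


def score_grad(x, w):
    """Analytic gradient of score with respect to x, shape (T, D)."""
    s = score(x, w)
    return s * (1.0 - s) * np.tile(w, (x.shape[0], 1))


def integrated_gradients(x, baseline, w, steps=200):
    """Approximate integrated gradients with a midpoint Riemann sum.

    IG = (x - baseline) * average gradient along the straight path
    from baseline to x.
    """
    alphas = (np.arange(steps) + 0.5) / steps
    avg_grad = np.zeros_like(x)
    for a in alphas:
        avg_grad += score_grad(baseline + a * (x - baseline), w)
    avg_grad /= steps
    return (x - baseline) * avg_grad


def event_attributions(x, baseline, w, steps=200):
    """Sum attributions over embedding dimensions: one value per event."""
    return integrated_gradients(x, baseline, w, steps).sum(axis=1)


# Hypothetical input: three events with two-dimensional embeddings.
x = np.array([[1.0, 0.0], [0.0, 2.0], [0.5, 0.5]])
w = np.array([0.3, -0.2])
baseline = np.zeros_like(x)

attr = event_attributions(x, baseline, w)
# Events with the largest positive attribution push the score up the most.
ranking = np.argsort(-attr)
```

Integrated gradients satisfies a completeness property: the attributions sum to the difference between the score at the input and the score at the baseline. That makes it a convenient sanity check for any implementation, including this sketch.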