Many users implicitly assume that software can only be exploited after it is installed. However, recent supply-chain attacks demonstrate that application integrity must be ensured during installation itself. We introduce SIGL, a new tool for detecting malicious behavior during software installation. SIGL collects traces of system call activity, building a data provenance graph that it analyzes using a novel autoencoder architecture with a graph long short-term memory network (graph LSTM) for the encoder and a standard multilayer perceptron for the decoder. SIGL flags suspicious installations as well as the specific installation-time processes that are likely to be malicious. Using a test corpus of 625 malicious installers containing real-world malware, we demonstrate that SIGL has a detection accuracy of 96%, outperforming similar systems from industry and academia by up to 87% in precision and recall and 45% in accuracy. We also demonstrate that SIGL can pinpoint the processes most likely to have triggered malicious behavior, works on different audit platforms and operating systems, and is robust to training data contamination and adversarial attack. It can be used with application-specific models, even in the presence of new software versions, as well as application-agnostic meta-models that encompass a wide range of applications and installers.
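The pipeline described above (system call traces, then a provenance graph, then per-process anomaly scores) can be illustrated with a minimal sketch. This is not SIGL's implementation: the trace record layout, the names `build_provenance_graph` and `rank_by_error`, and the placeholder error scores are all assumptions made for illustration. The graph LSTM autoencoder is not modeled here. The sketch only assumes that some model yields a reconstruction error per node, and that higher error means more suspicious.

```python
from collections import defaultdict

# Hypothetical trace records: (subject process, system call, object).
# The names and the record layout are illustrative assumptions only.
TRACE = [
    ("installer.exe", "write", "C:/tmp/setup.dll"),
    ("installer.exe", "fork", "helper.exe"),
    ("helper.exe", "read", "C:/tmp/setup.dll"),
    ("helper.exe", "connect", "203.0.113.7:443"),
]


def build_provenance_graph(trace):
    """Return {source: [(syscall, target), ...]} with edges along data flow."""
    graph = defaultdict(list)
    for subject, syscall, obj in trace:
        if syscall == "read":
            # A read moves data from the object into the process.
            graph[obj].append((syscall, subject))
        else:
            # Other calls are treated as flowing from the process outward.
            graph[subject].append((syscall, obj))
    return dict(graph)


def rank_by_error(errors, top_k=1):
    """Rank nodes by per-node reconstruction error, highest first."""
    return sorted(errors, key=errors.get, reverse=True)[:top_k]


graph = build_provenance_graph(TRACE)

# Placeholder scores standing in for an autoencoder's reconstruction losses.
errors = {"installer.exe": 0.02, "helper.exe": 0.91}
suspects = rank_by_error(errors)
```

In this toy run `suspects` contains only `helper.exe`, mirroring the idea of pointing an analyst at the installation-time processes most likely to be malicious rather than at the installation as a whole.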