Following the celebrated success of deep learning, several methods for detecting malicious PowerShell programs have been proposed: some employ neural networks in a traditional natural language processing setup, while others employ convolutional neural networks to detect obfuscated malicious commands at the character level. While these representations may express salient PowerShell properties, our hypothesis is that tools from static program analysis will be more effective. We propose a hybrid approach that combines traditional program analysis (in the form of abstract syntax trees, ASTs) with deep learning. This poster presents preliminary results for a fundamental step in our approach: learning embeddings for the nodes of PowerShell ASTs. We classify malicious scripts by family type and explore embedded program vector representations.