PowerShell is a powerful and versatile task automation tool. Unfortunately, it is also widely abused by cyber attackers. To bypass malware detection and hinder threat analysis, attackers often employ diverse techniques to obfuscate malicious PowerShell scripts. Existing deobfuscation tools suffer from the limitation of static analysis, which fails to simulate the real deobfuscation process accurately. In this paper, we propose PowerPeeler. To the best of our knowledge, it is the first dynamic PowerShell script deobfuscation approach at the instruction level. It utilizes expression-related Abstract Syntax Tree (AST) nodes to identify potential obfuscated script pieces. Then, PowerPeeler correlates the AST nodes with their corresponding instructions and monitors the script's entire execution process. Subsequently, PowerPeeler dynamically tracks the execution of these instructions and records their execution results. Finally, PowerPeeler stringifies these results to replace the corresponding obfuscated script pieces and reconstruct the deobfuscated script. To evaluate the effectiveness of PowerPeeler, we collect 1,736,669 real-world malicious PowerShell samples with diversity obfuscation methods. We compare PowerPeeler with five state-of-the-art deobfuscation tools and GPT-4. The evaluation results demonstrate that PowerPeeler can effectively handle all well-known obfuscation methods. Additionally, the deobfuscation correctness rate of PowerPeeler reaches 95%, significantly surpassing that of other tools. PowerPeeler not only recovers the highest amount of sensitive data but also maintains a semantic consistency over 97%, which is also the best. Moreover, PowerPeeler effectively obtains the largest quantity of valid deobfuscated results within a limited time frame. Furthermore, PowerPeeler is extendable and can be used as a helpful tool for other cyber security solutions.
翻译:暂无翻译