In malware behavioral analysis, the list of files that a sample accesses and creates often indicates whether the sample is malicious or benign. However, malware authors try to evade detection by generating random filenames, by changing the filenames used from one version of the malware to the next, or both. These changes are real-world adversarial examples. The goal of this work is to generate realistic adversarial examples and to improve the classifier's robustness against such attacks. Our approach learns latent representations of input strings in an unsupervised fashion and applies gradient-based adversarial attack methods in the latent domain to generate adversarial examples in the input domain. We then improve the classifier's robustness by training it on the generated set of adversarial strings. Compared to classifiers trained only on perturbed latent vectors, our approach produces classifiers that are significantly more robust without a large trade-off in standard accuracy.
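The pipeline described above (encode a string, perturb its latent vector along the loss gradient, map the result back to a string, then retrain on it) can be illustrated with a minimal toy sketch. Everything in it is an assumption made for illustration and not the authors' implementation: `encode` is a fixed character-count embedding standing in for the learned unsupervised representation, `decode` is a nearest-neighbour lookup over a hypothetical candidate pool standing in for a learned decoder, the classifier is a small logistic regression, and the attack is a single FGSM-style sign step. All filenames are made up.

```python
import math

ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789._"

def encode(name):
    """Toy stand-in for a learned encoder: L2-normalized character counts."""
    counts = [0.0] * len(ALPHABET)
    for ch in name.lower():
        i = ALPHABET.find(ch)
        if i >= 0:
            counts[i] += 1.0
    norm = math.sqrt(sum(c * c for c in counts)) or 1.0
    return [c / norm for c in counts]

def decode(z, candidates):
    """Toy stand-in for a decoder: the candidate string closest to z in latent space."""
    def dist(name):
        return sum((a - b) ** 2 for a, b in zip(encode(name), z))
    return min(candidates, key=dist)

def predict(w, b, z):
    """Logistic-regression probability that the latent vector is malicious."""
    logit = sum(wi * zi for wi, zi in zip(w, z)) + b
    return 1.0 / (1.0 + math.exp(-logit))

def loss(w, b, z, y):
    """Binary cross-entropy of the classifier on one latent vector."""
    p = min(max(predict(w, b, z), 1e-12), 1.0 - 1e-12)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def train(samples, epochs=200, lr=0.5):
    """Plain SGD on (string, label) pairs; the classifier sees latent vectors."""
    w, b = [0.0] * len(ALPHABET), 0.0
    for _ in range(epochs):
        for name, y in samples:
            z = encode(name)
            err = predict(w, b, z) - y  # d(loss)/d(logit)
            w = [wi - lr * err * zi for wi, zi in zip(w, z)]
            b -= lr * err
    return w, b

def fgsm_latent(w, b, z, y, eps):
    """One FGSM-style step in latent space: z + eps * sign(d(loss)/dz)."""
    err = predict(w, b, z) - y  # d(loss)/dz = err * w for this model
    sign = lambda g: (g > 0) - (g < 0)
    return [zi + eps * sign(err * wi) for zi, wi in zip(z, w)]

# Hypothetical data: label 1 = malicious, 0 = benign.
train_set = [("svchost_update.exe", 1), ("dropper.tmp", 1),
             ("report.docx", 0), ("notes.txt", 0)]
candidates = ["dr0pper.tmp", "x7f3a9.tmp", "svch0st_upd4te.exe",
              "report_v2.docx", "notes_old.txt"]

w, b = train(train_set)
z = encode("dropper.tmp")
z_adv = fgsm_latent(w, b, z, y=1, eps=0.1)  # perturb in the latent domain
adv_name = decode(z_adv, candidates)        # map back to the input domain

# Adversarial training: retrain on the original set plus the decoded string.
augmented = train_set + [(adv_name, 1)]
w_robust, b_robust = train(augmented)
print(adv_name, loss(w, b, z, 1), loss(w, b, z_adv, 1))
```

The point of decoding before retraining is that the classifier is then trained on strings an attacker could actually produce, not on latent vectors that may correspond to no valid string. A learned encoder and decoder and an automatic-differentiation framework would replace the hand-written pieces in a realistic setting.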