Certified defenses, a recent development in adversarial machine learning (ML), aim to rigorously guarantee the robustness of ML models to adversarial perturbations. A large body of work studies certified defenses in computer vision, where $\ell_p$ norm-bounded evasion attacks are adopted as a tractable threat model. However, this threat model has known limitations in vision and does not apply to other domains, e.g., those where inputs may be discrete or subject to complex constraints. Motivated by this gap, we study certified defenses for malware detection, a domain where attacks against ML-based systems are a real and current threat. We consider static malware detection systems that operate on byte-level data. Our certified defense is based on randomized smoothing, which we adapt by: (1) replacing the standard Gaussian randomization scheme with a novel deletion randomization scheme that operates on bytes or chunks of an executable; and (2) deriving a certificate that measures robustness to evasion attacks in terms of generalized edit distance. To assess the size of robustness certificates that are achievable while maintaining high accuracy, we conduct experiments on malware datasets using a popular convolutional malware detection model, MalConv. We accurately classify 91% of the inputs while being certifiably robust to any adversarial perturbation with an edit distance of 128 bytes or less. By comparison, an existing certification of up to 128 bytes of substitutions (without insertions or deletions) achieves an accuracy of 78%. In addition, given that robustness certificates are conservative, we evaluate practical robustness to several recently published evasion attacks and, in some cases, find robustness beyond certified guarantees.
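To make the deletion randomization idea concrete, the following is a minimal sketch of byte-level deletion smoothing, not the implementation evaluated above. It assumes that each byte is deleted independently with a fixed probability and that the smoothed prediction is a majority vote over the base classifier's outputs on the perturbed copies. The names `delete_bytes`, `smoothed_predict`, `toy_classifier`, and the parameters `p_del` and `n_samples` are hypothetical; the toy classifier merely stands in for a real detector such as MalConv, and the edit distance certificate itself is not reproduced here.

```python
import random
from collections import Counter
from typing import Callable, Tuple


def delete_bytes(data: bytes, p_del: float, rng: random.Random) -> bytes:
    """Drop each byte independently with probability p_del (assumed scheme)."""
    return bytes(b for b in data if rng.random() >= p_del)


def smoothed_predict(
    classify: Callable[[bytes], int],
    data: bytes,
    p_del: float = 0.9,
    n_samples: int = 100,
    seed: int = 0,
) -> Tuple[int, float]:
    """Majority vote of the base classifier over randomly deleted copies.

    Returns the winning label and the fraction of samples that voted for it.
    """
    rng = random.Random(seed)
    votes = Counter(
        classify(delete_bytes(data, p_del, rng)) for _ in range(n_samples)
    )
    label, count = votes.most_common(1)[0]
    return label, count / n_samples


def toy_classifier(data: bytes) -> int:
    """Hypothetical stand-in for a real detector: flags inputs containing 0xCC."""
    return int(0xCC in data)


if __name__ == "__main__":
    sample = b"\xcc" * 1000
    print(smoothed_predict(toy_classifier, sample, p_del=0.5))
```

In a sketch like this, the vote fraction is the quantity from which a robustness certificate would be derived: intuitively, an input whose prediction survives heavy random deletion is less sensitive to a small number of inserted, deleted, or substituted bytes.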