As advances in Deep Neural Networks (DNNs) demonstrate unprecedented levels of performance in many critical applications, their vulnerability to attacks remains an open question. We consider evasion attacks at test time against deep learning models in constrained environments, in which dependencies between features must be satisfied. These situations arise naturally in tabular data, or as a result of feature engineering in specific application domains, such as threat detection in cyber security. We propose FENCE, a general iterative gradient-based framework for crafting evasion attacks that accounts for the specifics of constrained domains and application requirements. We apply it against Feed-Forward Neural Networks trained for two cyber security applications, botnet network traffic classification and malicious domain classification, to generate feasible adversarial examples. We extensively evaluate the success rate and performance of our attacks, compare them against several baselines, and analyze the factors that impact attack success, including the optimization objective and data imbalance. We show that, with minimal effort (e.g., generating 12 additional network connections), an attacker can change the model's prediction from the Malicious class to Benign and evade the classifier. We also show that models trained on datasets with higher imbalance are more vulnerable to our FENCE attacks. Finally, we demonstrate the potential of adversarial training in constrained domains to increase model resilience against these evasion attacks.
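To make the attack structure concrete, below is a minimal PyTorch sketch of an iterative gradient-based evasion loop with a constraint-projection step, in the spirit of the framework described above; it is not the FENCE algorithm itself. The model interface, `step_size`, `max_iters`, and the toy `project_fn` (including the feature dependency it enforces) are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn.functional as F

def constrained_evasion_attack(model, x, y_target, project_fn,
                               step_size=0.01, max_iters=100):
    """Iterative gradient-based evasion: perturb the feature vector x
    toward the target class, projecting each iterate back onto the
    feasible set so that feature dependencies remain satisfied."""
    x_adv = x.clone().detach()
    for _ in range(max_iters):
        x_adv.requires_grad_(True)
        logits = model(x_adv.unsqueeze(0))
        if logits.argmax(dim=1).item() == y_target.item():
            break  # classifier already predicts the target class
        loss = F.cross_entropy(logits, y_target.unsqueeze(0))
        (grad,) = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            # Step against the gradient to push the loss toward y_target.
            x_adv = x_adv - step_size * grad.sign()
            # Project back onto the feasible region (hypothetical hook).
            x_adv = project_fn(x_adv)
    return x_adv.detach()

def project_fn(x):
    # Hypothetical constraints for illustration only: all features are
    # non-negative, and feature 2 (e.g., total bytes) must equal
    # feature 0 (packet count) times feature 1 (bytes per packet).
    x = x.clamp(min=0.0)
    x[2] = x[0] * x[1]
    return x
```

The key design point this sketch illustrates is that, unlike unconstrained image-domain attacks, every gradient step is followed by a projection that restores the dependencies among features; the paper's actual projection enforces domain-specific relationships (e.g., among network traffic features), which this toy `project_fn` only gestures at.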