To tackle the susceptibility of deep neural networks to adversarial examples, adversarial training has been proposed, which provides a notion of security through an inner maximization problem that presents first-order adversaries embedded within the outer minimization of the training loss. To generalize adversarial robustness across different perturbation types, adversarial training has been augmented with an improved inner maximization that presents a union of multiple perturbations, e.g., various $\ell_p$ norm-bounded perturbations. However, this improved inner maximization enjoys only limited flexibility in terms of the allowable perturbation types. In this work, through a gating mechanism, we assemble a set of expert networks, each either adversarially trained to deal with a particular perturbation type or normally trained to boost accuracy on clean data. The gating module dynamically assigns weights to each expert to achieve superior accuracy under various data types, e.g., adversarial examples, adverse-weather perturbations, and clean input. To address the obfuscated-gradients issue, the gating module is trained jointly with fine-tuning of the last fully connected layers of the expert networks via adversarial training. Through extensive experiments, we show that our Mixture of Robust Experts (MoRE) approach enables flexible integration of a broad range of robust experts with superior performance.
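The weighted combination performed by the gating module can be illustrated with a minimal NumPy sketch. This is a hypothetical forward pass, not the authors' exact architecture: `more_forward`, the linear expert and gate maps, and the toy dimensions are all illustrative assumptions; in the paper the experts are full (adversarially or normally trained) networks and the gate is a learned module.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax for the gating weights
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def more_forward(x, experts, gate):
    """Hypothetical MoRE-style forward pass.

    experts: list of callables, each mapping an input to class logits
             (e.g., one per perturbation type, plus a clean-data expert)
    gate:    callable mapping the input to one score per expert
    """
    alphas = softmax(gate(x))                   # input-dependent expert weights
    logits = np.stack([f(x) for f in experts])  # (num_experts, num_classes)
    return (alphas[:, None] * logits).sum(axis=0)

# Toy example: three "experts" and a gate realized as random linear maps.
rng = np.random.default_rng(0)
Ws = [rng.standard_normal((10, 4)) for _ in range(3)]
experts = [lambda x, W=W: x @ W for W in Ws]
Wg = rng.standard_normal((10, 3))
gate = lambda x: x @ Wg

x = rng.standard_normal(10)
out = more_forward(x, experts, gate)
print(out.shape)  # per-class logits combined across experts
```

The softmax ensures the expert weights are non-negative and sum to one, so the combined output is a convex combination of the experts' logits.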