In practice, metric analysis on a specific train and test dataset does not guarantee reliable or fair ML models. This is partly because obtaining a balanced, diverse, and perfectly labeled dataset is typically expensive, time-consuming, and error-prone. Rather than relying on a carefully designed test set to assess ML models' failures, fairness, or robustness, this paper proposes Semantic Image Attack (SIA), an adversarial-attack-based method that generates semantic adversarial images to support model diagnosis, interpretability, and robustness. Traditional adversarial training is a popular methodology for robustifying ML models against attacks. However, existing adversarial methods do not combine the two properties that enable the interpretation and analysis of a model's flaws: semantic traceability and perceptual quality. SIA combines both via iterative gradient ascent on a predefined semantic attribute space and on the image space. We illustrate the validity of our approach in three scenarios for keypoint detection and classification. (1) Model diagnosis: SIA generates a histogram of attributes that highlights the semantic vulnerabilities of the ML model (i.e., attributes that make the model fail). (2) Stronger attacks: SIA generates adversarial examples with visually interpretable attributes that achieve higher attack success rates than baseline methods. Adversarial training on SIA examples yields robustness that transfers across different gradient-based attacks. (3) Robustness to imbalanced datasets: we use SIA to augment the underrepresented classes, and this augmentation outperforms strong augmentation and re-balancing baselines.
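The iterative gradient ascent over an attribute space and the image space can be illustrated with a deliberately small toy. This is a sketch under strong assumptions, not the paper's implementation: the "generator" is a hypothetical linear map `G` from attributes to pixels, the model is a logistic classifier with weights `w`, and the step sizes, the attribute range [-1, 1], the pixel budget `eps`, and the attribute names are all made up for illustration. The histogram helper simply counts, per attacked sample, which attribute moved the most.

```python
import math
from collections import Counter

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def render(base, G, a, delta):
    # Hypothetical linear "generator": image = base + G @ a + pixel perturbation.
    return [b + sum(g * aj for g, aj in zip(row, a)) + d
            for b, row, d in zip(base, G, delta)]

def predict(w, x):
    # Toy logistic classifier standing in for the model under attack.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))

def semantic_attack(w, G, base, a0, label, steps=50,
                    lr_attr=0.5, lr_img=0.05, eps=0.1):
    """Gradient ascent on the cross-entropy loss w.r.t. attributes and pixels."""
    a = list(a0)
    delta = [0.0] * len(base)
    for _ in range(steps):
        p = predict(w, render(base, G, a, delta))
        # d(loss)/d(image) for logistic regression with cross-entropy.
        grad_x = [(p - label) * wi for wi in w]
        # Chain rule through the linear generator: d(loss)/d(a) = G^T grad_x.
        grad_a = [sum(G[i][j] * grad_x[i] for i in range(len(G)))
                  for j in range(len(a))]
        # Ascent step on attributes, clipped to the assumed range [-1, 1].
        a = [max(-1.0, min(1.0, aj + lr_attr * g)) for aj, g in zip(a, grad_a)]
        # Ascent step on pixels, clipped to a small budget to limit visible noise.
        delta = [max(-eps, min(eps, d + lr_img * g)) for d, g in zip(delta, grad_x)]
    return a, delta

def attribute_histogram(pairs, names):
    # Count, per attacked sample, the attribute with the largest shift.
    hist = Counter()
    for a0, a_adv in pairs:
        shifts = [abs(x - y) for x, y in zip(a_adv, a0)]
        hist[names[shifts.index(max(shifts))]] += 1
    return hist

# Toy setup: two pixels, two hypothetical attributes.
w = [1.0, -1.0]
G = [[1.0, 0.0], [0.0, 1.0]]
base = [0.5, 0.0]
a0 = [0.0, 0.0]

a_adv, delta = semantic_attack(w, G, base, a0, label=1)
p_clean = predict(w, render(base, G, a0, [0.0, 0.0]))
p_adv = predict(w, render(base, G, a_adv, delta))
hist = attribute_histogram([(a0, a_adv)], ["attr_0", "attr_1"])
```

In this toy, most of the change is carried by the attribute vector while the pixel perturbation stays within a small budget, which is one simple way to picture how an attack can remain semantically traceable. A real setting would replace the linear map with a learned attribute-conditioned generator and compute gradients by automatic differentiation.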