Deep learning models are used for a wide variety of tasks and are prevalent in computer vision, natural language processing, speech recognition, and other areas. While these models perform well in many scenarios, they have been shown to be vulnerable to adversarial attacks. This has led to a proliferation of research into ways such attacks can be detected and/or defended against. Our goal is to explore the contribution of using multiple underlying models for adversarial instance detection. Our paper describes two approaches that incorporate representations from multiple models to detect adversarial examples. We devise controlled experiments to measure the detection impact of incrementally adding models. For many of the scenarios we consider, the results show that detection performance increases with the number of underlying models used to extract representations.
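As a rough illustration of the general idea (a minimal sketch, not the paper's two specific approaches), one way to use multiple underlying models for detection is to concatenate feature representations extracted from several independently trained networks and fit a simple binary detector on the joint representation. The backbone choices and the logistic-regression detector below are assumptions made only for this sketch.

```python
# Sketch: adversarial-example detection from the concatenated representations
# of multiple pretrained models. Backbones and detector are illustrative.
import torch
import torchvision.models as models
from sklearn.linear_model import LogisticRegression

# A small ensemble of pretrained backbones used only as feature extractors.
backbones = [
    models.resnet18(weights="IMAGENET1K_V1"),
    models.resnet34(weights="IMAGENET1K_V1"),
]
for m in backbones:
    m.eval()

def extract_features(x: torch.Tensor) -> torch.Tensor:
    """Concatenate the pooled penultimate-layer features of every backbone."""
    feats = []
    with torch.no_grad():
        for m in backbones:
            # Drop each model's final classification layer to keep features only.
            trunk = torch.nn.Sequential(*list(m.children())[:-1])
            feats.append(trunk(x).flatten(start_dim=1))
    return torch.cat(feats, dim=1)

# Hypothetical usage: `clean_x` and `adv_x` are image batches of shape (N, 3, 224, 224).
# A binary detector is trained to separate clean from adversarial inputs:
#   X = torch.cat([extract_features(clean_x), extract_features(adv_x)]).numpy()
#   y = [0] * len(clean_x) + [1] * len(adv_x)
#   detector = LogisticRegression(max_iter=1000).fit(X, y)
```

Adding further backbones to the list grows the joint representation, which mirrors the experiments described above in which models are added incrementally and detection performance is measured at each step.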