Deep neural networks (DNNs) can achieve high performance when applied to In-Distribution (ID) data, which come from the same distribution as the training set. When presented with anomalous inputs that are not from the ID, the outputs of a DNN should be regarded as meaningless. However, modern DNNs often predict anomalous inputs as an ID class with high confidence, which is dangerous and misleading. In this work, we consider three classes of anomalous inputs: (1) natural inputs from a distribution different from the one the DNN was trained on, known as Out-of-Distribution (OOD) samples; (2) inputs crafted from ID data by attackers, often known as adversarial (AD) samples; and (3) noise (NS) samples generated from meaningless data. We propose a framework that aims to detect all of these anomalies for a pre-trained DNN. Unlike some existing works, our method requires neither preprocessing of the input data nor dependence on any known OOD set or adversarial attack algorithm. Through extensive experiments over a variety of DNN models for the detection of the aforementioned anomalies, we show that in most cases our method outperforms state-of-the-art anomaly detection methods in identifying all three classes of anomalies.
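
The sketch below is not the paper's method; it is only a minimal illustration, under assumed placeholder choices (an untrained linear classifier, 28x28 inputs, a single FGSM step with an assumed budget epsilon = 0.1), of how the three anomaly classes named above can be constructed and of the failure mode being targeted: a plain softmax classifier still assigns such inputs to an ID class, often with high confidence.

```python
# Minimal sketch (not the paper's detection framework) of the three anomaly classes.
# The model, input shape, and epsilon below are placeholder assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Placeholder "pre-trained" classifier; a real one would be trained on the ID set.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
model.eval()

# (1) OOD sample: a natural input drawn from a different distribution than the
#     training set (here simulated by a differently-scaled random image).
ood_sample = torch.rand(1, 1, 28, 28) * 0.5 + 0.5

# (2) AD sample: an ID input perturbed by an attacker, e.g. one FGSM step.
id_sample = torch.rand(1, 1, 28, 28, requires_grad=True)
label = torch.tensor([3])
loss = F.cross_entropy(model(id_sample), label)
loss.backward()
epsilon = 0.1  # assumed attack budget
ad_sample = (id_sample + epsilon * id_sample.grad.sign()).clamp(0, 1).detach()

# (3) NS sample: meaningless noise that carries no class semantics at all.
ns_sample = torch.rand(1, 1, 28, 28)

# A vanilla softmax classifier still maps each anomaly to some ID class,
# often with high confidence -- the behavior the detection framework must flag.
for name, x in [("OOD", ood_sample), ("AD", ad_sample), ("NS", ns_sample)]:
    conf, pred = F.softmax(model(x), dim=1).max(dim=1)
    print(f"{name}: predicted class {pred.item()} with confidence {conf.item():.2f}")
```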