The problem of detecting a novel class at run time is known as Open Set Detection and is important for real-world applications such as medical applications and autonomous driving. In the context of deep learning, Open Set Detection involves solving two problems: (i) mapping the input images into a latent representation that contains enough information to detect the outliers, and (ii) learning an anomaly scoring function that can extract this information from the latent representation to identify the anomalies. Research in deep anomaly detection methods has progressed slowly. One reason may be that most papers simultaneously introduce new representation learning techniques and new anomaly scoring approaches, which makes it hard to tell which of the two is responsible for a reported improvement. The goal of this work is to improve this methodology by providing ways of separately measuring the effectiveness of the representation learning and of the anomaly scoring. This work makes two methodological contributions. The first is the notion of Oracle anomaly detection, which quantifies the information available in a learned latent representation. The second is Oracle representation learning, which produces a representation that is guaranteed to be sufficient for accurate anomaly detection. These two techniques help researchers separate the quality of the learned representation from the performance of the anomaly scoring mechanism, so that they can debug and improve their systems. The methods also provide an upper limit on how much open category detection can be improved through better anomaly scoring mechanisms. The combination of the two oracles gives an upper limit on the performance that any open category detection method could achieve. This work introduces the two oracle techniques and demonstrates their utility by applying them to several leading open category detection methods.
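The abstract does not say how the oracle anomaly detector is constructed. As a rough illustration only, the sketch below assumes one plausible reading: the oracle is a scorer trained with ground-truth known/novel labels on a frozen latent representation, and its AUROC is compared with that of an ordinary label-free scorer on the same latents. The synthetic 2-D latents, the logistic-regression oracle, the centroid-distance baseline, and every function name here are hypothetical and are not taken from the paper.

```python
import math
import random

def auroc(scores, labels):
    """Chance that a random novel example (label 1) outscores a random known one (label 0)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def fit_oracle_scorer(latents, labels, epochs=100, lr=0.05):
    """Assumed oracle: logistic regression fit by SGD on true known/novel labels."""
    w, b = [0.0] * len(latents[0]), 0.0
    for _ in range(epochs):
        for z, y in zip(latents, labels):
            grad = sigmoid(sum(wi * zi for wi, zi in zip(w, z)) + b) - y
            w = [wi - lr * grad * zi for wi, zi in zip(w, z)]
            b -= lr * grad
    return lambda z: sum(wi * zi for wi, zi in zip(w, z)) + b

def sample_latents(rng, n):
    """Synthetic 2-D latents: novel points differ from known ones only along axis 1."""
    data = [([rng.gauss(0, 3), rng.gauss(0, 0.2)], 0) for _ in range(n)]   # known
    data += [([rng.gauss(0, 3), rng.gauss(1, 0.2)], 1) for _ in range(n)]  # novel
    rng.shuffle(data)
    latents, labels = zip(*data)
    return list(latents), list(labels)

rng = random.Random(0)
train_z, train_y = sample_latents(rng, 200)
test_z, test_y = sample_latents(rng, 200)

# Ordinary scorer (no novel labels): distance to the centroid of known training latents.
known = [z for z, y in zip(train_z, train_y) if y == 0]
centroid = [sum(col) / len(known) for col in zip(*known)]
baseline_auc = auroc([math.dist(z, centroid) for z in test_z], test_y)

# Oracle scorer: sees the true known/novel labels of the training latents.
oracle = fit_oracle_scorer(train_z, train_y)
oracle_auc = auroc([oracle(z) for z in test_z], test_y)

print(f"baseline AUROC: {baseline_auc:.3f}")
print(f"oracle AUROC:   {oracle_auc:.3f}")
```

Under these assumptions, a large gap between the two numbers would suggest that the representation already carries the information needed to separate novel inputs and that the scoring function is the weak component. A low oracle AUROC would instead point at the representation itself.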