Handling out-of-distribution (OOD) samples has become a major stake in the real-world deployment of machine learning systems. This work explores the application of self-supervised contrastive learning to the simultaneous detection of two types of OOD samples: unseen classes and adversarial perturbations. Since in practice the distribution of such samples is not known in advance, we do not assume access to OOD examples. We first show that similarity functions trained with contrastive learning can be leveraged with the maximum mean discrepancy (MMD) two-sample test to verify whether two independent sets of samples are drawn from the same distribution. Inspired by this approach, we introduce CADet (Contrastive Anomaly Detection), a method that uses contrastive transformations to perform anomaly detection on single samples. CADet compares favorably to adversarial detection methods in detecting adversarially perturbed samples on ImageNet. Simultaneously, it achieves performance comparable to unseen label detection methods on two challenging benchmarks: ImageNet-O and iNaturalist. CADet is fully self-supervised and requires neither labels for in-distribution samples nor access to OOD examples.
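To make the MMD-based two-sample test concrete, the sketch below shows how a learned similarity function can serve as the kernel in an unbiased MMD^2 estimate, with a permutation test for significance. This is a minimal illustration, not the authors' implementation: `encode` is a hypothetical stand-in for a contrastively trained feature extractor, and the Gaussian toy data only demonstrates the interface.

```python
# Minimal sketch (assumption: not the authors' implementation) of an MMD
# two-sample test whose kernel is a learned similarity function.
import numpy as np

def encode(x: np.ndarray) -> np.ndarray:
    """Hypothetical placeholder for a contrastively trained encoder.
    Here it simply L2-normalizes the raw inputs."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-12)

def similarity(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Cosine similarity between encoded samples, used as the MMD kernel."""
    return encode(a) @ encode(b).T

def mmd2_unbiased(x: np.ndarray, y: np.ndarray) -> float:
    """Unbiased estimate of MMD^2 between the two sample sets."""
    k_xx, k_yy, k_xy = similarity(x, x), similarity(y, y), similarity(x, y)
    n, m = len(x), len(y)
    # Exclude diagonal terms for the unbiased within-set averages.
    term_xx = (k_xx.sum() - np.trace(k_xx)) / (n * (n - 1))
    term_yy = (k_yy.sum() - np.trace(k_yy)) / (m * (m - 1))
    return term_xx + term_yy - 2.0 * k_xy.mean()

def permutation_test(x: np.ndarray, y: np.ndarray,
                     n_perm: int = 1000, seed: int = 0) -> float:
    """p-value for H0: x and y are drawn from the same distribution."""
    rng = np.random.default_rng(seed)
    observed = mmd2_unbiased(x, y)
    pooled = np.concatenate([x, y])
    count = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))
        xp, yp = pooled[idx[:len(x)]], pooled[idx[len(x):]]
        count += mmd2_unbiased(xp, yp) >= observed
    return (count + 1) / (n_perm + 1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    in_dist = rng.normal(0.0, 1.0, size=(128, 32))
    same = rng.normal(0.0, 1.0, size=(128, 32))
    shifted = rng.normal(0.5, 1.0, size=(128, 32))
    print("p-value (same distribution):", permutation_test(in_dist, same))
    print("p-value (shifted samples):  ", permutation_test(in_dist, shifted))
```

A small p-value rejects the hypothesis that the two sets come from the same distribution; replacing `encode` with a contrastively trained network is what lets the test pick up semantic rather than purely pixel-level discrepancies.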