Most existing deep learning models are trained under the closed-world assumption, where the test data is assumed to be drawn i.i.d. from the same distribution as the training data, known as in-distribution (ID). However, when models are deployed in an open-world scenario, test samples can be out-of-distribution (OOD) and therefore should be handled with caution. To detect such OOD samples drawn from an unknown distribution, OOD detection has received increasing attention lately. However, current efforts mostly focus on grid-structured data, and OOD detection for graph-structured data remains under-explored. Considering that labeling graphs is commonly time-consuming and labor-intensive, in this work we study the problem of unsupervised graph OOD detection, aiming at detecting OOD graphs solely based on unlabeled ID data. To achieve this goal, we develop a new graph contrastive learning framework, GOOD-D, for detecting OOD graphs without using any ground-truth labels. By performing hierarchical contrastive learning on the augmented graphs generated by our perturbation-free graph data augmentation method, GOOD-D is able to capture the latent ID patterns and accurately detect OOD graphs based on semantic inconsistency at different granularities (i.e., node-level, graph-level, and group-level). As a pioneering work in unsupervised graph-level OOD detection, we build a comprehensive benchmark to compare our proposed approach with different state-of-the-art methods. The experimental results demonstrate the superiority of our approach over competing methods on various datasets.
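To make the core idea of inconsistency-based OOD scoring concrete, the following is a minimal, purely illustrative sketch; it is not the GOOD-D implementation. It assumes each graph has already been embedded under two augmented views (the encoder, contrastive loss, and node/graph/group hierarchy are omitted), and all function and variable names here are hypothetical.

```python
import numpy as np

def ood_score(z_view1: np.ndarray, z_view2: np.ndarray) -> np.ndarray:
    """Score each graph by the inconsistency between its two augmented views.

    z_view1, z_view2: (num_graphs, dim) embeddings of the same graphs under
    two augmentations. ID graphs are expected to yield well-aligned views,
    so a higher inconsistency score indicates a more OOD-like graph.
    """
    # L2-normalize so inconsistency reduces to cosine distance between views.
    z1 = z_view1 / np.linalg.norm(z_view1, axis=1, keepdims=True)
    z2 = z_view2 / np.linalg.norm(z_view2, axis=1, keepdims=True)
    cosine_sim = np.sum(z1 * z2, axis=1)   # agreement between the two views
    return 1.0 - cosine_sim                # inconsistency used as OOD score

# Toy usage: five graphs with consistent views and one deliberately broken one.
rng = np.random.default_rng(0)
base = rng.normal(size=(6, 16))
view1 = base + 0.01 * rng.normal(size=base.shape)
view2 = base + 0.01 * rng.normal(size=base.shape)
view2[-1] = rng.normal(size=16)            # last graph: views no longer agree
print(ood_score(view1, view2).round(3))    # last score comes out largest
```

In the actual framework this kind of inconsistency signal is computed at multiple granularities and combined, rather than from a single pair of graph-level vectors as in this toy example.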