Community Search (CS) is a fundamental graph analysis task and a building block of various real-world applications. Given a set of query nodes, CS aims to find cohesive subgraphs that contain those query nodes. A large number of CS algorithms have been designed in recent years. These algorithms model communities with predefined subgraph patterns, and therefore cannot find ground-truth communities in real-world graphs that do not follow such patterns. In response, machine learning (ML) and deep learning (DL) based approaches have been proposed to capture flexible community structures by learning from ground-truth communities in a data-driven fashion. These approaches rely on sufficient training data for the ML models to generalize; however, ground-truth communities cannot be comprehensively collected beforehand. In this paper, we study ML/DL-based approaches for CS when training data is scarce. Instead of directly fitting the small data, we extract prior knowledge that is shared across multiple CS tasks by learning a meta model. Each CS task is a graph with several queries that have corresponding partial ground truth. The meta model can be swiftly adapted to a new task by feeding it a small amount of task-specific training data. We find that trivially applying classical meta-learning algorithms to CS suffers from problems in prediction effectiveness, generalization capability, and efficiency. To address these problems, we propose a novel meta-learning based framework, Conditional Graph Neural Process (CGNP), to carry out the prior extraction and adaptation procedure. A meta CGNP model is a task-common node embedding function for clustering, learned by metric-based graph learning, which fully exploits the characteristics of CS. We compare CGNP with CS algorithms and ML baselines on real graphs with ground-truth communities.
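As a rough illustration of what metric-based community prediction over node embeddings can look like, the sketch below scores every node against a query node and keeps the nodes whose score clears a threshold. It is not the CGNP implementation: the function name `predict_community`, the inner-product similarity, the sigmoid, the 0.5 threshold, and the toy embeddings are all assumptions made for the example. In the setting described above, the embeddings would come from the learned task-common embedding function rather than being written by hand.

```python
import math


def predict_community(embeddings, query, threshold=0.5):
    """Return (members, probs) for one query node.

    embeddings: list of per-node embedding vectors (hypothetical values here).
    query: index of the query node.
    threshold: assumed cut-off on the membership probability.
    """
    q = embeddings[query]
    probs = []
    for emb in embeddings:
        # Inner product with the query embedding (assumed similarity metric).
        score = sum(a * b for a, b in zip(emb, q))
        # Sigmoid maps the score to (0, 1) so it reads as a probability.
        probs.append(1.0 / (1.0 + math.exp(-score)))
    # Nodes at or above the threshold are predicted community members.
    members = [i for i, p in enumerate(probs) if p >= threshold]
    return members, probs


# Toy embeddings: nodes 0-2 point one way, nodes 3-4 the opposite way.
toy_embeddings = [
    [2.0, 0.1],
    [1.8, -0.2],
    [2.2, 0.3],
    [-2.0, 0.1],
    [-1.9, -0.1],
]
members, probs = predict_community(toy_embeddings, query=0)
```

With these hand-picked vectors, the query node 0 is grouped with nodes 1 and 2, while nodes 3 and 4 fall below the threshold. The point of the sketch is only that, once an embedding function is shared across tasks, answering a query reduces to a similarity computation in the embedding space.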