Finding dense communities in networks is a widely-used tool for analysis in graph mining. A popular choice for finding such communities is to find subgraphs with a high average degree. While useful, interpreting such subgraphs may be difficult. On the other hand, many real-world networks have additional information, and we are specifically interested in networks that have labels on edges. In this paper, we study finding dense subgraphs that can be explained with the labels on edges. More specifically, we are looking for a set of labels so that the induced subgraph has a high average degree. There are many ways to induce a subgraph from a set of labels, and we study two cases: First, we study conjunctive-induced dense subgraphs, where the subgraph edges need to have all labels. Secondly, we study disjunctive-induced dense subgraphs, where the subgraph edges need to have at least one label. We show that both problems are $\textbf{NP}$-hard. Because of the hardness, we resort to greedy heuristics. We show that we can implement the greedy search efficiently: the respective running times for finding conjunctive-induced and disjunctive-induced dense subgraphs are in $\mathcal{O}(p \log k)$ and $\mathcal{O}(p \log^2 k)$, where $p$ is the number of edge-label pairs and $k$ is the number of labels. Our experimental evaluation demonstrates that we can find the ground truth in synthetic graphs and that we can find interpretable subgraphs from real-world networks.
翻译:在网络中查找密度较大的社区是一个广泛使用的图表采矿分析工具。 查找此类社区的流行选择是找到高平均度的子图。 虽然有用, 解释此类子图可能很困难。 另一方面, 许多真实世界的网络拥有更多信息, 我们特别感兴趣的网络中带有边缘标签的网络。 在本文中, 我们研究能找到可以用边缘标签解释的密度的子图。 更具体地说, 我们正在寻找一组标签, 这样导出子图具有较高的平均度。 有很多方法从一组标签中引出子图, 而我们研究两个案例: 首先, 我们研究同源引致密度的子图, 那里的子图边缘需要所有标签。 其次, 我们研究离导引导的密密的密子图, 那里的子图边缘需要至少有一个标签。 我们发现两个问题都是$( textbf{ NP} 基底值 。 由于硬性评估, 我们使用贪婪的金质 。 我们发现, 我们的基调的基数 和基质的基质的基数是, 我们的基质的基质的基质的基数 。