Community detection is one of the most important methodological fields of network science, and one that has attracted a significant amount of attention over the past decades. This area deals with the automated division of a network into fundamental building blocks, with the objective of providing a summary of its large-scale structure. Despite its importance and widespread adoption, there is a noticeable gap between what is considered the state of the art and the methods that are actually used in practice in a variety of fields. Here we attempt to address this discrepancy by dividing existing methods according to whether they have a "descriptive" or an "inferential" goal. While descriptive methods find patterns in networks based on intuitive notions of community structure, inferential methods articulate a precise generative model and attempt to fit it to data. In this way, they are able to provide insights into the mechanisms of network formation, and to separate structure from randomness in a manner supported by statistical evidence. We review how employing descriptive methods with inferential aims is riddled with pitfalls and yields misleading answers, and thus should in general be avoided. We argue that inferential methods are typically better aligned with clear scientific questions, yield more robust results, and should in many cases be preferred. We attempt to dispel some myths and half-truths often believed when community detection is employed in practice, in an effort to improve both the use of such methods and the interpretation of their results.