We study the behavior of a label propagation algorithm (LPA) on the Erd\H{o}s-R\'enyi random graph $\mathcal{G}(n,p)$. Initially, given a network, each vertex starts with a random label in the interval $[0,1]$. Then, in each round of LPA, every vertex switches its label to the majority label in its neighborhood (including its own label). At the first round, ties are broken towards smaller labels, while at each of the next rounds, ties are broken uniformly at random. The algorithm terminates once all labels stay the same in two consecutive iterations. LPA is successfully used in practice for detecting communities in networks (corresponding to vertex sets with the same label after termination of the algorithm). Perhaps surprisingly, LPA's performance on dense random graphs is hard to analyze, and so far convergence to consensus was known only when $np\ge n^{3/4+\varepsilon}$, where LPA converges in three rounds. By defining an alternative label attribution procedure which converges to the label propagation algorithm after three rounds, a careful multi-stage exposure of the edges allows us to break the $n^{3/4+\varepsilon}$ barrier and show that, when $np \ge n^{5/8+\varepsilon}$, a.a.s.\ the algorithm terminates with a single label. Moreover, we show that, if $np\gg n^{2/3}$, a.a.s.\ this label is the smallest one, whereas if $n^{5/8+\varepsilon}\le np\ll n^{2/3}$, the surviving label is a.a.s.\ not the smallest one. En passant, we show a presumably new monotonicity lemma for Binomial random variables that might be of independent interest.
翻译:暂无翻译