Whereas Laplacian and modularity based spectral clustering is suited to dense graphs, recent results show that for sparse ones, the non-backtracking spectrum is the best candidate for finding assortative clusters of nodes. Here belief propagation in the sparse stochastic block model is derived with arbitrary given model parameters, which results in a non-linear system of equations; with a linear approximation, the spectrum of the non-backtracking matrix is able to specify the number $k$ of clusters. The model parameters themselves can then be estimated by the EM algorithm. Bond percolation in the assortative model is considered in the following two senses: the within- and between-cluster edge probabilities decrease with the number of nodes, and the edges coming into existence in this way are retained with probability $\beta$. As a consequence, the optimal $k$ is the number of structural real eigenvalues (those greater than $\sqrt{c}$, where $c$ is the average degree) of the non-backtracking matrix of the graph. Assuming that these eigenvalues $\mu_1 > \dots > \mu_k$ are distinct, the multiple phase transitions obtained for $\beta$ are $\beta_i = \frac{c}{\mu_i^2}$; further, at $\beta_i$ the number of detectable clusters is $i$, for $i = 1, \dots, k$. Inflation-deflation techniques are also discussed for classifying the nodes themselves, which can serve as the basis of sparse spectral clustering.
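The eigenvalue count and the thresholds $\beta_i = c/\mu_i^2$ stated above can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the dense edge-by-edge construction of the non-backtracking matrix, and the numerical tolerance `tol` are all assumptions made for illustration, and the quadratic-time construction is only practical for small graphs.

```python
import numpy as np

def non_backtracking_matrix(edges):
    """Dense non-backtracking matrix of an undirected simple graph.

    Rows and columns are indexed by directed edges; the entry for
    (u -> v), (x -> w) is 1 when x == v and w != u.
    """
    directed = [(u, v) for u, v in edges] + [(v, u) for u, v in edges]
    index = {e: i for i, e in enumerate(directed)}
    B = np.zeros((len(directed), len(directed)))
    for (u, v), i in index.items():
        for (x, w), j in index.items():
            if x == v and w != u:
                B[i, j] = 1.0
    return B

def structural_eigenvalues(edges, n_nodes, tol=1e-8):
    """Return the average degree c and the real eigenvalues above sqrt(c).

    `tol` is an illustrative threshold (an assumption) used to decide
    whether a computed eigenvalue counts as real and as exceeding sqrt(c).
    """
    B = non_backtracking_matrix(edges)
    c = 2.0 * len(edges) / n_nodes            # average degree
    eig = np.linalg.eigvals(B)
    real = eig[np.abs(eig.imag) < tol].real   # keep numerically real ones
    mu = np.sort(real[real > np.sqrt(c) + tol])[::-1]  # descending order
    return c, mu

def percolation_thresholds(c, mu):
    """Thresholds beta_i = c / mu_i^2 for the structural eigenvalues."""
    return c / mu**2

# Toy sanity check on the complete graph K4 (dense, so not a sparse
# block model; used only because its spectrum is known in closed form).
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
c, mu = structural_eigenvalues(k4, 4)
beta = percolation_thresholds(c, mu)
```

For K4 the only real eigenvalue above $\sqrt{3}$ is the leading one, $\mu_1 = 2$, so `len(mu)` estimates $k = 1$ and `beta` is approximately `[0.75]`. For a graph sampled from a sparse block model, `len(mu)` would play the role of the estimated number of clusters.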