Exact recovery and sharp thresholds of Stochastic Ising Block Model

arXiv comments: Fixed a gap in the original proof of Theorem 5. The new proof of Theorem 5 relies on Lemma 5, which is the main new element in this version.

The stochastic block model (SBM) is a random graph model in which the edges are generated according to the underlying cluster structure on the vertices. The (ferromagnetic) Ising model, on the other hand, assigns $\pm 1$ labels to vertices according to an underlying graph structure, in such a way that two vertices connected in the graph are more likely to be assigned the same label. In the SBM, one aims to recover the underlying clusters from the graph structure, while in the Ising model, an extensively studied problem is to recover the underlying graph structure from i.i.d. samples (labelings of the vertices). In this paper, we propose a natural composition of the SBM and the Ising model, which we call the Stochastic Ising Block Model (SIBM). In SIBM, we take the SBM in its simplest form, where $n$ vertices are divided into two equal-sized clusters and edges are drawn independently with probability $p$ within clusters and $q$ across clusters. Then we use the graph $G$ generated by the SBM as the underlying graph of the Ising model and draw $m$ i.i.d. samples from it. The objective is to exactly recover the two clusters in the SBM from the samples generated by the Ising model, without observing the graph $G$. As the main result of this paper, we establish a sharp threshold $m^\ast$ on the sample complexity of this exact recovery problem in a properly chosen regime, where $m^\ast$ can be calculated from the parameters of SIBM. We show that when $m\ge m^\ast$, one can recover the clusters from $m$ samples in $O(n)$ time as the number of vertices $n$ goes to infinity. When $m<m^\ast$, we further show that for almost all choices of parameters of SIBM, the success probability of any recovery algorithm approaches $0$ as $n\to\infty$.
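The two-stage generative process described in the abstract (first an SBM graph, then Ising labelings on that graph) can be sketched roughly as follows. This is only an illustrative sketch, not the paper's construction: it assumes a plain ferromagnetic Ising model with a single inverse temperature `beta`, and it draws approximate samples with a Gibbs sampler run for a fixed number of sweeps rather than exact i.i.d. samples. The function names, `beta`, `sweeps`, and `seed` are all hypothetical and do not come from the abstract.

```python
import math
import random


def sample_sbm(n, p, q, rng):
    """Draw a two-cluster SBM graph: edge probability p within clusters, q across."""
    assert n % 2 == 0, "the two clusters must have equal size"
    labels = [1] * (n // 2) + [-1] * (n // 2)
    neighbors = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            prob = p if labels[i] == labels[j] else q
            if rng.random() < prob:
                neighbors[i].append(j)
                neighbors[j].append(i)
    return labels, neighbors


def gibbs_ising_sample(neighbors, beta, sweeps, rng):
    """Approximate Ising sample on the given graph via Gibbs sampling (assumed sampler)."""
    n = len(neighbors)
    sigma = [rng.choice((-1, 1)) for _ in range(n)]
    for _ in range(sweeps):
        for v in range(n):
            field = sum(sigma[u] for u in neighbors[v])
            x = 2.0 * beta * field
            # P(sigma_v = +1 | rest) = 1 / (1 + exp(-x)), computed in a stable way
            if x >= 0:
                p_plus = 1.0 / (1.0 + math.exp(-x))
            else:
                e = math.exp(x)
                p_plus = e / (1.0 + e)
            sigma[v] = 1 if rng.random() < p_plus else -1
    return sigma


def sample_sibm(n, p, q, beta, m, sweeps=50, seed=0):
    """Return the hidden cluster labels and m labelings; the graph itself is discarded."""
    rng = random.Random(seed)
    labels, neighbors = sample_sbm(n, p, q, rng)
    samples = [gibbs_ising_sample(neighbors, beta, sweeps, rng) for _ in range(m)]
    return labels, samples
```

For example, `labels, samples = sample_sibm(20, 0.8, 0.1, 0.3, 3)` returns the hidden cluster assignment together with three $\pm 1$ labelings of the 20 vertices. A recovery algorithm would see only `samples`, mirroring the setting in which the graph $G$ is not observed.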
