Structural balance theory studies stability in networks. Given a $n$-vertex complete graph $G=(V,E)$ whose edges are labeled positive or negative, the graph is considered \emph{balanced} if every triangle either consists of three positive edges (three mutual ``friends''), or one positive edge and two negative edges (two ``friends'' with a common ``enemy''). From a computational perspective, structural balance turns out to be a special case of correlation clustering with the number of clusters at most two. The two main algorithmic problems of interest are: $(i)$ detecting whether a given graph is balanced, or $(ii)$ finding a partition that approximates the \emph{frustration index}, i.e., the minimum number of edge flips that turn the graph balanced. We study these problems in the streaming model where edges are given one by one and focus on \emph{memory efficiency}. We provide randomized single-pass algorithms for: $(i)$ determining whether an input graph is balanced with $O(\log{n})$ memory, and $(ii)$ finding a partition that induces a $(1 + \varepsilon)$-approximation to the frustration index with $O(n \cdot \text{polylog}(n))$ memory. We further provide several new lower bounds, complementing different aspects of our algorithms such as the need for randomization or approximation. To obtain our main results, we develop a method using pseudorandom generators (PRGs) to sample edges between independently-chosen \emph{vertices} in graph streaming. Furthermore, our algorithm that approximates the frustration index improves the running time of the state-of-the-art correlation clustering with two clusters (Giotis-Guruswami algorithm [SODA 2006]) from $n^{O(1/\varepsilon^2)}$ to $O(n^2\log^3{n}/\varepsilon^2 + n\log n \cdot (1/\varepsilon)^{O(1/\varepsilon^4)})$ time for $(1+\varepsilon)$-approximation. These results may be of independent interest.
翻译:暂无翻译