We study the fundamental problem of counting butterflies (i.e., (2,2)-bicliques) in bipartite streaming graphs. As with triangles in unipartite graphs, enumerating butterflies is crucial for understanding the structure of bipartite graphs. This benefits many applications in which the cohesion of graph-shaped data is of particular interest. Examples include investigating the structure of computational graphs or of the input graphs to algorithms, as well as studying dynamic phenomena and performing analytic tasks over complex real graphs. Butterfly counting is computationally expensive, and known techniques do not scale to large graphs; the problem is even harder in streaming graphs. In this paper, following a data-driven methodology, we first conduct an empirical analysis to uncover the temporal organizing principles of butterflies in real streaming graphs. We then introduce sGrapp, an approximate, adaptive, window-based algorithm for counting butterflies, together with its optimized version, sGrapp-x. sGrapp is designed to operate efficiently and effectively over any graph stream, regardless of its temporal behavior. Experimental studies of sGrapp and sGrapp-x show superior performance in terms of both accuracy and efficiency.
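To make the counted object concrete, the following is a minimal sketch of exact butterfly counting on a small, static bipartite graph. It is a plain baseline for illustration and is not sGrapp or sGrapp-x; the function name `count_butterflies` and the edge-list input format are assumptions introduced here. It relies on the fact that two vertices on the same side that share c neighbors on the other side close C(c, 2) butterflies.

```python
from collections import defaultdict
from itertools import combinations


def count_butterflies(edges):
    """Exact butterfly count for a static bipartite edge list.

    `edges` is an iterable of (left, right) pairs. Hypothetical helper for
    illustration only; left-side labels must be mutually comparable.
    """
    # Group the left-side vertices by the right-side vertex they attach to.
    # Using sets means repeated edges are counted once.
    lefts_of = defaultdict(set)
    for left, right in edges:
        lefts_of[right].add(left)

    # For every pair of left vertices, count the right vertices they share.
    shared = defaultdict(int)
    for lefts in lefts_of.values():
        for pair in combinations(sorted(lefts), 2):
            shared[pair] += 1

    # A pair with c shared neighbors forms C(c, 2) = c * (c - 1) / 2 butterflies.
    return sum(c * (c - 1) // 2 for c in shared.values())


# Usage: a single (2,2)-biclique is exactly one butterfly.
square = [("u1", "v1"), ("u1", "v2"), ("u2", "v1"), ("u2", "v2")]
print(count_butterflies(square))
```

The pair enumeration grows quadratically with the degree of each right-side vertex, which illustrates why exact counting becomes expensive on large graphs and why an approximate approach is attractive for streams.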