## Graph Machine Learning 2.2-2.4: Properties of Networks and Random Graph Models

March 28, 2020 · 图与推荐

Special cases -> generalization -> build a model

[Note: undirected graphs are considered here] We write G_{np} for the random undirected graph with n nodes in which each edge (u, v) is included independently with probability p

In other words, the graph is one realization of a random process; the figure below shows an example for given values of n and p
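As a concrete illustration, here is a minimal stdlib-only sampler for G_{np} (networkx provides the same model as `nx.gnp_random_graph`); the values of n and p are illustrative:

```python
import random

def sample_gnp(n, p, seed=None):
    """Sample one realization of the Erdos-Renyi graph G(n, p):
    each of the n*(n-1)/2 possible undirected edges is included
    independently with probability p."""
    rng = random.Random(seed)
    return [(u, v) for u in range(n) for v in range(u + 1, n)
            if rng.random() < p]

# Two draws with the same n and p are (almost surely) different graphs:
# the model defines a distribution over graphs, not one fixed graph.
g1 = sample_gnp(10, 0.3, seed=1)
g2 = sample_gnp(10, 0.3, seed=2)
```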

(1) Degree distribution

• P(k): fix one node, leaving n-1 others; count the ways to choose the k nodes that are its neighbors, and multiply by the probability that exactly those k edges (and no others) exist, i.e. that the node has degree k
• The resulting degree distribution is therefore binomial
• Since it is binomial, its mean and variance follow immediately
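Written out, the degree distribution of G_{np} and its first two moments are:

```latex
P(k) = \binom{n-1}{k}\, p^{k} (1-p)^{\,n-1-k},
\qquad
\bar{k} = p\,(n-1),
\qquad
\sigma^{2} = p\,(1-p)\,(n-1).
```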

This corresponds to the law of large numbers: as n grows, the distribution concentrates around the mean degree

(2) Clustering coefficient

(3) Expansion

(4) Largest connected component

https://networkx.github.io/documentation/networkx-1.9/examples/drawing/giant_component.html
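As a sketch (stdlib only; networkx computes the same quantities, e.g. via `nx.average_clustering` and `nx.connected_components`, as in the giant-component example linked above), the clustering coefficient and the largest connected component of a G(n, p) sample can be measured directly. The parameter values below are illustrative:

```python
import random
from collections import defaultdict, deque

def gnp_edges(n, p, rng):
    return [(u, v) for u in range(n) for v in range(u + 1, n)
            if rng.random() < p]

def adjacency(edges):
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def avg_clustering(n, adj):
    """Average over all nodes of (edges among neighbors) / C(deg, 2);
    nodes of degree < 2 count as 0, as in the networkx convention."""
    total = 0.0
    for v in range(n):
        nbrs, d = adj[v], len(adj[v])
        if d < 2:
            continue
        links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
        total += links / (d * (d - 1) / 2)
    return total / n

def largest_component(n, adj):
    """Size of the largest connected component, found by BFS."""
    seen, best = set(), 0
    for s in range(n):
        if s in seen:
            continue
        size, queue = 0, deque([s])
        seen.add(s)
        while queue:
            v = queue.popleft()
            size += 1
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        best = max(best, size)
    return best

rng = random.Random(0)
edges = gnp_edges(200, 0.05, rng)
adj = adjacency(edges)
# For G(n, p) the average clustering is close to p (tiny for sparse
# graphs), while with mean degree n*p > 1 a giant component emerges.
```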

• As noted earlier, the degree-distribution histogram helps reveal a graph's structure, and the histograms of MSN and of the random graph differ greatly;
• Average path length: MSN and the random graph are roughly the same;
• Average clustering coefficient: very different; the random graph's is tiny;
• Largest connected component: very close.

[A high clustering coefficient reflects locality] Put simply: in a social network, my friends also know one another, as in the figure below

(1) Clustering reflects locality (2) Randomness lets "shortcuts" appear

STEP 1: start from a highly clustered network

STEP 2: for each edge, with probability p rewire its endpoint (introducing shortcuts)
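These two steps (essentially the Watts-Strogatz small-world construction; networkx offers it as `nx.watts_strogatz_graph`) can be sketched with the standard library alone. The ring-lattice start and the rewiring probability are illustrative choices:

```python
import random

def ring_lattice(n, k):
    """STEP 1: a ring of n nodes, each joined to its k nearest
    neighbors on each side -- highly clustered, but long paths."""
    return {(u, (u + j) % n) for u in range(n) for j in range(1, k + 1)}

def rewire(edges, n, p, seed=None):
    """STEP 2: for each edge, with probability p replace one endpoint
    with a uniformly random node, creating long-range shortcuts.
    Self-loops and duplicate edges are rejected."""
    rng = random.Random(seed)
    edge_list = [tuple(sorted(e)) for e in edges]
    present = set(edge_list)
    for i, (u, v) in enumerate(edge_list):
        if rng.random() < p:
            w = rng.randrange(n)
            while w == u or tuple(sorted((u, w))) in present:
                w = rng.randrange(n)
            present.discard((u, v))
            new = tuple(sorted((u, w)))
            present.add(new)
            edge_list[i] = new
    return set(edge_list)

edges = ring_lattice(100, 2)                    # clustered start
small_world = rewire(edges, 100, 0.1, seed=0)   # a few shortcuts
```

Even a small p drops the diameter sharply while leaving most of the local clustering intact, which is exactly the combination observed in real social networks.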

(1) High clustering coefficient: most of my friends know one another (2) Low diameter: but I also have a few friends in another city, and they do not know my other friends

For example, friendships often form around a shared culture, similar hobbies, and so on

Notice that the resulting graph's adjacency matrix has a very regular structure (every entry is 0 or 1), much like a regular grid. That is convenient for analysis, but such a regular structure can hardly capture the information in real complex networks. What can be done? The answer: introduce randomness!

(1) Each entry of the initiator matrix gives the probability that a particular edge appears

(2) Take Kronecker powers of the initiator matrix to obtain a large stochastic adjacency matrix, in which each entry again gives the probability that the corresponding edge appears in the large graph. Such a stochastic adjacency matrix defines a probability distribution over all graphs
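A minimal sketch of the two steps, assuming a hypothetical 2x2 initiator matrix (the values are illustrative, not from the source):

```python
import random

def kron(A, B):
    """Kronecker product of two square matrices given as lists of lists."""
    m = len(B)
    size = len(A) * m
    return [[A[i // m][j // m] * B[i % m][j % m] for j in range(size)]
            for i in range(size)]

def kron_power(theta, k):
    """Step 2: the k-fold Kronecker power of the initiator matrix.
    Each entry of the result is the probability that the
    corresponding edge appears in the large graph."""
    P = theta
    for _ in range(k - 1):
        P = kron(P, theta)
    return P

def sample_graph(P, rng):
    """Flip one independent coin per entry to realize a graph
    from the probability matrix (undirected upper triangle)."""
    n = len(P)
    return [(u, v) for u in range(n) for v in range(u + 1, n)
            if rng.random() < P[u][v]]

theta = [[0.9, 0.5],      # Step 1: initiator matrix; each entry is
         [0.5, 0.3]]      # the probability of a particular edge
P = kron_power(theta, 3)  # 8x8 matrix of edge probabilities
edges = sample_graph(P, random.Random(0))
```

Because the probabilities multiply recursively, the sampled graphs show the self-similar, community-within-community structure that Kronecker graph models are designed to produce.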
