This paper focuses on the discrimination capacity of aggregation functions: these are the permutation invariant functions used by graph neural networks to combine the features of nodes. Realizing that the most powerful aggregation functions suffer from a dimensionality curse, we consider a restricted setting. In particular, we show that the standard sum and a novel histogram-based function have the capacity to discriminate between any fixed number of inputs chosen by an adversary. Based on our insights, we design a graph neural network aiming, not to maximize discrimination capacity, but to learn discriminative graph representations that generalize well. Our empirical evaluation provides evidence that our choices can yield benefits to the problem of structural graph classification.

4
下载
关闭预览

相关内容

神经网络(Neural Networks)是世界上三个最古老的神经建模学会的档案期刊:国际神经网络学会(INNS)、欧洲神经网络学会(ENNS)和日本神经网络学会(JNNS)。神经网络提供了一个论坛,以发展和培育一个国际社会的学者和实践者感兴趣的所有方面的神经网络和相关方法的计算智能。神经网络欢迎高质量论文的提交,有助于全面的神经网络研究,从行为和大脑建模,学习算法,通过数学和计算分析,系统的工程和技术应用,大量使用神经网络的概念和技术。这一独特而广泛的范围促进了生物和技术研究之间的思想交流,并有助于促进对生物启发的计算智能感兴趣的跨学科社区的发展。因此,神经网络编委会代表的专家领域包括心理学,神经生物学,计算机科学,工程,数学,物理。该杂志发表文章、信件和评论以及给编辑的信件、社论、时事、软件调查和专利信息。文章发表在五个部分之一:认知科学,神经科学,学习系统,数学和计算分析、工程和应用。 官网地址:http://dblp.uni-trier.de/db/journals/nn/

Existing knowledge distillation methods focus on convolutional neural networks~(CNNs), where the input samples like images lie in a grid domain, and have largely overlooked graph convolutional networks~(GCN) that handle non-grid data. In this paper, we propose to our best knowledge the first dedicated approach to {distilling} knowledge from a pre-trained GCN model. To enable the knowledge transfer from the teacher GCN to the student, we propose a local structure preserving module that explicitly accounts for the topological semantics of the teacher. In this module, the local structure information from both the teacher and the student are extracted as distributions, and hence minimizing the distance between these distributions enables topology-aware knowledge transfer from the teacher, yielding a compact yet high-performance student model. Moreover, the proposed approach is readily extendable to dynamic graph models, where the input graphs for the teacher and the student may differ. We evaluate the proposed method on two different datasets using GCN models of different architectures, and demonstrate that our method achieves the state-of-the-art knowledge distillation performance for GCN models.

0
19
下载
预览

Graph signals are signals with an irregular structure that can be described by a graph. Graph neural networks (GNNs) are information processing architectures tailored to these graph signals and made of stacked layers that compose graph convolutional filters with nonlinear activation functions. Graph convolutions endow GNNs with invariance to permutations of the graph nodes' labels. In this paper, we consider the design of trainable nonlinear activation functions that take into consideration the structure of the graph. This is accomplished by using graph median filters and graph max filters, which mimic linear graph convolutions and are shown to retain the permutation invariance of GNNs. We also discuss modifications to the backpropagation algorithm necessary to train local activation functions. The advantages of localized activation function architectures are demonstrated in four numerical experiments: source localization on synthetic graphs, authorship attribution of 19th century novels, movie recommender systems and scientific article classification. In all cases, localized activation functions are shown to improve model capacity.

0
4
下载
预览

In this paper, we address the hyperspectral image (HSI) classification task with a generative adversarial network and conditional random field (GAN-CRF) -based framework, which integrates a semi-supervised deep learning and a probabilistic graphical model, and make three contributions. First, we design four types of convolutional and transposed convolutional layers that consider the characteristics of HSIs to help with extracting discriminative features from limited numbers of labeled HSI samples. Second, we construct semi-supervised GANs to alleviate the shortage of training samples by adding labels to them and implicitly reconstructing real HSI data distribution through adversarial training. Third, we build dense conditional random fields (CRFs) on top of the random variables that are initialized to the softmax predictions of the trained GANs and are conditioned on HSIs to refine classification maps. This semi-supervised framework leverages the merits of discriminative and generative models through a game-theoretical approach. Moreover, even though we used very small numbers of labeled training HSI samples from the two most challenging and extensively studied datasets, the experimental results demonstrated that spectral-spatial GAN-CRF (SS-GAN-CRF) models achieved top-ranking accuracy for semi-supervised HSI classification.

0
3
下载
预览

Graph neural networks (GNNs) are a popular class of machine learning models whose major advantage is their ability to incorporate a sparse and discrete dependency structure between data points. Unfortunately, GNNs can only be used when such a graph-structure is available. In practice, however, real-world graphs are often noisy and incomplete or might not be available at all. With this work, we propose to jointly learn the graph structure and the parameters of graph convolutional networks (GCNs) by approximately solving a bilevel program that learns a discrete probability distribution on the edges of the graph. This allows one to apply GCNs not only in scenarios where the given graph is incomplete or corrupted but also in those where a graph is not available. We conduct a series of experiments that analyze the behavior of the proposed method and demonstrate that it outperforms related methods by a significant margin.

0
16
下载
预览

Graph convolutional networks (GCNs) have been successfully applied in node classification tasks of network mining. However, most of these models based on neighborhood aggregation are usually shallow and lack the "graph pooling" mechanism, which prevents the model from obtaining adequate global information. In order to increase the receptive field, we propose a novel deep Hierarchical Graph Convolutional Network (H-GCN) for semi-supervised node classification. H-GCN first repeatedly aggregates structurally similar nodes to hyper-nodes and then refines the coarsened graph to the original to restore the representation for each node. Instead of merely aggregating one- or two-hop neighborhood information, the proposed coarsening procedure enlarges the receptive field for each node, hence more global information can be learned. Comprehensive experiments conducted on public datasets demonstrate the effectiveness of the proposed method over the state-of-art methods. Notably, our model gains substantial improvements when only a few labeled samples are provided.

0
13
下载
预览

Graph Convolutional Networks (GCNs) and their variants have experienced significant attention and have become the de facto methods for learning graph representations. GCNs derive inspiration primarily from recent deep learning approaches, and as a result, may inherit unnecessary complexity and redundant computation. In this paper, we reduce this excess complexity through successively removing nonlinearities and collapsing weight matrices between consecutive layers. We theoretically analyze the resulting linear model and show that it corresponds to a fixed low-pass filter followed by a linear classifier. Notably, our experimental evaluation demonstrates that these simplifications do not negatively impact accuracy in many downstream applications. Moreover, the resulting model scales to larger datasets, is naturally interpretable, and yields up to two orders of magnitude speedup over FastGCN.

0
11
下载
预览

Despite huge success in the image domain, modern detection models such as Faster R-CNN have not been used nearly as much for video analysis. This is arguably due to the fact that detection models are designed to operate on single frames and as a result do not have a mechanism for learning motion representations directly from video. We propose a learning procedure that allows detection models such as Faster R-CNN to learn motion features directly from the RGB video data while being optimized with respect to a pose estimation task. Given a pair of video frames---Frame A and Frame B---we force our model to predict human pose in Frame A using the features from Frame B. We do so by leveraging deformable convolutions across space and time. Our network learns to spatially sample features from Frame B in order to maximize pose detection accuracy in Frame A. This naturally encourages our network to learn motion offsets encoding the spatial correspondences between the two frames. We refer to these motion offsets as DiMoFs (Discriminative Motion Features). In our experiments we show that our training scheme helps learn effective motion cues, which can be used to estimate and localize salient human motion. Furthermore, we demonstrate that as a byproduct, our model also learns features that lead to improved pose detection in still-images, and better keypoint tracking. Finally, we show how to leverage our learned model for the tasks of spatiotemporal action localization and fine-grained action recognition.

0
3
下载
预览

Knowledge graphs are large graph-structured databases of facts, which typically suffer from incompleteness. Link prediction is the task of inferring missing relations (links) between entities (nodes) in a knowledge graph. We approach this task using a hypernetwork architecture to generate convolutional layer filters specific to each relation and apply those filters to the subject entity embeddings. This architecture enables a trade-off between non-linear expressiveness and the number of parameters to learn. Our model simplifies the entity and relation embedding interactions introduced by the predecessor convolutional model, while outperforming all previous approaches to link prediction across all standard link prediction datasets.

0
8
下载
预览

In this paper, we propose a method for obtaining sentence-level embeddings. While the problem of securing word-level embeddings is very well studied, we propose a novel method for obtaining sentence-level embeddings. This is obtained by a simple method in the context of solving the paraphrase generation task. If we use a sequential encoder-decoder model for generating paraphrase, we would like the generated paraphrase to be semantically close to the original sentence. One way to ensure this is by adding constraints for true paraphrase embeddings to be close and unrelated paraphrase candidate sentence embeddings to be far. This is ensured by using a sequential pair-wise discriminator that shares weights with the encoder that is trained with a suitable loss function. Our loss function penalizes paraphrase sentence embedding distances from being too large. This loss is used in combination with a sequential encoder-decoder network. We also validated our method by evaluating the obtained embeddings for a sentiment analysis task. The proposed method results in semantic embeddings and outperforms the state-of-the-art on the paraphrase generation and sentiment analysis task on standard datasets. These results are also shown to be statistically significant.

0
6
下载
预览

Despite of the success of Generative Adversarial Networks (GANs) for image generation tasks, the trade-off between image diversity and visual quality are an well-known issue. Conventional techniques achieve either visual quality or image diversity; the improvement in one side is often the result of sacrificing the degradation in the other side. In this paper, we aim to achieve both simultaneously by improving the stability of training GANs. A key idea of the proposed approach is to implicitly regularizing the discriminator using a representative feature. For that, this representative feature is extracted from the data distribution, and then transferred to the discriminator for enforcing slow updates of the gradient. Consequently, the entire training process is stabilized because the learning curve of discriminator varies slowly. Based on extensive evaluation, we demonstrate that our approach improves the visual quality and diversity of state-of-the art GANs.

0
7
下载
预览
小贴士
相关论文
Distillating Knowledge from Graph Convolutional Networks
Yiding Yang,Jiayan Qiu,Mingli Song,Dacheng Tao,Xinchao Wang
19+阅读 · 2020年3月23日
Invariance-Preserving Localized Activation Functions for Graph Neural Networks
Luana Ruiz,Fernando Gama,Antonio G. Marques,Alejandro Ribeiro
4+阅读 · 2019年11月5日
Zilong Zhong,Jonathan Li,David A. Clausi,Alexander Wong
3+阅读 · 2019年5月12日
Luca Franceschi,Mathias Niepert,Massimiliano Pontil,Xiao He
16+阅读 · 2019年3月28日
Fenyu Hu,Yanqiao Zhu,Shu Wu,Liang Wang,Tieniu Tan
13+阅读 · 2019年3月5日
Simplifying Graph Convolutional Networks
Felix Wu,Tianyi Zhang,Amauri Holanda de Souza Jr.,Christopher Fifty,Tao Yu,Kilian Q. Weinberger
11+阅读 · 2019年2月19日
Learning Discriminative Motion Features Through Detection
Gedas Bertasius,Christoph Feichtenhofer,Du Tran,Jianbo Shi,Lorenzo Torresani
3+阅读 · 2018年12月11日
Ivana Balazevic,Carl Allen,Timothy M. Hospedales
8+阅读 · 2018年10月18日
Badri N. Patro,Vinod K. Kurmi,Sandeep Kumar,Vinay P. Namboodiri
6+阅读 · 2018年6月15日
Duhyeon Bang,Hyunjung Shim
7+阅读 · 2018年1月28日
相关VIP内容
相关资讯
Hierarchically Structured Meta-learning
CreateAMind
13+阅读 · 2019年5月22日
Transferring Knowledge across Learning Processes
CreateAMind
8+阅读 · 2019年5月18日
深度自进化聚类:Deep Self-Evolution Clustering
我爱读PAMI
11+阅读 · 2019年4月13日
无监督元学习表示学习
CreateAMind
21+阅读 · 2019年1月4日
Disentangled的假设的探讨
CreateAMind
8+阅读 · 2018年12月10日
disentangled-representation-papers
CreateAMind
24+阅读 · 2018年9月12日
条件GAN重大改进!cGANs with Projection Discriminator
CreateAMind
6+阅读 · 2018年2月7日
Capsule Networks解析
机器学习研究会
10+阅读 · 2017年11月12日
Auto-Encoding GAN
CreateAMind
5+阅读 · 2017年8月4日
Top