The current success of Graph Neural Networks (GNNs) usually relies on loading the entire attributed graph into memory for processing, which may not be feasible with limited memory resources, especially when the attributed graph is large. This paper pioneers a Binary Graph Convolutional Network (Bi-GCN), which binarizes both the network parameters and the input node attributes and replaces floating-point matrix multiplications with binary operations for network compression and acceleration. Meanwhile, we propose a new gradient-approximation-based back-propagation method to effectively train our Bi-GCN. According to our theoretical analysis, Bi-GCN reduces the memory consumption of both the network parameters and the input data by an average of ~31x, and accelerates inference by an average of ~51x, on three citation networks, i.e., Cora, PubMed, and CiteSeer. In addition, we introduce a general approach for extending our binarization method to other GNN variants, achieving similar efficiency gains. Although the proposed Bi-GCN and Bi-GNNs are simple yet efficient, these compressed networks may suffer from a potential capacity problem, i.e., they may not possess enough capacity to learn adequate representations for specific tasks. To tackle this capacity problem, we propose an Entropy Cover Hypothesis to predict the lower bound on the width of the Bi-GNN hidden layers. Extensive experiments demonstrate that our Bi-GCN and Bi-GNNs achieve performance comparable to the corresponding full-precision baselines on seven node-classification datasets, and verify the effectiveness of our Entropy Cover Hypothesis in addressing the capacity problem.
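To make the binarization idea described above concrete, the following is a minimal, hypothetical PyTorch-style sketch: node features and weights are binarized with sign() in the forward pass (so the multiplication could, in principle, be realized with XNOR/popcount kernels), and gradients flow through a straight-through estimator as a stand-in for the paper's gradient-approximation scheme. The class names (BinarizeSTE, BiGraphConv) and the XNOR-Net-style scaling factors are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn


class BinarizeSTE(torch.autograd.Function):
    """sign() in the forward pass; straight-through gradient in the backward pass."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # Pass the gradient through only where |x| <= 1 (a common STE clipping rule;
        # the paper's exact gradient approximation may differ).
        return grad_output * (x.abs() <= 1).float()


class BiGraphConv(nn.Module):
    """One binarized graph-convolution layer: A_hat @ sign(X) @ sign(W), rescaled."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(in_dim, out_dim) * 0.01)

    def forward(self, a_hat, x):
        x_b = BinarizeSTE.apply(x)            # binarized node features in {-1, +1}
        w_b = BinarizeSTE.apply(self.weight)  # binarized weights in {-1, +1}
        # Per-row / per-column scaling factors recover some magnitude information
        # (an XNOR-Net-style choice, used here purely for illustration).
        alpha = x.abs().mean(dim=1, keepdim=True)
        beta = self.weight.abs().mean(dim=0, keepdim=True)
        return a_hat @ (x_b @ w_b) * alpha * beta
```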