Graph auto-encoders are widely used to construct graph representations in Euclidean vector spaces. However, it has been shown empirically that, on many tasks, linear models can outperform graph auto-encoders. In our work, we prove that the solution space induced by graph auto-encoders is a subset of the solution space of a linear map. This demonstrates that linear embedding models have at least the representational power of graph auto-encoders based on graph convolutional networks. So why are we still using nonlinear graph auto-encoders? One reason could be that actively restricting the linear solution space introduces an inductive bias that helps improve learning and generalization. While many researchers believe that the nonlinearity of the encoder is the critical ingredient towards this end, we instead identify the node features of the graph as a more powerful inductive bias. We give theoretical insights by introducing a corresponding bias in a linear model and analyzing the change in the solution space. Our experiments show that the linear encoder can outperform the nonlinear encoder when using feature information.
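To make the comparison concrete, the following is a minimal sketch of the two encoders being contrasted: a linear embedding model that applies a single linear map to the propagated node features, and a standard two-layer GCN-based graph auto-encoder with an inner-product decoder (in the style of Kipf and Welling's GAE). The graph sizes, random weights, and layer widths are illustrative only, not taken from the paper's experiments.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy graph: adjacency matrix A and node features X (illustrative sizes).
n, f, d = 6, 4, 2
A = (rng.random((n, n)) < 0.3).astype(float)
A = np.triu(A, 1)
A = A + A.T  # symmetric, undirected, no self-loops yet

# Symmetrically normalized adjacency with self-loops:
# A_hat = D^{-1/2} (A + I) D^{-1/2}
A_tilde = A + np.eye(n)
d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
A_hat = d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]

X = rng.standard_normal((n, f))

def linear_encoder(A_hat, X, W):
    """Linear embedding model: Z = A_hat X W (no nonlinearity)."""
    return A_hat @ X @ W

def gcn_encoder(A_hat, X, W0, W1):
    """Two-layer GCN encoder: Z = A_hat relu(A_hat X W0) W1."""
    H = np.maximum(A_hat @ X @ W0, 0.0)  # ReLU
    return A_hat @ H @ W1

def inner_product_decoder(Z):
    """Reconstructed edge probabilities: sigmoid(Z Z^T)."""
    return 1.0 / (1.0 + np.exp(-(Z @ Z.T)))

# Illustrative random weights; in practice these are trained by
# minimizing the reconstruction loss of the decoder.
W = rng.standard_normal((f, d))
W0 = rng.standard_normal((f, 8))
W1 = rng.standard_normal((8, d))

Z_lin = linear_encoder(A_hat, X, W)   # shape (6, 2)
Z_gcn = gcn_encoder(A_hat, X, W0, W1)  # shape (6, 2)
```

Both encoders map the same inputs to embeddings of the same dimension and feed the same decoder; the paper's subset result says that any reconstruction the GCN encoder can realize, some linear map can realize as well.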