Graph neural networks (GNNs) offer a promising approach to supervised learning over graph data. Graph data, especially when privacy-sensitive or too large to train on centrally, is often partitioned across disparate processing units (clients) that aim to minimize communication costs during collaborative training. The fully-distributed setup takes such partitioning to its extreme: the features of only a single node and its adjacent edges are kept locally with one client processor. Existing GNNs are not architected for training in such setups and incur prohibitive communication costs in them. We propose RETEXO, a novel transformation of existing GNNs that improves communication efficiency during training in the fully-distributed setup. We experimentally confirm that RETEXO offers up to 6 orders of magnitude better communication efficiency, even when training shallow GNNs, with a minimal trade-off in accuracy on supervised node classification tasks.