We propose a new framework for computing the embeddings of large-scale graphs on a single machine. A graph embedding is a fixed-length vector representation for each node (and/or edge-type) in a graph and has emerged as the de facto approach to apply modern machine learning on graphs. We identify that current systems for learning the embeddings of large-scale graphs are bottlenecked by data movement, which results in poor resource utilization and inefficient training. These limitations require state-of-the-art systems to distribute training across multiple machines. We propose Gaius, a system for efficient training of graph embeddings that leverages partition caching and buffer-aware data orderings to minimize disk access, and that interleaves data movement with computation to maximize utilization. We compare Gaius against two state-of-the-art industrial systems on a diverse array of benchmarks. We demonstrate that Gaius achieves the same level of accuracy but is up to one order of magnitude faster. We also show that Gaius can scale training to datasets an order of magnitude beyond a single machine's GPU and CPU memory capacity, enabling training of configurations with more than a billion edges and 550 GB of total parameters on a single AWS P3.2xLarge instance.
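To make the buffer-aware ordering idea concrete, the following Python sketch shows one way such an ordering could work; it is a hypothetical illustration under our own assumptions, not Gaius's actual algorithm or API, and all names in it (buffer_aware_order, num_partitions, buffer_size) are ours. Edges are bucketed by the partitions of their endpoints, node partitions are grouped into blocks that fill half of a fixed-size in-memory buffer, and edge buckets are ordered so that consecutive buckets reuse the partitions already resident in the buffer, which is the kind of ordering that cuts down on disk reads.

    # Hypothetical sketch of a buffer-aware ordering of edge buckets.
    # An edge bucket (i, j) holds edges whose source node lies in
    # partition i and whose destination node lies in partition j.
    from itertools import combinations_with_replacement

    def buffer_aware_order(num_partitions: int, buffer_size: int):
        """Yield every edge bucket (i, j) with i <= j, grouped so that each
        group of buckets touches only partitions that jointly fit in a
        buffer holding `buffer_size` partitions in memory."""
        half = max(1, buffer_size // 2)
        # Group partitions into blocks of size `half`, so two blocks
        # (one "source" block, one "destination" block) fill the buffer.
        blocks = [list(range(s, min(s + half, num_partitions)))
                  for s in range(0, num_partitions, half)]
        seen = set()
        for bi, bj in combinations_with_replacement(range(len(blocks)), 2):
            # Partitions currently resident in the buffer for this block pair.
            resident = sorted(set(blocks[bi] + blocks[bj]))
            for pair in combinations_with_replacement(resident, 2):
                if pair not in seen:
                    seen.add(pair)
                    yield pair

    # Example: 8 partitions, buffer of 4 -> all 36 edge buckets, ordered so
    # that consecutive buckets share the partitions already in the buffer.
    print(list(buffer_aware_order(8, 4)))

Under this blocked ordering, a naive scheme that reloads both partitions for every bucket would perform roughly two loads per bucket, whereas here each pair of blocks is made resident once and all of its not-yet-processed buckets are drained before any swap, which is the disk-access saving the abstract alludes to.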