We propose a new framework for computing the embeddings of large-scale graphs on a single machine. A graph embedding is a fixed-length vector representation for each node (and/or edge-type) in a graph and has emerged as the de facto approach to apply modern machine learning on graphs. We identify that current systems for learning the embeddings of large-scale graphs are bottlenecked by data movement, which results in poor resource utilization and inefficient training. These limitations require state-of-the-art systems to distribute training across multiple machines. We propose Gaius, a system for efficient training of graph embeddings that leverages partition caching and buffer-aware data orderings to minimize disk access, and that interleaves data movement with computation to maximize utilization. We compare Gaius against two state-of-the-art industrial systems on a diverse array of benchmarks. We demonstrate that Gaius achieves the same level of accuracy but is up to one order of magnitude faster. We also show that Gaius can scale training to datasets an order of magnitude beyond a single machine's GPU and CPU memory capacity, enabling training of configurations with more than a billion edges and 550 GB of total parameters on a single AWS P3.2xLarge instance.
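To make the buffer-aware ordering idea concrete, the following Python sketch shows one way such an ordering could work; it is a hypothetical illustration under our own assumptions, not Gaius's actual algorithm or API, and all names in it (buffer_aware_order, num_partitions, buffer_size) are ours. Edges are bucketed by the partitions of their endpoints, node partitions are grouped into blocks that fill half of a fixed-size in-memory buffer, and edge buckets are ordered so that consecutive buckets reuse the partitions already resident in the buffer, which is the kind of ordering that cuts down on disk reads.

    # Hypothetical sketch of a buffer-aware ordering of edge buckets.
    # An edge bucket (i, j) holds edges whose source node lies in
    # partition i and whose destination node lies in partition j.
    from itertools import combinations_with_replacement

    def buffer_aware_order(num_partitions: int, buffer_size: int):
        """Yield every edge bucket (i, j) with i <= j, grouped so that each
        group of buckets touches only partitions that jointly fit in a
        buffer holding `buffer_size` partitions in memory."""
        half = max(1, buffer_size // 2)
        # Group partitions into blocks of size `half`, so two blocks
        # (one "source" block, one "destination" block) fill the buffer.
        blocks = [list(range(s, min(s + half, num_partitions)))
                  for s in range(0, num_partitions, half)]
        seen = set()
        for bi, bj in combinations_with_replacement(range(len(blocks)), 2):
            # Partitions currently resident in the buffer for this block pair.
            resident = sorted(set(blocks[bi] + blocks[bj]))
            for pair in combinations_with_replacement(resident, 2):
                if pair not in seen:
                    seen.add(pair)
                    yield pair

    # Example: 8 partitions, buffer of 4 -> all 36 edge buckets, ordered so
    # that consecutive buckets share the partitions already in the buffer.
    print(list(buffer_aware_order(8, 4)))

Under this blocked ordering, a naive scheme that reloads both partitions for every bucket would perform roughly two loads per bucket, whereas here each pair of blocks is made resident once and all of its not-yet-processed buckets are drained before any swap, which is the disk-access saving the abstract alludes to.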