Learning continuous representations of discrete objects such as text, users, movies, and URLs lies at the heart of many applications including language and user modeling. When using discrete objects as input to neural networks, we often ignore the underlying structures (e.g. natural groupings and similarities) and embed the objects independently into individual vectors. As a result, existing methods do not scale to large vocabulary sizes. In this paper, we design a simple and efficient embedding algorithm that learns a small set of anchor embeddings and a sparse transformation matrix. We call our method Anchor & Transform (ANT) as the embeddings of discrete objects are a sparse linear combination of the anchors, weighted according to the transformation matrix. ANT is scalable, flexible, and end-to-end trainable. We further provide a statistical interpretation of our algorithm as a Bayesian nonparametric prior for embeddings that encourages sparsity and leverages natural groupings among objects. By deriving an approximate inference algorithm based on Small Variance Asymptotics, we obtain a natural extension that automatically learns the optimal number of anchors instead of having to tune it as a hyperparameter. On text classification, language modeling, and movie recommendation benchmarks, we show that ANT is particularly suitable for large vocabulary sizes and demonstrates stronger performance with fewer parameters (up to 40x compression) as compared to existing compression baselines.
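To make the core construction concrete, the sketch below shows the "sparse linear combination of anchors" idea in PyTorch. The class name, the random initialization, and the L1 penalty used as a sparsity surrogate are illustrative assumptions for this sketch; they are not the paper's exact formulation, which instead derives sparsity from a Bayesian nonparametric prior and an inference procedure based on Small Variance Asymptotics.

```python
import torch
import torch.nn as nn


class AnchorTransformEmbedding(nn.Module):
    """Minimal sketch of the Anchor & Transform idea: the embedding of
    each discrete object is a sparse linear combination of a small set
    of anchor embeddings, E = T A. The L1 penalty below is a common
    sparsity surrogate and stands in for the paper's Bayesian
    nonparametric treatment."""

    def __init__(self, vocab_size, num_anchors, embed_dim, l1_weight=1e-4):
        super().__init__()
        # Anchor embeddings A: (num_anchors, embed_dim) -- small and dense.
        self.anchors = nn.Parameter(torch.randn(num_anchors, embed_dim) * 0.01)
        # Transformation matrix T: (vocab_size, num_anchors) -- trained
        # to be sparse so each object uses only a few anchors.
        self.transform = nn.Parameter(torch.randn(vocab_size, num_anchors) * 0.01)
        self.l1_weight = l1_weight

    def forward(self, ids):
        # Row i of T selects and weights the anchors for object i.
        weights = self.transform[ids]      # (..., num_anchors)
        return weights @ self.anchors      # (..., embed_dim)

    def sparsity_penalty(self):
        # Added to the task loss to encourage sparse rows of T.
        return self.l1_weight * self.transform.abs().sum()
```

If T is stored in a sparse format after training, the parameter count drops from |V| x d for a standard embedding table to |A| x d plus the nonzero entries of T, which is the source of the compression the abstract reports.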