Mikolov et al. (2013a) observed that continuous bag-of-words (CBOW) word embeddings tend to underperform Skip-gram (SG) embeddings, and this finding has been echoed in subsequent works. We find that these observations are driven not by fundamental differences in their training objectives, but more likely by faulty negative-sampling CBOW implementations in popular libraries, including the official implementation, word2vec.c, and Gensim. We show that after correcting a bug in the CBOW gradient update, one can learn CBOW word embeddings that are fully competitive with SG on various intrinsic and extrinsic tasks, while being many times faster to train.
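The bug in question concerns how the gradient is propagated back to the context vectors. Since the CBOW hidden layer is the *mean* of the context embeddings, each context vector should receive the backpropagated gradient divided by the context size; the faulty implementations apply the full gradient to every context word. The following is a minimal sketch of one corrected negative-sampling CBOW step, assuming NumPy and hypothetical names (`W_in`, `W_out`, `cbow_update`) not taken from the paper:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cbow_update(W_in, W_out, context_ids, center_id, negative_ids, lr=0.025):
    """One corrected CBOW negative-sampling update (illustrative sketch).

    W_in:  (V, d) input/context embedding matrix
    W_out: (V, d) output embedding matrix
    """
    C = len(context_ids)
    h = W_in[context_ids].mean(axis=0)  # hidden layer: mean of context vectors

    grad_h = np.zeros_like(h)
    for word_id, label in [(center_id, 1.0)] + [(n, 0.0) for n in negative_ids]:
        score = sigmoid(W_out[word_id] @ h)
        g = lr * (label - score)         # scaled gradient of the log-likelihood
        grad_h += g * W_out[word_id]
        W_out[word_id] += g * h

    # The fix: h is the mean of C context vectors, so each context vector
    # receives grad_h / C. The buggy implementations add the full grad_h
    # to every context vector, omitting the division by C.
    for cid in context_ids:
        W_in[cid] += grad_h / C
```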