This paper presents a set of experiments to evaluate and compare the performance of CBOW Word2Vec and Lemma2Vec models for Arabic Word-in-Context (WiC) disambiguation without using sense inventories or sense embeddings. As part of the SemEval-2021 Shared Task 2 on WiC disambiguation, we used the dev.ar-ar dataset (2k sentence pairs) to decide whether two words in a given sentence pair carry the same meaning. We used two Word2Vec models: Wiki-CBOW, a model pre-trained on Arabic Wikipedia, and another model we trained on large Arabic corpora of about 3 billion tokens. Two Lemma2Vec models were also constructed based on the two Word2Vec models. Each of the four models was then used in the WiC disambiguation task and evaluated on the SemEval-2021 test.ar-ar dataset. Finally, we report the performance of the different models and compare the use of lemma-based and word-based models.
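The core disambiguation step — deciding whether two occurrences of a word carry the same meaning by comparing their vector representations — can be sketched as follows. This is a minimal illustration, not the paper's actual procedure: the toy vectors, the similarity threshold, and the function names are all assumptions standing in for lookups into a real pre-trained Word2Vec or Lemma2Vec model.

```python
import numpy as np

# Hypothetical toy vectors standing in for context representations of a
# target word in two sentences; in practice these would be derived from
# a pre-trained CBOW Word2Vec (or Lemma2Vec) model.
CTX = {
    "occurrence_1": np.array([0.9, 0.1, 0.0]),
    "occurrence_2": np.array([0.1, 0.9, 0.2]),
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def same_sense(vec1, vec2, threshold=0.5):
    """Label a sentence pair True (same meaning) when the similarity of
    the two context representations exceeds a threshold; the threshold
    value here is an illustrative assumption."""
    return cosine(vec1, vec2) >= threshold

print(same_sense(CTX["occurrence_1"], CTX["occurrence_2"]))  # dissimilar vectors -> False
```

A lemma-based variant would apply the same comparison after mapping each word to its lemma and looking it up in a Lemma2Vec model instead.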