April 12, 2021

SuperSim: a test set for word similarity and relatedness in Swedish

Simon Hengchen, Nina Tahmasebi

From arXiv. Accepted at NoDaLiDa 2021.

Language models are notoriously difficult to evaluate. We release SuperSim, a large-scale similarity and relatedness test set for Swedish built with expert human judgments. The test set is composed of 1,360 word-pairs independently judged for both relatedness and similarity by five annotators. We evaluate three different models (Word2Vec, fastText, and GloVe) trained on two separate Swedish datasets, namely the Swedish Gigaword corpus and a Swedish Wikipedia dump, to provide a baseline for future comparison. We release the fully annotated test set, code, baseline models, and data.
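A common way to score embedding models against a word-pair test set of this kind is to correlate the model's cosine similarities with the human judgments, typically with Spearman's rank correlation. The sketch below illustrates that general idea only; the abstract does not say which metric the authors' released code uses, so treating Spearman correlation and per-pair averaging of the five annotators as the procedure is an assumption. The Swedish word pairs, the 2-dimensional vectors, and the annotator scores are all invented toy values, not SuperSim data.

```python
import math
from statistics import mean


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy embeddings (not taken from any released model).
toy_vectors = {
    "katt": [1.0, 0.0],
    "hund": [0.9, 0.1],
    "bil": [0.0, 1.0],
    "väg": [0.2, 0.9],
}

# Hypothetical pairs, each with five made-up annotator scores.
toy_pairs = [
    ("katt", "hund", [8, 7, 9, 8, 8]),
    ("bil", "väg", [6, 5, 6, 7, 6]),
    ("katt", "bil", [1, 0, 1, 2, 1]),
]

# Assumption: the gold score for a pair is the mean of its annotator scores.
gold = [mean(scores) for _, _, scores in toy_pairs]
predicted = [cosine(toy_vectors[a], toy_vectors[b]) for a, b, _ in toy_pairs]

rho = spearman(predicted, gold)
print(f"Spearman rho on toy data: {rho:.3f}")
```

Because the test set carries separate similarity and relatedness judgments, the same computation would be run once per judgment type, yielding two scores per model.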
