Recent complementary strands of research have shown that encoding properties of the data source into embeddings can lead to performance increases when training a single model on heterogeneous data sources. However, it remains unclear in which situations these dataset embeddings are most effective, because they have been used in a wide variety of settings, languages, and tasks. Furthermore, it is usually assumed that gold information on the data source is available, and that the test data comes from a distribution seen during training. In this work, we compare the effect of dataset embeddings in monolingual settings, multilingual settings, and with predicted data source labels in a zero-shot setting. We evaluate on three morphosyntactic tasks (morphological tagging, lemmatization, and dependency parsing) using 104 datasets, 66 languages, and two different dataset grouping strategies. Performance increases are highest when the datasets share a language and the distribution from which each test instance is drawn is known. In contrast, in setups where the test data comes from an unseen distribution, the performance increase vanishes.
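To make the mechanism concrete, the following minimal sketch (our illustration, not the authors' implementation; all names, dimensions, and the BiLSTM architecture are assumptions) shows one common way to use a dataset embedding: a learned vector per data source, concatenated to every token representation before the encoder of a tagger trained on the concatenation of heterogeneous datasets.

```python
import torch
import torch.nn as nn

class DatasetEmbeddingTagger(nn.Module):
    """Illustrative sketch of a sequence tagger whose input is
    augmented with a learned embedding identifying the source
    dataset (e.g., a treebank). Not the paper's exact model."""

    def __init__(self, vocab_size, num_datasets, num_tags,
                 word_dim=100, dataset_dim=12, hidden_dim=128):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, word_dim)
        # One learned vector per data source; at test time the index
        # is either gold (known source) or predicted (zero-shot).
        self.dataset_emb = nn.Embedding(num_datasets, dataset_dim)
        self.encoder = nn.LSTM(word_dim + dataset_dim, hidden_dim,
                               batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden_dim, num_tags)

    def forward(self, word_ids, dataset_id):
        # word_ids: (batch, seq_len); dataset_id: (batch,)
        words = self.word_emb(word_ids)              # (batch, seq, word_dim)
        ds = self.dataset_emb(dataset_id)            # (batch, dataset_dim)
        # Broadcast the dataset vector to every token position.
        ds = ds.unsqueeze(1).expand(-1, words.size(1), -1)
        hidden, _ = self.encoder(torch.cat([words, ds], dim=-1))
        return self.out(hidden)                      # per-token tag scores

# Hypothetical usage: 104 data sources, as in the experiments above.
model = DatasetEmbeddingTagger(vocab_size=10000, num_datasets=104,
                               num_tags=20)
scores = model(torch.randint(0, 10000, (2, 8)), torch.tensor([3, 41]))
```

In the settings studied here, the `dataset_id` passed at test time is the crux: with gold labels the model conditions on the true source, while in the zero-shot setting an unseen-distribution instance must first be assigned a (predicted) source label, which is where the performance increase is found to vanish.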