Current evaluation metrics for language modeling and generation rely heavily on the accuracy of predicted (or generated) words as compared to a reference ground truth. While important, token-level accuracy captures only one aspect of a language model's behavior, and ignores linguistic properties of words that may allow some mis-predicted tokens to be useful in practice. Furthermore, statistics directly tied to prediction accuracy (including perplexity) may be confounded by the Zipfian nature of written language, since the majority of prediction attempts involve frequently occurring types. A model's performance may vary greatly between high- and low-frequency words, which in practice can lead to failure modes such as repetitive and dull text produced by downstream consumers of a language model. To address this, we propose two new intrinsic evaluation measures, framed within a simple word prediction task, that are designed to give a more holistic picture of a language model's performance. We evaluate several commonly used large English language models using our proposed metrics, and demonstrate that our approach reveals functional differences in performance between the models that are obscured by more traditional metrics.
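The Zipfian confound described above can be made concrete with a frequency-stratified accuracy check. The sketch below is purely illustrative and is not one of the measures proposed here: the helper `stratified_accuracy`, its rank-based binning scheme, and the toy reference/prediction data are all assumptions introduced for demonstration. It shows how an aggregate token-level score can look strong even when accuracy on low-frequency words is poor.

```python
from collections import Counter

def stratified_accuracy(references, predictions, n_bins=3):
    """Token-level accuracy overall and per frequency band.

    Bands are assigned by the rank of each reference type in the reference
    corpus (band 0 = most frequent types), so the report exposes how much of
    the aggregate score is carried by high-frequency words.
    """
    assert len(references) == len(predictions)
    freqs = Counter(references)
    # Rank types by frequency and split the ranked list into n_bins bands.
    ranked = [t for t, _ in freqs.most_common()]
    band_of = {t: min(i * n_bins // len(ranked), n_bins - 1)
               for i, t in enumerate(ranked)}
    hits, totals = [0] * n_bins, [0] * n_bins
    for ref, pred in zip(references, predictions):
        b = band_of[ref]
        totals[b] += 1
        hits[b] += int(ref == pred)
    overall = sum(hits) / len(references)
    per_band = [h / t if t else float("nan") for h, t in zip(hits, totals)]
    return overall, per_band

# Toy usage: most tokens are the high-frequency type "the", which the
# hypothetical model always predicts correctly, while rarer words are
# mostly missed.
refs  = ["the"] * 8 + ["cat", "sat", "mat", "quixotic"]
preds = ["the"] * 8 + ["cat", "the", "the", "the"]
print(stratified_accuracy(refs, preds))
# Overall accuracy is 0.75, even though accuracy in the two
# lower-frequency bands is 0.0.
```

Stratifying by frequency band rather than reporting a single corpus-level number is one simple way to surface the high- versus low-frequency performance gap that aggregate accuracy and perplexity obscure.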