Neural language models typically tokenise input text into sub-word units to achieve an open vocabulary. The standard approach is to use a single canonical tokenisation at both train and test time. We suggest that this approach is unsatisfactory and may bottleneck our evaluation of language model performance. Using only the one-best tokenisation ignores tokeniser uncertainty over alternative tokenisations, which may hurt a model's out-of-domain performance. In this paper, we argue that language models should instead be evaluated on their marginal likelihood over tokenisations. We compare different sampling-based estimators for the marginal likelihood, and show that it is feasible to estimate the marginal likelihood with a manageable number of samples. We then evaluate pretrained English and German language models on both the one-best-tokenisation and marginal perplexities, and show that the marginal perplexity can be significantly better than the one-best perplexity, especially on out-of-domain data. We link this difference in perplexity to tokeniser uncertainty as measured by tokeniser entropy. We discuss some implications of our results for language model training and evaluation, particularly with regard to tokenisation robustness.
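To make the sampling-based estimation concrete, the following is a minimal Python sketch of an importance-sampling estimator for the marginal log-likelihood, assuming N tokenisations s_i of a text x have been sampled from some proposal tokeniser q(s | x) (e.g. a stochastic subword sampler) and each joint score log p(x, s_i) has been obtained from the language model. The function name and the illustrative numbers are ours, not the paper's implementation.

```python
import math

def log_marginal_estimate(log_p_joint, log_q):
    """Importance-sampling estimate of log p(x) = log sum_s p(x, s).

    log_p_joint[i]: log p(x, s_i) from the language model for the
                    i-th sampled tokenisation s_i.
    log_q[i]:       log q(s_i | x) under the proposal tokeniser the
                    samples were drawn from.

    Returns logsumexp_i(log_p_joint[i] - log_q[i]) - log N. The
    corresponding estimator of p(x) is unbiased in probability space;
    by Jensen's inequality, its log is a lower bound in expectation.
    """
    assert len(log_p_joint) == len(log_q) > 0
    ratios = [lp - lq for lp, lq in zip(log_p_joint, log_q)]
    m = max(ratios)  # shift by the max to stabilise the exponentials
    return m + math.log(sum(math.exp(r - m) for r in ratios)) - math.log(len(ratios))

# Example: three sampled tokenisations of the same sentence
# (illustrative log-probabilities, not real model outputs).
log_p = [-23.1, -25.4, -24.0]   # log p(x, s_i) from the LM
log_q = [-1.2, -2.9, -2.1]      # log q(s_i | x) from the sampler
print(log_marginal_estimate(log_p, log_q))
```

Because sampled tokenisations differ in length, converting such an estimate into a perplexity requires normalising by a tokenisation-independent unit, such as the number of words or characters in x, rather than by the number of sub-word tokens.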