The unigram distribution is the non-contextual probability of finding a specific word form in a corpus. While of central importance to the study of language, it is commonly approximated by each word's sample frequency in the corpus. This approach, being highly dependent on sample size, assigns zero probability to any out-of-vocabulary (oov) word form. As a result, it produces negatively biased probabilities for oov word forms and positively biased probabilities for in-corpus words. In this work, we argue in favor of properly modeling the unigram distribution, claiming it should be a central task in natural language processing. With this in mind, we present a novel model for estimating it in a language (a neuralization of Goldwater et al.'s (2011) model) and show that, across a diverse set of 7 languages, it produces much better estimates than the na\"ive use of neural character-level language models.
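To make the bias concrete, here is a minimal Python sketch of the sample-frequency estimate on a toy corpus (illustrative only, not code from the paper): every oov word form receives zero probability, so the probability mass it should hold is redistributed to in-corpus words.

\begin{verbatim}
from collections import Counter

# Toy corpus: the naive unigram estimate is each word's
# relative frequency in the sample.
corpus = "the cat sat on the mat and the cat ran".split()
counts = Counter(corpus)
total = sum(counts.values())

def sample_frequency(word):
    # Maximum-likelihood (sample-frequency) unigram estimate.
    return counts[word] / total

print(sample_frequency("cat"))  # 0.2: positively biased by the tiny sample
print(sample_frequency("dog"))  # 0.0: any oov form gets zero probability
\end{verbatim}

A proper model of the unigram distribution must instead reserve probability mass for unseen word forms, e.g., by scoring them with a generator over character sequences.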