This work studies the widely adopted ancestral sampling algorithms for auto-regressive language models, which have received little systematic study in the literature. We use the quality-diversity (Q-D) trade-off to investigate three popular sampling algorithms (top-k, nucleus, and tempered sampling), focusing on the task of open-ended language generation. We first show that the existing sampling algorithms achieve similar performance. After carefully inspecting the transformations defined by the different sampling algorithms, we identify three key properties shared among them: entropy reduction, order preservation, and slope preservation. To validate the importance of these properties, we design two sets of new sampling algorithms: one in which every algorithm satisfies all three properties, and one in which every algorithm violates at least one property. Comparing their performance with that of existing sampling algorithms, we find that violating the identified properties can lead to drastic performance degradation, as measured by the Q-D trade-off. On the other hand, the set of sampling algorithms that satisfies all three properties performs on par with the existing sampling algorithms. Our data and code are available at https://github.com/moinnadeem/characterizing-sampling-algorithms
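The three sampling algorithms named above can each be viewed as a transformation applied to the model's next-token distribution before sampling. The following is a minimal sketch of the standard forms of these transformations (the function names and the example distribution are illustrative, not taken from the paper's code):

```python
import numpy as np

def temperature_transform(probs, t):
    """Tempered sampling: rescale log-probabilities by 1/t and renormalize.
    t < 1 sharpens the distribution (reduces entropy); t > 1 flattens it."""
    logits = np.log(probs) / t
    exp = np.exp(logits - logits.max())  # subtract max for numerical stability
    return exp / exp.sum()

def top_k_transform(probs, k):
    """Top-k sampling: keep only the k most probable tokens, renormalize."""
    out = np.zeros_like(probs)
    idx = np.argsort(probs)[-k:]  # indices of the k largest probabilities
    out[idx] = probs[idx]
    return out / out.sum()

def top_p_transform(probs, p):
    """Nucleus sampling: keep the smallest set of top tokens whose
    cumulative probability reaches at least p, renormalize."""
    order = np.argsort(probs)[::-1]           # tokens in descending probability
    cum = np.cumsum(probs[order])
    cutoff = np.searchsorted(cum, p) + 1      # smallest prefix with mass >= p
    out = np.zeros_like(probs)
    keep = order[:cutoff]
    out[keep] = probs[keep]
    return out / out.sum()

# A toy next-token distribution over a 5-token vocabulary.
probs = np.array([0.5, 0.3, 0.1, 0.07, 0.03])
```

All three transformations preserve the relative order of token probabilities and (for truncation or t < 1) reduce entropy, which is the kind of shared structure the identified properties formalize.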