Neural text decoding is important for generating high-quality texts using language models. To generate high-quality text, popular decoding algorithms like top-k, top-p (nucleus), and temperature-based sampling truncate or distort the unreliable low probability tail of the language model. Though these methods generate high-quality text after parameter tuning, they are ad hoc. Not much is known about the control they provide over the statistics of the output, which is important since recent reports show text quality is highest for a specific range of likelihoods. Here, first we provide a theoretical analysis of perplexity in top-k, top-p, and temperature sampling, finding that cross-entropy behaves approximately linearly as a function of p in top-p sampling whereas it is a nonlinear function of k in top-k sampling, under Zipfian statistics. We use this analysis to design a feedback-based adaptive top-k text decoding algorithm called mirostat that generates text (of any length) with a predetermined value of perplexity, and thereby high-quality text without any tuning. Experiments show that for low values of k and p in top-k and top-p sampling, perplexity drops significantly with generated text length, which is also correlated with excessive repetitions in the text (the boredom trap). On the other hand, for large values of k and p, we find that perplexity increases with generated text length, which is correlated with incoherence in the text (confusion trap). Mirostat avoids both traps: experiments show that cross-entropy has a near-linear relation with repetition in generated text. This relation is almost independent of the sampling method but slightly dependent on the model used. Hence, for a given language model, control over perplexity also gives control over repetitions. Experiments with human raters for fluency, coherence, and quality further verify our findings.
翻译:神经文本解码对于使用语言模型生成高质量文本非常重要。 为了生成高质量的文本, 使用语言模型生成高品质的文本、 流行的解码算法, 如顶部K、 顶部Pp( 核心核素), 以及基于温度的取样法, 或扭曲语言模型的不可靠的低概率尾部。 虽然这些方法在参数调整后产生高质量的文本, 但它们是临时性的。 我们使用这种分析来设计基于反馈的上部文本解码算法, 因为它在特定的可能性范围中显示文本质量最高。 首先, 我们提供对顶部K、 顶部P( 核心核素) 、 温度取样法和温度抽样算法的不直线性分析。 使用这种分析来设计一个基于反馈的上部文本解码算法, 以任何长度来生成文本( 直径直的), 并由此对上层文本进行高清晰度分析。 实验显示, 高端的正值在高端的正值中, 以高端的正值來显示 。