Recent experimental results show that large language models (LLMs) exhibit emergent abilities that are absent in small models: system performance improves sharply once scale passes a certain critical threshold. In this letter, we provide a simple explanation for this phase transition phenomenon. To do so, we model an LLM as a sequence-to-sequence random function. Instead of generating output instantly at each step, we use a list decoder that keeps a list of candidate sequences at each step and defers generation of the output sequence until the end. We show that there is a critical threshold such that the expected number of erroneous candidate sequences remains bounded when an LLM is below the threshold and grows exponentially when an LLM is above it. This threshold is related to the basic reproduction number of a contagious disease.
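The bounded-versus-exponential dichotomy stated in the abstract can be illustrated with a toy recursion. This is only a sketch under assumptions that are not taken from the letter: we suppose that each erroneous candidate on the list spawns on average `r` erroneous candidates at the next step (playing the role of a reproduction number), and that `c` new erroneous candidates enter the list per step. The names `r`, `c`, and `expected_erroneous` are hypothetical.

```python
def expected_erroneous(r, c, steps):
    """Iterate E[t+1] = r * E[t] + c from E[0] = 0.

    r: assumed mean number of erroneous candidates spawned per erroneous
       candidate per step (hypothetical reproduction number).
    c: assumed mean number of new erroneous candidates entering per step.
    Returns the list [E[0], E[1], ..., E[steps]].
    """
    counts = [0.0]
    for _ in range(steps):
        counts.append(r * counts[-1] + c)
    return counts


# Below the critical value r = 1 the expectation stays under c / (1 - r).
below = expected_erroneous(r=0.5, c=1.0, steps=50)

# Above r = 1 the expectation grows geometrically with the step index.
above = expected_erroneous(r=1.5, c=1.0, steps=50)

print(f"r=0.5: E[50] = {below[-1]:.6f}")
print(f"r=1.5: E[50] = {above[-1]:.3e}")
```

In this toy model the critical value is `r = 1`, mirroring the role of the basic reproduction number in epidemic models. The threshold in the letter itself is defined in terms of the LLM model and may take a different form.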