Scaling up language models has been shown to predictably improve performance and sample efficiency on a wide range of downstream tasks. This paper instead discusses an unpredictable phenomenon that we refer to as emergent abilities of large language models. We consider an ability to be emergent if it is not present in smaller models but is present in larger models. Thus, emergent abilities cannot be predicted simply by extrapolating the performance of smaller models. The existence of such emergence implies that additional scaling could further expand the range of capabilities of language models.