How do large language models (LLMs) develop and evolve over the course of training? How do these patterns change as models scale? To answer these questions, we introduce \textit{Pythia}, a suite of 16 LLMs all trained on public data seen in the exact same order and ranging in size from 70M to 12B parameters. We provide public access to 154 checkpoints for each of the 16 models, alongside tools to download and reconstruct their exact training dataloaders for further study. We intend \textit{Pythia} to facilitate research in many areas, and we present several case studies including novel results in memorization, term frequency effects on few-shot performance, and reducing gender bias. We demonstrate that this highly controlled setup can be used to yield novel insights into LLMs and their training dynamics. Trained models, analysis code, training code, and training data can be found at \url{https://github.com/EleutherAI/pythia}.
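As an illustrative sketch (not wording from the original abstract): checkpoints of this kind are typically retrieved by pinning a revision on the Hugging Face Hub. The snippet below assumes a model such as \texttt{EleutherAI/pythia-70m} hosted with per-step revisions (e.g. \texttt{step3000}) and the \texttt{transformers} library; adjust the model name and revision as needed.

\begin{verbatim}
# Minimal sketch: load one Pythia checkpoint at a specific training step.
# Assumes the model is hosted on the Hugging Face Hub with per-step
# revisions (e.g. "step3000"); model name and step are placeholders.
from transformers import AutoTokenizer, GPTNeoXForCausalLM

model = GPTNeoXForCausalLM.from_pretrained(
    "EleutherAI/pythia-70m",
    revision="step3000",   # one of the saved intermediate checkpoints
)
tokenizer = AutoTokenizer.from_pretrained(
    "EleutherAI/pythia-70m",
    revision="step3000",
)

# Quick sanity check: generate a few tokens from the chosen checkpoint.
inputs = tokenizer("Hello, I am", return_tensors="pt")
tokens = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(tokens[0]))
\end{verbatim}

Pinning the revision rather than loading the final weights is what lets the same analysis code be rerun across training steps and model sizes.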