How do language models "think"? This paper formulates a probabilistic cognitive model, the bounded pragmatic speaker, which can characterize the operation of different variants of language models. In particular, we show that large language models fine-tuned with reinforcement learning from human feedback (Ouyang et al., 2022) implement a model of thought that conceptually resembles the fast-and-slow model of Kahneman (2011). We discuss the limitations of reinforcement learning from human feedback as a fast-and-slow model of thought and propose directions for extending this framework. Overall, our work demonstrates that viewing language models through the lens of probabilistic cognitive modeling can offer valuable insights for understanding, evaluating, and developing them.
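As one concrete way to see this correspondence (a sketch based on the standard closed-form optimum of the KL-regularized RLHF objective; the symbols $\pi_{\mathrm{ref}}$, $R$, and $\beta$ are illustrative notation, not necessarily the paper's), the optimal RLHF policy reweights the pretrained base model by an exponentiated reward:

$$
\pi^{*}(y \mid x) \;\propto\; \pi_{\mathrm{ref}}(y \mid x)\,\exp\!\left(\tfrac{1}{\beta}\, R(x, y)\right)
$$

Under a fast-and-slow reading, the pretrained base model $\pi_{\mathrm{ref}}$ supplies fast, habitual generation, while the reward term plays the role of a slower, deliberative signal that reweights those habits.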