Prompted models have demonstrated impressive few-shot learning abilities. Repeated interactions at test-time with a single model, or the composition of multiple models together, further expands capabilities. These compositions are probabilistic models, and may be expressed in the language of graphical models with random variables whose values are complex data types such as strings. Cases with control flow and dynamic structure require techniques from probabilistic programming, which allow implementing disparate model structures and inference strategies in a unified language. We formalize several existing techniques from this perspective, including scratchpads / chain of thought, verifiers, STaR, selection-inference, and tool use. We refer to the resulting programs as language model cascades.
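As a minimal sketch of the idea (not the paper's own implementation), the following Python pseudocode expresses a chain-of-thought cascade as a probabilistic program: the thought and the answer are string-valued random variables, each sampled from a language model conditioned on the preceding variables. The `sample_lm` function and the prompt templates are hypothetical placeholders.

```python
from typing import Callable

def sample_lm(prompt: str) -> str:
    """Hypothetical stand-in for a language model sampler:
    returns one string drawn from P(text | prompt)."""
    raise NotImplementedError("plug in an actual language model here")

def chain_of_thought(question: str,
                     lm: Callable[[str], str] = sample_lm) -> str:
    # Random variable T (the thought / scratchpad): a string sampled
    # conditioned on the question.
    thought = lm(f"Q: {question}\nLet's think step by step.\n")
    # Random variable A (the answer): a string sampled conditioned on
    # both the question and the sampled thought.
    answer = lm(f"Q: {question}\nThought: {thought}\nA: ")
    return answer
```

Other techniques mentioned above fit the same pattern: a verifier, for instance, can be read as conditioning (or rejection sampling) on an additional model-scored acceptance variable, and tool use as a deterministic node inserted between sampled string variables.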