Although large language models (LLMs) have demonstrated impressive potential on simple tasks, their breadth of scope, lack of transparency, and insufficient controllability can make them less effective when assisting humans on more complex tasks. In response, we introduce the concept of Chaining LLM steps together, where the output of one step becomes the input for the next, thus aggregating the gains per step. We first define a set of LLM primitive operations useful for Chain construction, then present an interactive system where users can modify these Chains, along with their intermediate results, in a modular way. In a 20-person user study, we found that Chaining not only improved the quality of task outcomes, but also significantly enhanced system transparency, controllability, and sense of collaboration. Additionally, we saw that users developed new ways of interacting with LLMs through Chains: they leveraged sub-tasks to calibrate model expectations, compared and contrasted alternative strategies by observing parallel downstream effects, and debugged unexpected model outputs by "unit-testing" sub-components of a Chain. In two case studies, we further explore how LLM Chains may be used in future applications.
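To make the chaining idea concrete, here is a minimal sketch of the core mechanism the abstract describes: a Chain of primitive LLM steps where each step's output feeds the next, with intermediate results exposed for inspection. This is not the paper's actual system (which is interactive and visual); the `llm` function is a hypothetical stand-in for a real model call, and the three-step example task is purely illustrative.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical stand-in for a real LLM invocation (e.g., an API client).
# Replace the body with an actual model call in practice.
def llm(prompt: str) -> str:
    return f"<model output for: {prompt!r}>"

@dataclass
class Step:
    """One primitive operation in a Chain: a prompt template applied to its input."""
    name: str
    template: str  # must contain "{input}", filled with the previous step's output

    def run(self, input_text: str) -> str:
        return llm(self.template.format(input=input_text))

def run_chain(steps: List[Step], initial_input: str) -> str:
    """Feed each step's output into the next, aggregating the gains per step."""
    text = initial_input
    for step in steps:
        text = step.run(text)
        # Surfacing intermediate results is what enables the "unit-testing"
        # and debugging behaviors the study observed.
        print(f"[{step.name}] -> {text}")
    return text

# Illustrative three-step Chain (hypothetical task, not from the paper).
chain = [
    Step("split", "Split this feedback into individual points:\n{input}"),
    Step("ideate", "For each point, suggest one concrete improvement:\n{input}"),
    Step("compose", "Compose the suggestions into a friendly paragraph:\n{input}"),
]
result = run_chain(chain, "The draft is too long and the tone is harsh.")
```

Because each `Step` is modular, a user can edit one prompt template or one intermediate result and rerun only the downstream steps, which is the source of the transparency and controllability gains the study reports.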