The increasingly widespread adoption of large language models has highlighted the need to improve their explainability. We present context length probing, a novel explanation technique for causal language models. It tracks a model's predictions as a function of the length of the available context, which makes it possible to assign differential importance scores to different contexts. The technique is model-agnostic and does not rely on access to model internals beyond computing token-level probabilities. We apply context length probing to large pre-trained language models and offer some initial analyses and insights, including the potential for studying long-range dependencies. The source code and a demo of the method are available.
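As an illustration of the idea, the following is a minimal sketch and not the released implementation. It assumes that the differential importance score of a context token can be read as the change in the target token's log-probability when the context is extended by that one token. The names `toy_log_prob` and `context_length_probing` are hypothetical, and the toy scoring function merely stands in for a real causal language model, which would supply token-level probabilities instead.

```python
import math
from typing import Callable, List, Sequence, Tuple

# Any function returning log p(target | context) can be plugged in here.
LogProbFn = Callable[[Sequence[str], str], float]


def toy_log_prob(context: Sequence[str], target: str) -> float:
    """Hypothetical stand-in for a causal LM (assumption, not a real model).

    It predicts "Paris" confidently only when "France" is in the context;
    otherwise it falls back to a uniform guess over a 10-word vocabulary.
    """
    if target == "Paris" and "France" in context:
        return math.log(0.9)
    return math.log(1.0 / 10)


def context_length_probing(
    tokens: Sequence[str],
    target_index: int,
    log_prob: LogProbFn,
    max_context: int,
) -> Tuple[List[float], List[float]]:
    """Track log p(target) as the context grows one token at a time.

    Returns (log_probs, scores), where log_probs[c] uses the last c tokens
    before the target, and scores[c - 1] is the gain from adding the token
    at position target_index - c (assumed reading of "differential
    importance"). Context length 0 means an empty context; a real model
    would typically need a start-of-sequence token instead.
    """
    target = tokens[target_index]
    max_context = min(max_context, target_index)
    log_probs = [
        log_prob(tokens[target_index - c:target_index], target)
        for c in range(max_context + 1)
    ]
    scores = [log_probs[c] - log_probs[c - 1] for c in range(1, max_context + 1)]
    return log_probs, scores


tokens = ["The", "capital", "of", "France", "is", "Paris"]
log_probs, scores = context_length_probing(tokens, 5, toy_log_prob, max_context=5)
# scores[0] belongs to "is", scores[1] to "France", and so on backwards.
print([round(s, 3) for s in scores])  # [0.0, 2.197, 0.0, 0.0, 0.0]
```

In this toy setting only the step that brings "France" into the context changes the prediction, so that token receives the only non-zero score. Because the procedure needs nothing but log-probabilities, the scoring function can be swapped for any causal language model without touching the probing code.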