Large Language Models are affected by the phenomena of memorizing and forgetting their training data. But how do these phenomena vary with model size? We work towards this question by investigating how model size affects a model's ability to discriminate a word's meaning in a given context. We introduce a dataset called DeltaWords, which evaluates a model's ability to follow instructions and select the sentence in which a target word has been replaced by its antonym. We show a weak inverse scaling trend, where task accuracy degrades as model size increases, under extremely few-shot prompting regimes. We also show that increasing the number of in-context examples tends to benefit larger models disproportionately more than smaller ones.