Semantic feature norms, lists of the features that concepts do and do not possess, have played a central role in characterizing human conceptual knowledge, but collecting them requires extensive human labor. Large language models (LLMs) offer a novel avenue for generating such feature lists automatically, but they are prone to significant error. Here, we present a new method that combines a model of human lexical semantics, learned from limited data, with LLM-generated data to efficiently generate high-quality feature norms.