As Large Language Models (LLMs) become integral to human-centered applications, understanding their personality-like behaviors is increasingly important for responsible development and deployment. This paper systematically evaluates six LLMs using the Big Five Inventory-2 (BFI-2) framework to assess how trait expression varies with sampling temperature. We find significant differences across four of the five personality dimensions, with Neuroticism and Extraversion being susceptible to temperature adjustments. Further, hierarchical clustering reveals distinct model clusters, suggesting that architectural features may predispose certain models toward stable trait profiles. Taken together, these results offer new insights into the emergence of personality-like patterns in LLMs and provide a new perspective on model tuning, selection, and the ethical governance of AI systems. The data and code for this analysis are available at: https://osf.io/bsvzc/?view_only=6672219bede24b4e875097426dc3fac1
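For readers less familiar with sampling temperature, the following is a minimal sketch of how temperature reshapes a model's output distribution before a response is sampled. It is an illustration only, not the code used in this study: the function name `softmax_with_temperature` and the logit values are hypothetical, standing in for a model's scores over the five Likert response options of a single questionnaire item.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum for numerical stability; the result is unchanged.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for Likert options 1..5 on one item (assumed values).
logits = [0.5, 1.0, 2.0, 1.5, 0.2]

low = softmax_with_temperature(logits, 0.2)   # sharper: mass concentrates on the top option
high = softmax_with_temperature(logits, 1.5)  # flatter: other options become more likely

for name, probs in (("T=0.2", low), ("T=1.5", high)):
    print(name, [round(p, 3) for p in probs])
```

At a low temperature the highest-scoring option dominates, so repeated administrations of an item tend to return the same answer; at a higher temperature the distribution flattens and responses vary more, which is one plausible route by which temperature could shift measured trait scores.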