Psychology research has long explored aspects of human personality such as extroversion, agreeableness, and emotional stability. Categorizations like the `Big Five' personality traits are commonly used to assess and diagnose personality types. In this work, we explore whether the perceived personality of language models is exhibited consistently in their language generation. For example, is a language model such as GPT2 likely to respond in a consistent way when asked whether it would go out to a party? We also investigate whether such personality traits can be controlled. We show that when provided with different types of contexts (such as personality descriptions, or answers to diagnostic questions about personality traits), language models such as BERT and GPT2 can consistently identify and reflect personality markers in those contexts. This behavior shows that such models can be manipulated in a highly predictable way, and frames them as tools for identifying personality traits and controlling personas in applications such as dialog systems. We also contribute a crowd-sourced dataset of personality descriptions of human subjects paired with their `Big Five' personality assessment data, and a dataset of personality descriptions collated from Reddit.