Simulating human profiles by instilling personas into large language models (LLMs) is rapidly transforming research in agentic behavioral simulation, LLM personalization, and human-AI alignment. However, most existing synthetic personas remain shallow and simplistic, capturing minimal attributes and failing to reflect the rich complexity and diversity of real human identities. We introduce DEEPPERSONA, a scalable generative engine for synthesizing narrative-complete synthetic personas through a two-stage, taxonomy-guided method. First, we algorithmically construct the largest-ever human-attribute taxonomy, comprising hundreds of hierarchically organized attributes, by mining thousands of real user-ChatGPT conversations. Second, we progressively sample attributes from this taxonomy, conditionally generating coherent and realistic personas that average hundreds of structured attributes and roughly 1 MB of narrative text, two orders of magnitude deeper than prior work. Intrinsic evaluations confirm significant improvements in attribute diversity (32 percent higher coverage) and profile uniqueness (44 percent greater) compared to state-of-the-art baselines. Extrinsically, our personas enhance GPT-4.1-mini's personalized question answering accuracy by 11.6 percent on average across ten metrics and substantially narrow (by 31.7 percent) the gap between simulated LLM citizens and authentic human responses in social surveys. Our generated national citizens reduce the performance gap on the Big Five personality test by 17 percent relative to LLM-simulated citizens. DEEPPERSONA thus provides a rigorous, scalable, and privacy-free platform for high-fidelity human simulation and personalized AI research.
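The second stage, progressive sampling from a hierarchical taxonomy with each value generated conditionally on the profile so far, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy `TAXONOMY`, the function names `flatten`, `sample_persona`, and `random_fill`, and the uniform sampling are hypothetical and are not taken from DEEPPERSONA itself. In particular, `random_fill` is only a placeholder for whatever conditional generator (for example, an LLM prompt) would actually produce each value.

```python
import random

# Toy stand-in for a hierarchical attribute taxonomy (hypothetical content).
TAXONOMY = {
    "demographics": {"age_group": ["18-29", "30-49", "50+"], "region": ["urban", "rural"]},
    "lifestyle": {"hobby": ["cycling", "chess", "gardening"], "diet": ["omnivore", "vegetarian"]},
    "values": {"priority": ["family", "career", "community"]},
}


def flatten(taxonomy):
    """Return (path, candidate_values) pairs for every leaf attribute."""
    return [
        (f"{category}/{attribute}", values)
        for category, attributes in taxonomy.items()
        for attribute, values in attributes.items()
    ]


def sample_persona(taxonomy, n_attributes, rng, fill_value):
    """Pick attributes, then fill them one at a time given the profile so far."""
    leaves = flatten(taxonomy)
    chosen = rng.sample(leaves, k=min(n_attributes, len(leaves)))
    profile = {}
    for path, candidates in chosen:
        # The filler receives a copy of the partial profile as its context.
        profile[path] = fill_value(path, candidates, dict(profile), rng)
    return profile


def random_fill(path, candidates, profile_so_far, rng):
    # Placeholder for a conditional generator; this version ignores the context.
    return rng.choice(candidates)


persona = sample_persona(TAXONOMY, 4, random.Random(0), random_fill)
```

Passing the partial profile into the filler is the design point of the sketch: a real generator could use it to keep later attributes consistent with earlier ones, which a purely independent draw per attribute cannot do.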