User profiles embedded in the prompt templates of personalized recommendation agents play a crucial role in shaping the agents' decision-making. High-quality user profiles are essential for aligning agent behavior with real user interests. Typically, these profiles are constructed by leveraging LLMs for user profile modeling (LLM-UM). However, this process faces several challenges: (1) LLMs struggle with long user behavior sequences due to context length limitations and performance degradation. (2) Existing methods often extract only partial segments from the full historical behavior sequence, inevitably discarding the diverse user interests embedded in the omitted content, which leads to incomplete modeling and suboptimal profiling. (3) User profiling is often tightly coupled with the inference context and therefore requires online processing, which introduces significant latency overhead. In this paper, we propose PersonaX, an agent-agnostic LLM-UM framework that addresses these challenges. It augments downstream recommendation agents to achieve better recommendation performance and inference efficiency. PersonaX (a) segments complete historical behaviors into clustered groups, (b) selects multiple sub-behavior sequences (SBS) that balance prototypicality and diversity to form a high-quality core set, (c) performs offline multi-persona profiling to capture diverse user interests and generate fine-grained, cached textual personas, and (d) decouples user profiling from online inference, enabling profile retrieval instead of real-time generation. Extensive experiments demonstrate its effectiveness: using only 30 to 50% of the behavioral data (sequence length 480), PersonaX enhances AgentCF by 3 to 11% and Agent4Rec by 10 to 50%. As a scalable and model-agnostic LLM-UM solution, PersonaX sets a new benchmark in scalable user modeling.
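The four stages (a) to (d) described above can be illustrated with a minimal sketch. It is not the PersonaX implementation: the abstract does not specify the clustering algorithm, the selection score, or the retrieval rule, so the tiny k-means, the `alpha`-weighted greedy score, the nearest-centroid lookup, the toy 2-D embeddings, and every function name below (`cluster`, `select_core`, `build_persona_cache`, `retrieve_persona`, `stub_profiler`) are assumptions made purely for illustration. The LLM profiling call is replaced by a string-building stub.

```python
import math

def dist(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(points):
    """Component-wise mean of a non-empty list of vectors."""
    return tuple(sum(col) / len(points) for col in zip(*points))

def cluster(points, k, iters=10):
    """(a) Tiny k-means with deterministic farthest-point seeding (assumed)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for i, p in enumerate(points):
            groups[min(range(k), key=lambda c: dist(p, centers[c]))].append(i)
        centers = [centroid([points[i] for i in g]) if g else centers[j]
                   for j, g in enumerate(groups)]
    return groups, centers

def select_core(points, group, center, ratio, alpha=0.5):
    """(b) Greedily pick ceil(ratio * |group|) behaviors from one cluster.

    Hypothetical score: alpha rewards closeness to the centroid
    (prototypicality), 1 - alpha rewards distance from items already
    chosen (diversity).
    """
    budget = max(1, math.ceil(ratio * len(group)))
    chosen, remaining = [], list(group)
    while remaining and len(chosen) < budget:
        def score(i):
            proto = -dist(points[i], center)
            diverse = min((dist(points[i], points[j]) for j in chosen), default=0.0)
            return alpha * proto + (1 - alpha) * diverse
        best = max(remaining, key=score)
        chosen.append(best)
        remaining.remove(best)
    return sorted(chosen)  # keep the original behavior order

def build_persona_cache(behaviors, k, ratio, profile_fn):
    """(c) Offline stage: profile each cluster's core set once and cache it."""
    points = [vec for _, vec in behaviors]
    groups, centers = cluster(points, k)
    cache = []
    for group, center in zip(groups, centers):
        core = select_core(points, group, center, ratio)
        if core:
            cache.append((center, profile_fn([behaviors[i][0] for i in core])))
    return cache

def retrieve_persona(cache, query_vec):
    """(d) Online stage: return the cached persona with the nearest centroid."""
    return min(cache, key=lambda entry: dist(entry[0], query_vec))[1]

def stub_profiler(items):
    # Stand-in for an LLM profiling call; a real system would prompt a model here.
    return "Interested in: " + ", ".join(items)

# Toy behavior history: (item name, made-up 2-D embedding).
behaviors = [
    ("sci-fi novel", (0.0, 0.1)), ("space opera", (0.2, 0.0)),
    ("cyberpunk film", (0.1, 0.3)), ("hiking boots", (10.0, 10.1)),
    ("tent", (10.2, 9.9)), ("trail map", (9.9, 10.0)),
]
cache = build_persona_cache(behaviors, k=2, ratio=0.5, profile_fn=stub_profiler)
persona = retrieve_persona(cache, (9.8, 10.2))  # query near the outdoor cluster
print(persona)
```

In this sketch the expensive step (`profile_fn`) runs only inside `build_persona_cache`, so the online path is a nearest-centroid lookup over a handful of cached strings. That mirrors the decoupling the abstract describes, although the actual retrieval mechanism may differ.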