Despite their remarkable capabilities, Large Language Models (LLMs) struggle to leverage historical interaction information effectively in dynamic and complex environments. Memory systems enable LLMs to move beyond stateless interaction by introducing persistent mechanisms for storing, retrieving, and utilizing information. However, existing memory systems often incur substantial time and computational overhead. To address this, we introduce LightMem, a new memory system that balances performance and efficiency. Inspired by the Atkinson-Shiffrin model of human memory, LightMem organizes memory into three complementary stages. First, cognition-inspired sensory memory rapidly filters out irrelevant information through lightweight compression and groups the remainder by topic. Next, topic-aware short-term memory consolidates these topic-based groups, organizing and summarizing their content for more structured access. Finally, long-term memory with sleep-time updates employs an offline procedure that decouples consolidation from online inference. On LongMemEval and LoCoMo, with GPT and Qwen backbones, LightMem consistently surpasses strong baselines, improving QA accuracy by up to 7.7% / 29.3% while reducing total token usage by up to 38x / 20.9x and API calls by up to 30x / 55.5x; purely online test-time costs are even lower, with up to 106x / 117x token reduction and 159x / 310x fewer API calls. The code is available at https://github.com/zjunlp/LightMem.
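To make the three-stage design concrete, below is a minimal, self-contained Python sketch of the pipeline. Every name in it (the `LightMemSketch` class, the `_compress` and `_assign_topic` placeholders, the `sleep_time_update` method) is hypothetical and is not LightMem's actual API; the sketch only illustrates the flow from sensory filtering, through topic-grouped short-term memory, to offline long-term consolidation.

```python
from collections import defaultdict


class LightMemSketch:
    """Toy three-stage memory pipeline (all names and logic are illustrative)."""

    def __init__(self) -> None:
        self.short_term: dict[str, list[str]] = defaultdict(list)  # topic -> filtered snippets
        self.long_term: dict[str, str] = {}                        # topic -> consolidated summary

    # Stage 1: sensory memory -- cheap filtering plus topic grouping.
    def perceive(self, utterance: str) -> None:
        compressed = self._compress(utterance)
        if compressed:  # irrelevant content is dropped before it reaches storage
            self.short_term[self._assign_topic(compressed)].append(compressed)

    # Stage 3: sleep-time update -- run offline, decoupled from online inference,
    # consolidating the topic groups of short-term memory (Stage 2) into long-term memory.
    def sleep_time_update(self) -> None:
        for topic in list(self.short_term):
            summary = " | ".join(self.short_term.pop(topic))  # stand-in for an LLM summary
            prev = self.long_term.get(topic, "")
            self.long_term[topic] = (prev + " " + summary).strip()

    def _compress(self, text: str) -> str:
        # Placeholder compressor: keep only "long enough" inputs; a real system
        # would use a learned or heuristic token-level compressor.
        return text.strip() if len(text.split()) > 3 else ""

    def _assign_topic(self, text: str) -> str:
        # Placeholder topic router; a real system would use embedding similarity.
        return text.split()[0].lower()


if __name__ == "__main__":
    mem = LightMemSketch()
    mem.perceive("weather small talk")                     # filtered out (too short)
    mem.perceive("project deadline moved to next Friday")  # kept under topic "project"
    mem.sleep_time_update()                                # offline consolidation
    print(mem.long_term)
```

The key design point the sketch mirrors is that `sleep_time_update` never runs on the online query path: the expensive consolidation work is batched offline, which is what allows the reported reductions in test-time tokens and API calls.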