In this paper, we proposed Speeder, a highly efficient paradigm for adapting multimodal large language models to sequential recommendation. Speeder introduces three key components: (1) Multimodal Representation Compression (MRC), which reduces redundancy in item descriptions; (2) Sequential Position Awareness Enhancement (SPAE), which strengthens the model's ability to capture complex sequential dependencies; and (3) Modality-aware Progressive Optimization (MPO), which progressively integrates different modalities to improve the model's understanding and reduce cognitive biases. Extensive experiments demonstrate that Speeder outperforms baselines in both VHR@1 and computational efficiency. Specifically, Speeder achieves 2.5× the training speed and 4× the inference speed of state-of-the-art MLLM-based SR models. Future work could focus on incorporating real-time feedback from real-world systems.