Generative recommenders, typically transformer-based autoregressive models, predict the next item or action from a user's interaction history. Their effectiveness depends on how the model represents where an interaction event occurs in the sequence (discrete index) and when it occurred in wall-clock time. Prevailing approaches inject time via learned embeddings or relative attention biases. In this paper, we argue that RoPE-based approaches, if designed properly, can be a stronger alternative for jointly modeling temporal and sequential information in user behavior sequences. While vanilla RoPE in LLMs considers only token order, generative recommendation requires incorporating both event time and token index. To address this, we propose Time-and-Order RoPE (TO-RoPE), a family of rotary position embedding designs that treat index and time as angle sources shaping the query-key geometry directly. We present three instantiations: early fusion, split-by-dim, and split-by-head. Extensive experiments on both publicly available datasets and a proprietary industrial dataset show that TO-RoPE variants consistently improve accuracy over existing methods for encoding time and index. These results position rotary embeddings as a simple, principled, and deployment-friendly foundation for generative recommendation.
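To make the three instantiations concrete, the following is a minimal sketch of how index and time could each serve as an angle source for a rotary embedding. It is not the authors' formulation, since the abstract gives no equations. The function names, the linear mixing weights `w_index` and `w_time`, the parameters `index_pairs` and `index_heads`, the frequency base of 10000, and the assumption that `time` is a pre-scaled scalar (for example, hours since a reference point) are all hypothetical choices made for illustration.

```python
import math


def inv_freqs(num_pairs, base=10000.0):
    # Geometric frequency schedule as in vanilla RoPE: one frequency per 2-D pair.
    return [base ** (-i / num_pairs) for i in range(num_pairs)]


def rotate(vec, angles):
    # Rotate each consecutive (even, odd) pair of `vec` by its own angle.
    out = []
    for i, theta in enumerate(angles):
        x, y = vec[2 * i], vec[2 * i + 1]
        c, s = math.cos(theta), math.sin(theta)
        out += [x * c - y * s, x * s + y * c]
    return out


def angles_early_fusion(index, time, num_pairs, w_index=1.0, w_time=1.0):
    # Assumed form: fuse index and time into one scalar position, then apply RoPE.
    pos = w_index * index + w_time * time
    return [pos * f for f in inv_freqs(num_pairs)]


def angles_split_by_dim(index, time, num_pairs, index_pairs):
    # Assumed form: the first `index_pairs` pairs rotate by index, the rest by time.
    index_part = [index * f for f in inv_freqs(index_pairs)]
    time_part = [time * f for f in inv_freqs(num_pairs - index_pairs)]
    return index_part + time_part


def angles_split_by_head(index, time, num_pairs, head, index_heads):
    # Assumed form: heads below `index_heads` rotate by index, the others by time.
    pos = index if head < index_heads else time
    return [pos * f for f in inv_freqs(num_pairs)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def score(q, k, q_angles, k_angles):
    # Unnormalized attention logit between a rotated query and a rotated key.
    return dot(rotate(q, q_angles), rotate(k, k_angles))
```

Under these assumptions, every variant keeps the defining RoPE property: the query-key score depends only on the differences in index and time between the two events, not on their absolute values. Shifting both events by the same index offset and the same time offset leaves `score` unchanged.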