Building a scalable, real-time recommendation system is vital for many businesses driven by time-sensitive customer feedback, such as short-video ranking or online ads. Despite the ubiquitous adoption of production-scale deep learning frameworks like TensorFlow or PyTorch, these general-purpose frameworks fall short of business demands in recommendation scenarios for two main reasons. First, adapting systems built around static parameters and dense computation to recommendation workloads with dynamic, sparse features is detrimental to model quality. Second, such frameworks are designed with the batch-training stage and the serving stage completely separated, which prevents the model from interacting with customer feedback in real time. These issues led us to reexamine traditional approaches and explore radically different design choices. In this paper, we present Monolith, a system tailored for online training. Our design has been driven by observations of our application workloads and production environment, which reflect a marked departure from other recommendation systems. Our contributions are manifold: first, we crafted a collisionless embedding table with optimizations such as expirable embeddings and frequency filtering to reduce its memory footprint; second, we provide a production-ready online training architecture with high fault tolerance; finally, we proved that system reliability could be traded off for real-time learning. Monolith has successfully landed in the BytePlus Recommend product.
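The abstract does not describe how the embedding table is implemented, so the following is only a minimal sketch of the three ideas it names, not Monolith's actual code. It assumes that "collisionless" means each raw feature ID gets its own row (here, a plain dictionary keyed by ID rather than a fixed-size hashed array), that frequency filtering admits an ID only after it has been seen a minimum number of times, and that expirable embeddings are evicted after a period without access. The class name `EmbeddingTable` and the parameters `min_count` and `ttl` are hypothetical.

```python
import random


class EmbeddingTable:
    """Illustrative sketch only; names and thresholds are assumptions."""

    def __init__(self, dim, min_count=2, ttl=100, seed=0):
        self.dim = dim
        self.min_count = min_count  # hypothetical admission threshold
        self.ttl = ttl              # hypothetical expiry window, in logical steps
        self._rng = random.Random(seed)
        self._rows = {}       # feature ID -> embedding vector (one row per ID)
        self._seen = {}       # feature ID -> occurrences seen before admission
        self._last_used = {}  # feature ID -> step of the most recent lookup

    def lookup(self, feature_id, step):
        """Return the row for feature_id, or None while it is still filtered out."""
        if feature_id not in self._rows:
            count = self._seen.get(feature_id, 0) + 1
            if count < self.min_count:
                # Frequency filtering: rare IDs do not get a row yet.
                self._seen[feature_id] = count
                return None
            self._seen.pop(feature_id, None)
            self._rows[feature_id] = [
                self._rng.uniform(-0.01, 0.01) for _ in range(self.dim)
            ]
        self._last_used[feature_id] = step
        return self._rows[feature_id]

    def expire(self, step):
        """Evict rows not looked up within the last ttl steps; return how many."""
        stale = [
            fid for fid, last in self._last_used.items()
            if step - last > self.ttl
        ]
        for fid in stale:
            del self._rows[fid]
            del self._last_used[fid]
        return len(stale)


table = EmbeddingTable(dim=4, min_count=2, ttl=10)
first = table.lookup("user_1", step=0)   # filtered: seen only once
second = table.lookup("user_1", step=1)  # admitted on the second occurrence
evicted = table.expire(step=20)          # idle for more than ttl steps
```

Both filtering and expiry bound memory growth: the first keeps rarely seen IDs from ever receiving a row, and the second reclaims rows that are no longer in use.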