Over the past decades, recommendation has become a critical component of many online services, such as media streaming and e-commerce. Recent advances in algorithms, evaluation methods and datasets have led to continuous improvements in the state of the art. However, much work remains to make these methods scale to the size of the internet. Online advertising offers a unique testbed for recommendation at scale: every day, billions of users interact with millions of products in real time, and systems addressing this scenario must work reliably at that scale. We propose an efficient model (LED, for Lightweight Encoder-Decoder) that reaches a new trade-off between complexity, scale and performance. Specifically, we show that combining large-scale matrix factorization with lightweight embedding fine-tuning unlocks state-of-the-art performance at scale. We further provide a detailed description of a system architecture and demonstrate its operation over two months at the scale of the internet. Our design allows serving billions of users across hundreds of millions of items in a few milliseconds using standard hardware.