In digital contexts with many new users and items, such as online news and e-tailing, recommendation systems face several challenges: i) how to make initial recommendations to users with little or no response history (the cold-start problem), ii) how to learn user preferences for items (test and learn), and iii) how to scale across many users and items with myriad demographics and attributes. While many recommendation systems address some of these challenges, few if any address all of them. This paper introduces a Collaborative Filtering (CF) Multi-armed Bandit (B) with Attributes (A) recommendation system (CFB-A) that jointly accommodates all three considerations. Empirical applications, including an offline test on MovieLens data, synthetic data simulations, and an online grocery experiment, indicate that CFB-A substantially improves cumulative average rewards (e.g., total money or time spent, clicks, purchased quantities, or average ratings) relative to the strongest extant baseline methods.