Online advertisers have recently adopted recommender systems (RSs) for display advertising to improve user engagement. The contextual bandit model is a widely used RS that balances exploration and exploitation of user engagement to maximize long-term rewards such as clicks or conversions. However, current models optimize a set of ads only within a specific domain and do not share information with models in other domains. In this paper, we propose dynamic collaborative filtering Thompson Sampling (DCTS), a novel yet simple model that transfers knowledge among multiple bandit models. DCTS exploits similarities between users and between ads to estimate the prior distribution used in Thompson sampling. These similarities are computed from the contextual features of users and ads. By transferring knowledge, they enable a model in a domain with little data to converge more quickly. Moreover, DCTS incorporates the temporal dynamics of users to track recent changes in a user's preferences. We first show on a synthetic dataset that transferring knowledge and incorporating temporal dynamics improve on the performance of the baseline models. We then conduct an empirical analysis on a real-world dataset, and the results show that DCTS improves click-through rate (CTR) by 9.7% over the state-of-the-art models. We also analyze the hyper-parameters that control temporal dynamics and similarities, and identify the values that maximize CTR.
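The two ideas named in the abstract, a similarity-informed prior and temporal dynamics, can be illustrated with a minimal Beta-Bernoulli Thompson sampling sketch. This is not the DCTS model itself: the abstract does not give its equations, so the similarity-weighted pseudo-counts, the exponential `decay` factor, and every name below (`similarity_prior`, `DecayedBetaArm`, `choose_ad`) are assumptions made only for illustration.

```python
import random


def similarity_prior(target, sources, similarity, base=1.0):
    """Build a Beta(alpha, beta) prior for `target` from other domains.

    Assumption: each source domain contributes its (clicks, skips) counts,
    scaled by a similarity weight in [0, 1]. `base` is a uniform pseudo-count.
    """
    alpha, beta = base, base
    for name, (clicks, skips) in sources.items():
        weight = similarity.get((target, name), 0.0)
        alpha += weight * clicks
        beta += weight * skips
    return alpha, beta


class DecayedBetaArm:
    """One ad with a Beta posterior whose observed counts decay over time."""

    def __init__(self, alpha, beta, decay=0.99):
        self.prior = (alpha, beta)
        self.clicks = 0.0
        self.skips = 0.0
        self.decay = decay  # assumed forgetting factor; 1.0 disables decay

    def sample(self, rng):
        # Draw a plausible click probability from the current posterior.
        return rng.betavariate(self.prior[0] + self.clicks,
                               self.prior[1] + self.skips)

    def update(self, clicked):
        # Down-weight old observations, then add the newest one.
        self.clicks = self.decay * self.clicks + (1.0 if clicked else 0.0)
        self.skips = self.decay * self.skips + (0.0 if clicked else 1.0)


def choose_ad(arms, rng):
    """Thompson sampling step: show the ad with the highest sampled rate."""
    return max(arms, key=lambda ad: arms[ad].sample(rng))
```

In this sketch, a new domain with similarity 0.5 to an existing domain that recorded 30 clicks and 70 skips would start from a Beta(16, 36) prior rather than the uniform Beta(1, 1), which is one simple way a data-poor model could begin closer to a reasonable estimate.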