Modern recommender systems are trained to predict users' potential future interactions from their historical behavior data. During the interaction process, in addition to the data coming from the user side, recommender systems also generate exposure data to provide users with personalized recommendation slates. Compared with the sparse user behavior data, the system exposure data is much larger in volume, since only very few exposed items are clicked by the user. Users' historical behavior data is privacy sensitive and is commonly protected with careful access authorization. However, the large volume of recommender exposure data usually receives less attention and can be accessed by a relatively larger scope of information seekers. In this paper, we investigate the problem of user behavior leakage in recommender systems. We show that privacy-sensitive past user behavior data can be inferred through the modeling of system exposure. In other words, one can infer which items a user has clicked just from observing the current system exposure for that user. Given that system exposure data can be widely accessed, we believe that users' past behavior privacy is at high risk of leakage in recommender systems. More precisely, we construct an attack model whose input is the current recommended item slate (i.e., system exposure) for the user and whose output is the user's historical behavior. Experimental results on two real-world datasets indicate a great danger of user behavior leakage. To address the risk, we propose a two-stage privacy-protection mechanism which first selects a subset of items from the exposure slate and then replaces the selected items with uniform or popularity-based exposure. Experimental evaluation reveals a trade-off between recommendation accuracy and privacy disclosure risk.
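The two-stage protection mechanism described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function name `protect_slate`, the `replace_ratio` parameter, the uniformly random choice of positions in stage one, and the add-one smoothing of popularity counts are all assumptions made for illustration, since the abstract does not specify how the subset is selected or how popularity is measured.

```python
import random


def protect_slate(slate, catalog, replace_ratio=0.3, strategy="uniform",
                  popularity=None, rng=None):
    """Return a copy of `slate` with a fraction of its items replaced.

    Hypothetical helper: names and defaults are illustrative assumptions.
    """
    if rng is None:
        rng = random.Random()
    protected = list(slate)
    n_replace = round(replace_ratio * len(protected))

    # Stage 1: select which slate positions to replace.
    # Assumption: positions are chosen uniformly at random.
    positions = rng.sample(range(len(protected)), n_replace)

    # Replacement candidates are catalog items not already in the slate.
    in_slate = set(protected)
    candidates = [item for item in catalog if item not in in_slate]
    if len(candidates) < n_replace:
        raise ValueError("not enough candidate items to replace with")

    # Stage 2: draw replacement items without repetition.
    if strategy == "uniform":
        replacements = rng.sample(candidates, n_replace)
    elif strategy == "popularity":
        counts = popularity or {}
        pool = list(candidates)
        replacements = []
        for _ in range(n_replace):
            # Assumption: add-one smoothing so unseen items keep a nonzero weight.
            weights = [counts.get(item, 0) + 1 for item in pool]
            pick = rng.choices(pool, weights=weights, k=1)[0]
            pool.remove(pick)
            replacements.append(pick)
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")

    for pos, item in zip(positions, replacements):
        protected[pos] = item
    return protected
```

In this sketch, a larger `replace_ratio` hides more of the personalized slate from an observer but also shows the user fewer personalized items, which mirrors the accuracy versus privacy trade-off reported in the abstract.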