Contextual bandit algorithms (CBAs) often rely on personal data to provide recommendations, so potentially sensitive data from past interactions are used to personalize content for end users. Running a local agent on the user's device protects the user's privacy by keeping the data local; however, the agent takes longer to produce useful recommendations because it cannot leverage feedback from other users. This paper proposes Privacy-Preserving Bandits (P2B), a system that updates local agents by collecting feedback from other agents in a differentially private manner. Comparisons of P2B with a non-private system and with a fully private (local) system show competitive performance on both synthetic benchmarks and real-world data. Specifically, we observed a decrease of 2.6% and 3.6% in multi-label classification accuracy, and an increase of 0.0025 in click-through rate (CTR) in online advertising, for a privacy budget $\epsilon \approx 0.693$. These results suggest that P2B is an effective approach to problems arising in on-device privacy-preserving personalization.
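For intuition about the reported budget, note that 0.693 is approximately $\ln 2$. Under the standard definition of $\epsilon$-differential privacy, changing one user's data can change the probability of any output by a factor of at most $e^{\epsilon}$, which is roughly 2 here. The sketch below is a generic illustration of that arithmetic, not P2B's mechanism: the function names are hypothetical, and binary randomized response is used only as a familiar reference point for what a budget of this size permits.

```python
import math


def likelihood_ratio_bound(epsilon: float) -> float:
    """Return e^epsilon, the maximum factor by which one user's data
    may change the probability of any output under epsilon-DP."""
    return math.exp(epsilon)


def randomized_response_truth_prob(epsilon: float) -> float:
    """Return the probability of reporting the true bit in binary
    randomized response calibrated to satisfy epsilon-DP.
    (Illustrative reference mechanism, assumed here; not from the paper.)"""
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))


if __name__ == "__main__":
    epsilon = 0.693  # privacy budget reported in the abstract, close to ln 2
    print("likelihood ratio bound:", likelihood_ratio_bound(epsilon))
    print("truth probability:", randomized_response_truth_prob(epsilon))
```

At this budget, a binary randomized-response mechanism would report the true bit about two thirds of the time, which gives a rough sense of how much noise such a guarantee implies.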