Contextual bandit algorithms~(CBAs) often rely on personal data to provide recommendations. Centralized CBA agents use potentially sensitive data from recent interactions to personalize recommendations for end-users. Keeping the sensitive data local, by running a local agent on the user's device, protects the user's privacy; however, the agent takes longer to produce useful recommendations, as it does not leverage feedback from other users. This paper proposes a technique we call Privacy-Preserving Bandits (P2B), a system that updates local agents by collecting feedback from other local agents in a differentially-private manner. Comparisons of our proposed approach with a non-private system, as well as a fully-private (local) system, show competitive performance on both synthetic benchmarks and real-world data. Specifically, we observed decreases of only 2.6% and 3.6% in multi-label classification accuracy, and a click-through rate (CTR) increase of 0.0025 in online advertising, for a privacy budget $\epsilon \approx 0.693$. These results suggest P2B is an effective approach to challenges arising in on-device privacy-preserving personalization.
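To give a feel for the privacy budget quoted above, note that $\epsilon \approx 0.693$ is approximately $\ln 2$, so under $\epsilon$-differential privacy the probability of any output can change by at most a factor of about $e^{\epsilon} \approx 2$ when one user's data changes. The sketch below illustrates this with binary randomized response, a generic textbook mechanism chosen here only for illustration. It is not the mechanism used by P2B, and the function names are hypothetical.

```python
import math
import random


def keep_probability(epsilon: float) -> float:
    """Probability of reporting the true bit under binary randomized response."""
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))


def randomized_response(true_bit: int, epsilon: float, rng: random.Random) -> int:
    """Report true_bit with probability e^eps / (1 + e^eps), otherwise flip it.

    Generic illustration only; this is NOT the P2B mechanism.
    """
    if rng.random() < keep_probability(epsilon):
        return true_bit
    return 1 - true_bit


def likelihood_ratio_bound(epsilon: float) -> float:
    """Worst-case ratio between output probabilities for two different inputs."""
    return math.exp(epsilon)


if __name__ == "__main__":
    eps = 0.693  # budget quoted in the abstract, roughly ln 2
    rng = random.Random(0)
    print("keep probability:", keep_probability(eps))
    print("likelihood ratio bound:", likelihood_ratio_bound(eps))
    print("sample report for true bit 1:", randomized_response(1, eps, rng))
```

With this budget the true bit is reported about two thirds of the time, which makes the factor-of-two bound concrete: an observer seeing a report can never be more than roughly twice as confident in one input as in the other.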