Challenges and approaches to privacy preserving post-click conversion prediction

Online advertising has typically been more personalized than offline advertising, through the use of machine learning models and real-time auctions for ad targeting. One specific task, predicting the likelihood of conversion (i.e., the probability that a user will purchase the advertised product), is crucial to the advertising ecosystem for both targeting and pricing ads. Currently, these models are often trained by observing individual user behavior, but regulatory and technical constraints increasingly require privacy-preserving approaches. For example, major platforms are moving to restrict the tracking of individual user events across multiple applications, and governments around the world have shown steadily growing interest in regulating the use of personal data. Instead of receiving data about individual user behavior, advertisers may receive privacy-preserving feedback, such as the number of installs of an advertised app that resulted from a group of users. In this paper we outline the recent privacy-related changes in the online advertising ecosystem from a machine learning perspective. We provide an overview of the challenges and constraints of learning conversion models in this setting. We introduce a novel approach for training these models that makes use of post-ranking signals. Using offline experiments on real-world data, we show that it outperforms a model relying on opt-in data alone, and that it significantly reduces model degradation when no individual labels are available. Finally, we discuss future directions for research in this evolving area.
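To make the aggregate-feedback setting concrete, the following is a minimal sketch of what learning from group-level install counts could look like. It is not the approach introduced in the paper (which relies on post-ranking signals and is not detailed in this abstract). The toy data, the single feature `x`, the squared-error loss on group counts, and the helper names `aggregate_installs` and `fit_from_group_counts` are all assumptions made purely for illustration.

```python
import math
from collections import defaultdict


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def aggregate_installs(events):
    """Collapse per-user (group_id, x, installed) rows into per-group install counts.

    Only the totals are returned, mimicking feedback that hides individual labels.
    """
    counts = defaultdict(int)
    for group_id, _x, installed in events:
        counts[group_id] += int(installed)
    return dict(counts)


def fit_from_group_counts(users, counts, lr=0.1, steps=2000):
    """Fit p(install | x) = sigmoid(w * x + b) using only group-level counts.

    Hypothetical loss: sum over groups of (expected installs - observed installs)^2,
    minimized with plain gradient descent.
    users: list of (group_id, x); counts: {group_id: observed installs}.
    """
    w, b = 0.0, 0.0
    for _ in range(steps):
        pred = defaultdict(float)    # expected installs per group
        grad_w = defaultdict(float)  # d(expected installs)/dw per group
        grad_b = defaultdict(float)  # d(expected installs)/db per group
        for g, x in users:
            p = sigmoid(w * x + b)
            pred[g] += p
            grad_w[g] += p * (1.0 - p) * x
            grad_b[g] += p * (1.0 - p)
        dw = sum(2.0 * (pred[g] - counts[g]) * grad_w[g] for g in counts)
        db = sum(2.0 * (pred[g] - counts[g]) * grad_b[g] for g in counts)
        w -= lr * dw
        b -= lr * db
    return w, b


# Toy data (assumed): group "A" users have x = 0.0, group "B" users have x = 1.0.
events = [
    ("A", 0.0, 1), ("A", 0.0, 0), ("A", 0.0, 0), ("A", 0.0, 0),
    ("B", 1.0, 1), ("B", 1.0, 1), ("B", 1.0, 1), ("B", 1.0, 0),
]
counts = aggregate_installs(events)          # all the learner gets to see as labels
users = [(g, x) for g, x, _ in events]       # features remain available per user
w, b = fit_from_group_counts(users, counts)

expected = defaultdict(float)
for g, x in users:
    expected[g] += sigmoid(w * x + b)
print(counts, dict(expected))
```

In this sketch the model never sees which individual user installed the app; it is only asked to reproduce each group's total. Real aggregate reporting typically adds further constraints (for example noise or delays), which this toy example ignores.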

