Problem definition: Most display advertising inventory is sold through real-time auctions. The participants in these auctions are typically bidders (for instance Google, Criteo, RTB House, and The Trade Desk) that bid on behalf of advertisers. To estimate the value of each display opportunity, bidders usually train advanced machine learning algorithms on historical data. In the labeled training set, the inputs are feature vectors representing each display opportunity and the labels are the rewards generated. In practice, the rewards are provided by the advertiser and are tied to whether or not a particular user converts. Consequently, the rewards are aggregated at the user level and never observed at the display level. A fundamental task that has, to the best of our knowledge, been overlooked is to account for this mismatch and split, or attribute, the rewards at the right level of granularity before training a learning algorithm. We call this the label attribution problem.

Methodology/results: In this paper, we develop an approach to the label attribution problem that is both theoretically justified and practical. In particular, we develop a fixed-point algorithm that allows for large-scale implementation, and we showcase our solution on a large-scale, publicly available dataset from Criteo, a large demand-side platform (DSP). We dub our approach the Fixed Point Label Attribution (FiPLA) algorithm.

Managerial implications: There is often a hidden leap of faith in transforming the advertiser's signal into display-level labels. DSP providers should take care when building their machine learning pipelines and solve the label attribution step carefully.