Re-weighting Negative Samples for Model-Agnostic Matching

Recommender Systems (RS), as an efficient tool for discovering items of interest to users from a very large corpus, have attracted increasing attention from academia and industry. As the initial stage of RS, large-scale matching is fundamental yet challenging. A typical recipe is to learn user and item representations with a two-tower architecture and then calculate the similarity score between the two representation vectors; this approach, however, still struggles with how to properly handle negative samples. In this paper, we find that the common practice of randomly sampling negatives from the entire space and treating them equally is not an optimal choice, since negative samples from different sub-spaces at different stages have different importance to a matching model. To address this issue, we propose a novel method named Unbiased Model-Agnostic Matching Approach (UMA$^2$). It consists of two basic modules: 1) General Matching Model (GMM), which is model-agnostic and can be implemented as any embedding-based two-tower model; and 2) Negative Samples Debias Network (NSDN), which discriminates negative samples by borrowing the idea of Inverse Propensity Weighting (IPW) and re-weights the loss in GMM. UMA$^2$ seamlessly integrates these two modules in an end-to-end multi-task learning framework. Extensive experiments on both a real-world offline dataset and an online A/B test demonstrate its superiority over state-of-the-art methods.
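
To make the re-weighting idea concrete, the following is a minimal sketch of an IPW-style weighted loss on top of two-tower dot-product scores. It is not the formulation used in UMA$^2$, which the abstract does not spell out. The function names (`ipw_weights`, `reweighted_loss`), the clipping threshold, the mean-one normalization, and the logistic loss are all assumptions made for illustration. The propensities stand in for whatever a debiasing network such as NSDN would estimate; here they are simply passed in as numbers.

```python
import math


def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dot(u, v):
    # Two-tower similarity score: inner product of the two embeddings.
    return sum(a * b for a, b in zip(u, v))


def ipw_weights(propensities, clip=0.05):
    # Inverse propensity weights, clipped for stability (assumed threshold)
    # and normalized to mean 1 so the overall loss scale is unchanged.
    raw = [1.0 / max(p, clip) for p in propensities]
    mean = sum(raw) / len(raw)
    return [w / mean for w in raw]


def reweighted_loss(user_vec, pos_item, neg_items, neg_propensities):
    # Logistic loss for the positive pair: -log(sigmoid(score)).
    pos_loss = softplus(-dot(user_vec, pos_item))
    # Logistic loss for each negative: -log(1 - sigmoid(score)),
    # scaled by its inverse propensity weight instead of a uniform weight.
    weights = ipw_weights(neg_propensities)
    neg_loss = sum(
        w * softplus(dot(user_vec, item))
        for w, item in zip(weights, neg_items)
    ) / len(neg_items)
    return pos_loss + neg_loss


# Hypothetical embeddings; in practice these come from the two towers.
user = [1.0, 0.0]
positive = [1.0, 0.0]
negatives = [[0.0, 1.0], [1.0, 0.0]]

uniform = reweighted_loss(user, positive, negatives, [0.5, 0.5])
weighted = reweighted_loss(user, positive, negatives, [0.5, 0.1])
```

With equal propensities every weight is 1 and the loss reduces to the usual uniformly weighted objective. In the second call the low-propensity negative, which also scores higher against the user, receives a larger weight and therefore contributes more to the loss.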
