
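Recommender systems (RS) have achieved great success in information seeking. In recent years, a large body of research has proposed recommendation models that fit user behavior data ever more closely. However, user behavior data is observational rather than experimental, so biases of many kinds are widespread in it, including but not limited to selection bias, position bias, and exposure bias. Blindly fitting the data without accounting for these inherent biases leads to serious problems, such as discrepancies between offline evaluation and online metrics, and reduced user satisfaction with and trust in the recommendation service. To turn the many research models into practical improvements, it is urgent to examine the effects of bias and, where necessary, to develop debiasing strategies. As a result, bias in recommender systems and how to address it has drawn considerable attention from both academia and industry.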

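In this tutorial, we aim to systematically review existing work on this topic. We will introduce seven types of bias in recommender systems, along with their definitions and characteristics; survey existing debiasing solutions and their strengths and weaknesses; and identify open challenges and future directions. We hope this tutorial will inspire more ideas on this topic and facilitate the development of debiased recommender systems.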

https://recsys.acm.org/recsys21/tutorials/#content-tab-1-5-tab


Latest papers

In this paper we present our 2nd place solution to the ACM RecSys 2021 Challenge organized by Twitter. The challenge aims to predict user engagement for a set of tweets, offering an exceptionally large data set of 1 billion data points sampled from over four weeks of real Twitter interactions. Each data point contains multiple sources of information, such as tweet text along with engagement features, user features, and tweet features. The challenge brings the problem close to a real production environment by introducing strict latency constraints in the model evaluation phase: the average inference time for a single tweet engagement prediction is limited to 6 ms on a single CPU core with 64 GB of memory. Our proposed model relies on extensive feature engineering performed with methods such as the Efficient Manifold Density Estimator (EMDE), our previously introduced algorithm based on the Locality Sensitive Hashing method, and a novel Fourier Feature Encoding, among others. In total, we create numerous features describing a user's Twitter account status and the content of a tweet. To adhere to the strict latency constraints, the underlying model is a simple residual feed-forward neural network. The system is a variation of our previous methods, which proved successful in KDD Cup 2021, WSDM Challenge 2021, and SIGIR eCom Challenge 2020. We release the source code at: https://github.com/Synerise/recsys-challenge-2021
