A Troubling Analysis of Reproducibility and Progress in Recommender Systems Research

The design of algorithms that generate personalized ranked item lists is a central topic of research in the field of recommender systems. In the past few years, in particular, approaches based on deep learning (neural) techniques have become dominant in the literature. For all of them, substantial progress over the state of the art is claimed. However, there are indications of problems in today's research practice, e.g., with respect to the choice and optimization of the baselines used for comparison, which raise questions about the published claims. To obtain a better understanding of the actual progress, we have tried to reproduce recent results in the area of neural recommendation approaches based on collaborative filtering. The works analyzed were all published at prestigious scientific conferences between 2015 and 2018. The worrying outcome of the analysis is that 11 out of the 12 reproducible neural approaches can be outperformed by conceptually simple methods, e.g., based on nearest-neighbor heuristics. None of the computationally complex neural methods was actually consistently better than already existing learning-based techniques, e.g., using matrix factorization or linear models. In our analysis, we discuss common issues in today's research practice, which, despite the many papers that are published on the topic, have apparently led the field to a certain level of stagnation.
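To make concrete what a "conceptually simple" nearest-neighbor baseline can look like, the following is a minimal sketch of an item-based k-nearest-neighbor recommender using cosine similarity on a binary user-item matrix. It is an illustration only, not the implementation evaluated in the paper: the function names `item_knn_scores` and `recommend`, the toy data, and the choice of plain cosine similarity without further tuning are all assumptions made for this example.

```python
import numpy as np


def item_knn_scores(interactions, k=2):
    """Score all items for all users with an item-based kNN heuristic.

    interactions: binary user-item matrix (rows = users, columns = items).
    k: number of most similar neighbors kept per item (hypothetical default).
    """
    X = np.asarray(interactions, dtype=float)
    co = X.T @ X                                  # item-item co-occurrence counts
    norms = np.sqrt(np.diag(co))                  # L2 norm of each item column
    sim = co / (np.outer(norms, norms) + 1e-9)    # cosine similarity
    np.fill_diagonal(sim, 0.0)                    # an item is not its own neighbor
    if k < sim.shape[0]:
        # Zero out everything except the k largest similarities in each row.
        drop = np.argsort(sim, axis=1)[:, :-k]
        np.put_along_axis(sim, drop, 0.0, axis=1)
    # Score of item i for user u: sum of sim[i, j] over items j the user has seen.
    scores = X @ sim.T
    scores[X > 0] = -np.inf                       # never recommend seen items
    return scores


def recommend(interactions, user, n=2, k=2):
    """Return the indices of the top-n unseen items for one user."""
    scores = item_knn_scores(interactions, k=k)
    return [int(i) for i in np.argsort(-scores[user])[:n]]


# Toy data (assumed for illustration): 4 users, 4 items.
toy = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 0],
]
top_for_user_0 = recommend(toy, user=0, n=1)
```

In practice such a baseline would still need its few hyperparameters (such as the neighborhood size) tuned on validation data, which is exactly the kind of baseline optimization the abstract identifies as frequently neglected.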
