The evaluation of recommender systems from a practical perspective remains a topic of ongoing discussion in the research community. Many current evaluation methods reduce performance to a single-value metric as an easy way to compare models, but this approach relies on the assumption that a method's performance remains constant over time. In this study, we examine this assumption and propose the Cross-Validation Through Time (CVTT) technique as a more comprehensive evaluation method that focuses on model performance over time. Using the proposed technique, we conduct an in-depth analysis of the performance of popular RecSys algorithms. Our findings indicate that (1) the performance of the recommenders varies over time for all reviewed datasets, (2) using simple evaluation approaches can lead to a substantial decrease in performance in real-world evaluation scenarios, and (3) excessive data usage can lead to suboptimal results.
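To make the idea of evaluating over time concrete, the following is a minimal sketch of a time-ordered evaluation loop, not the implementation used in this study. It assumes interactions are (user, item, timestamp) tuples, uses a most-popular-items recommender as a stand-in model, and reports hit rate per time fold; the function names, the fold boundaries, and the toy data are all hypothetical.

```python
from collections import Counter


def top_k_popular(train, k):
    """Stand-in model (an assumption): recommend the k most frequent items."""
    counts = Counter(item for _, item, _ in train)
    return [item for item, _ in counts.most_common(k)]


def hit_rate_through_time(events, boundaries, k=1):
    """Return one hit rate per time fold instead of a single aggregate value.

    events: iterable of (user, item, timestamp) tuples.
    boundaries: increasing timestamps; each consecutive pair (start, end)
    defines a test fold. The model is fit on everything before `start`
    and evaluated on interactions with start <= timestamp < end.
    """
    scores = []
    for start, end in zip(boundaries, boundaries[1:]):
        train = [e for e in events if e[2] < start]
        test = [e for e in events if start <= e[2] < end]
        if not train or not test:
            scores.append(None)  # fold cannot be evaluated
            continue
        recommended = set(top_k_popular(train, k))
        hits = sum(1 for _, item, _ in test if item in recommended)
        scores.append(hits / len(test))
    return scores


# Toy data: item "a" is popular early, item "b" takes over later.
events = [
    ("u1", "a", 1), ("u2", "a", 2), ("u3", "b", 3),
    ("u1", "a", 11), ("u2", "b", 12),
    ("u3", "b", 21), ("u1", "b", 22),
]
scores = hit_rate_through_time(events, boundaries=[10, 20, 30], k=1)
print(scores)
```

On this toy data the per-fold scores differ, which is the kind of variation a single aggregate metric would hide. The expanding training window is one possible design choice; a sliding window that drops older interactions would be a small change to how `train` is built.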