This paper calls attention to the missing component of the recommender system evaluation process: statistical inference. There is active research on several components of the recommender system evaluation process: selecting baselines, standardizing benchmarks, and target item sampling. However, there has not yet been significant work on the role and use of statistical inference in analyzing recommender system evaluation results. We argue that statistical inference is a key component of the evaluation process that has not received sufficient attention. We support this argument with a systematic review of recent RecSys papers to understand how statistical inference is currently being used, along with a brief survey of studies on the use of statistical inference in the information retrieval community. We present several challenges for inference in recommendation experiments, which buttress the need for empirical studies to aid in appropriately selecting and applying statistical inference techniques.