In recommender systems, users tend to choose their favorite items to rate, which results in the data missing not at random and poses a great challenge for the unbiased evaluation and learning of prediction models. Recently, the doubly robust (DR) method and its variants have been widely studied and have demonstrated superior performance. However, we show that DR methods are unstable under extremely small propensities and rely on extrapolation, resulting in sub-optimal performance. In this paper, we propose a stabilized doubly robust (SDR) estimator to address these limitations while retaining double robustness. Theoretical analysis shows that SDR has bounded bias, variance, and generalization error under inaccurate imputed errors and arbitrarily small propensities. In addition, we propose a novel learning approach for SDR that updates the imputation, propensity, and prediction models cyclically, achieving more stable and accurate predictions. Extensive experiments show that our approach significantly outperforms existing methods.
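To make the instability concrete, the sketch below contrasts the standard DR estimate of the average prediction error with a self-normalized variant, which is one common way to keep the inverse-propensity correction bounded; the exact SDR estimator and learning procedure proposed in the paper may differ. The array names (`e_hat`, `e_obs`, `obs`, `p_hat`) and the toy data are illustrative placeholders, not quantities from the paper.

```python
import numpy as np

def dr_estimate(e_hat, e_obs, obs, p_hat):
    """Standard doubly robust (DR) estimate of the average prediction error.

    e_hat : imputed errors for all user-item pairs, shape (N,)
    e_obs : observed errors (meaningful only where obs == 1), shape (N,)
    obs   : 0/1 indicators of which ratings are observed, shape (N,)
    p_hat : estimated propensities of being observed, shape (N,)
    """
    # The inverse-propensity correction explodes when p_hat is extremely small.
    correction = obs * (e_obs - e_hat) / p_hat
    return np.mean(e_hat + correction)

def self_normalized_dr_estimate(e_hat, e_obs, obs, p_hat):
    """Self-normalized (Hajek-style) variant: the correction is a weighted
    average of (e_obs - e_hat), so its magnitude is bounded by
    max|e_obs - e_hat| no matter how small the propensities become."""
    w = obs / p_hat
    correction = np.sum(w * (e_obs - e_hat)) / max(np.sum(w), 1e-12)
    return np.mean(e_hat) + correction

# Toy example: a handful of user-item pairs, one with a tiny propensity.
rng = np.random.default_rng(0)
N = 8
e_hat = rng.uniform(0.2, 0.4, size=N)           # imputed errors
e_obs = e_hat + rng.normal(0.0, 0.1, size=N)    # "true" errors where observed
obs = np.array([1, 0, 1, 1, 0, 1, 0, 1])
p_hat = np.array([0.5, 0.4, 0.6, 1e-4, 0.3, 0.5, 0.2, 0.7])

print("DR estimate:             ", dr_estimate(e_hat, e_obs, obs, p_hat))
print("Self-normalized estimate:", self_normalized_dr_estimate(e_hat, e_obs, obs, p_hat))
```

Running this, the single near-zero propensity inflates the plain DR estimate by orders of magnitude, while the self-normalized correction stays on the scale of the individual errors, which is the kind of stability the SDR estimator is designed to guarantee.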