We study offline recommender learning from explicit rating feedback in the presence of selection bias. A promising current solution to this bias is inverse propensity score (IPS) estimation. However, the performance of existing propensity-based methods can suffer significantly from bias in the estimated propensities. In fact, most previous IPS-based methods require some amount of missing-completely-at-random (MCAR) data to estimate the propensities accurately. This leads to a critical self-contradiction: IPS is ineffective without MCAR data, even though it originally aims to learn recommenders from missing-not-at-random feedback alone. To resolve this propensity contradiction, we derive a propensity-independent generalization error bound and propose a novel algorithm that minimizes this theoretical bound via adversarial learning. Our theory and algorithm require no propensity estimation procedure, thereby yielding a well-performing rating predictor without the true propensity information. Extensive experiments demonstrate that the proposed approach outperforms a range of existing methods on both rating prediction and ranking metrics in practical settings without MCAR data.
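To make the IPS idea concrete, the sketch below shows how an IPS-weighted mean squared error reweights each observed rating by the inverse of its observation probability, yielding an unbiased estimate of the loss over all user-item pairs. This is a minimal illustration of standard IPS estimation, not the paper's proposed propensity-free algorithm; all names and the toy data are illustrative assumptions.

```python
import numpy as np

def ips_mse(ratings, predictions, observed, propensities):
    """IPS estimate of the MSE over ALL user-item pairs.

    Each observed squared error is reweighted by 1 / P(observed),
    correcting for missing-not-at-random (MNAR) selection bias.
    """
    sq_err = (ratings - predictions) ** 2
    return np.sum(observed * sq_err / propensities) / ratings.size

# Toy example: 2 users x 3 items (hypothetical values).
ratings      = np.array([[5.0, 3.0, 1.0], [4.0, 2.0, 5.0]])
predictions  = np.array([[4.5, 3.0, 2.0], [4.0, 3.0, 4.0]])
observed     = np.array([[1.0, 0.0, 1.0], [1.0, 1.0, 0.0]])  # MNAR mask
propensities = np.array([[0.8, 0.2, 0.5], [0.7, 0.4, 0.3]])  # P(observed)

loss = ips_mse(ratings, predictions, observed, propensities)
```

Note that the weights depend on the propensities being known or well estimated; this sensitivity is exactly the "propensity contradiction" the paper targets.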