The recently proposed Collaborative Metric Learning (CML) paradigm has aroused wide interest in the area of recommendation systems (RS) owing to its simplicity and effectiveness. Typically, the existing CML literature relies largely on the \textit{negative sampling} strategy to alleviate the time-consuming burden of pairwise computation. However, in this work, through a theoretical analysis, we find that negative sampling leads to a biased estimation of the generalization error. Specifically, we show that sampling-based CML introduces a bias term into the generalization bound, which is quantified by the per-user \textit{Total Variation} (TV) distance between the distribution induced by negative sampling and the ground-truth distribution. This suggests that optimizing the sampling-based CML loss function does not ensure a small generalization error, even with sufficiently large training data. Moreover, we show that this bias term vanishes when the negative sampling strategy is removed. Motivated by this, we propose an efficient sampling-free alternative for CML, named \textit{Sampling-Free Collaborative Metric Learning} (SFCML), to eliminate the sampling bias in practice. Finally, comprehensive experiments over seven benchmark datasets demonstrate the superiority of the proposed algorithm.
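The bias term above is quantified by the Total Variation distance between the negative-sampling distribution and the ground-truth distribution. As a minimal illustrative sketch (not code from the paper), the snippet below computes $\mathrm{TV}(p, q) = \tfrac{1}{2}\sum_i |p_i - q_i|$ for a hypothetical user whose true item distribution is skewed while negatives are sampled uniformly; all numbers are made up for illustration.

```python
import numpy as np

def total_variation(p, q):
    """Total Variation distance between two discrete distributions:
    TV(p, q) = 0.5 * sum_i |p_i - q_i|."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return 0.5 * np.abs(p - q).sum()

# Hypothetical example: uniform negative sampling over 4 items
# vs. a skewed ground-truth item distribution for one user.
sampled = [0.25, 0.25, 0.25, 0.25]
ground_truth = [0.70, 0.10, 0.10, 0.10]
print(total_variation(sampled, ground_truth))  # 0.45
```

A nonzero TV distance here corresponds to a nonvanishing bias term in the bound: the sampled loss and the true pairwise loss estimate different quantities, no matter how much data is observed.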