Factorization machines (FMs) are a powerful tool for regression and classification with sparse observations, and they have been successfully applied to collaborative filtering, especially when side information about users or items is available. Bayesian formulations of FMs have been proposed to provide confidence intervals over the model's predictions; however, they usually rely on Markov chain Monte Carlo methods that require many samples to yield accurate predictions, resulting in slow training on large-scale data. In this paper, we propose a variational formulation of factorization machines that yields a simple objective that can be optimized with standard mini-batch stochastic gradient descent, making it amenable to large-scale data. Our algorithm learns an approximate posterior distribution over the user and item parameters, which in turn yields confidence intervals over the predictions. We show on several datasets that it matches or exceeds existing methods in prediction accuracy, and we present applications to active learning strategies, e.g., preference elicitation.
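As a rough illustration of how an approximate posterior over FM parameters yields confidence intervals, the sketch below draws factor matrices from a hypothetical mean-field Gaussian posterior via the reparameterization trick and reports the Monte Carlo predictive mean and spread. All names, shapes, and the random "learned" parameters are placeholders for exposition, not the paper's actual model or implementation; only the FM interaction term itself follows the standard O(nk) form of Rendle (2010).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical mean-field Gaussian posterior over the FM factor matrix
# (n_features x k). In the variational setting, mu and log_sigma would be
# fitted by maximizing a variational objective with mini-batch SGD; here
# they are random placeholders, used only to show how prediction
# uncertainty falls out of the posterior.
n_features, k = 5, 3
mu = rng.normal(0.0, 0.1, size=(n_features, k))   # posterior means
log_sigma = np.full((n_features, k), -2.0)        # posterior log std-devs

def fm_interaction(V, x):
    """Second-order FM interaction term in its O(nk) form."""
    s = V.T @ x
    return 0.5 * (s @ s - ((V ** 2).T @ (x ** 2)).sum())

# Monte Carlo over the posterior: each sampled V gives one prediction;
# their mean and standard deviation give a point estimate and a
# confidence interval over the prediction.
x = rng.random(n_features)
samples = []
for _ in range(200):
    eps = rng.standard_normal((n_features, k))
    V = mu + np.exp(log_sigma) * eps              # reparameterization trick
    samples.append(fm_interaction(V, x))
samples = np.array(samples)
print(samples.mean(), samples.std())              # predictive mean / uncertainty
```

The same sampling scheme is what makes the objective amenable to stochastic gradients: because `V` is a deterministic function of `(mu, log_sigma)` and noise `eps`, gradients flow through the sample to the variational parameters.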