We introduce negative binomial matrix factorization (NBMF), a matrix factorization technique specifically designed for analyzing over-dispersed count data. It can be viewed as an extension of Poisson matrix factorization (PF) perturbed by a multiplicative term that models exposure. This term adds a degree of freedom for controlling the dispersion, making NBMF more robust to outliers. We show that NBMF makes it possible to skip traditional pre-processing stages, such as binarization, which lead to a loss of information. Two estimation approaches are presented: maximum likelihood and variational Bayes inference. We test our model on a recommendation task and show that it predicts user tastes with better precision than PF.
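To make the role of the multiplicative term concrete, the following is a minimal sketch of how a multiplicative exposure variable turns Poisson counts into over-dispersed ones. It assumes the exposure is gamma-distributed with unit mean and shape `alpha`, which is the standard gamma-Poisson construction of the negative binomial; the abstract itself does not state the parameterization, and the names `sample_counts` and `alpha` are hypothetical, chosen only for illustration.

```python
import numpy as np


def sample_counts(rate, alpha=None, rng=None):
    """Sample counts with mean `rate`.

    alpha=None  : plain Poisson counts (the PF observation model).
    alpha=float : Poisson counts whose rate is multiplied by a
                  Gamma(shape=alpha, scale=1/alpha) exposure term
                  (assumed parameterization, unit mean), which yields
                  negative binomial counts after marginalization.
    """
    rng = np.random.default_rng() if rng is None else rng
    rate = np.asarray(rate, dtype=float)
    if alpha is None:
        return rng.poisson(rate)
    # Exposure has mean 1 and variance 1/alpha, so the mean count is unchanged
    # while the variance grows from rate to rate + rate**2 / alpha.
    exposure = rng.gamma(shape=alpha, scale=1.0 / alpha, size=rate.shape)
    return rng.poisson(exposure * rate)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Illustrative nonnegative factors; the sizes and priors are arbitrary.
    W = rng.gamma(2.0, 1.0, size=(200, 5))
    H = rng.gamma(2.0, 1.0, size=(5, 300))
    rate = W @ H

    y_pf = sample_counts(rate, rng=rng)
    y_nb = sample_counts(rate, alpha=2.0, rng=rng)

    # Compare how far the counts spread around their expected value.
    print("PF   mean squared deviation from rate:", np.mean((y_pf - rate) ** 2))
    print("NBMF mean squared deviation from rate:", np.mean((y_nb - rate) ** 2))
```

Under this assumed parameterization, smaller values of `alpha` give heavier over-dispersion, and letting `alpha` grow large recovers the Poisson model, which is one way to read the extra degree of freedom mentioned in the abstract.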