We investigate a general matrix factorization for deviance-based losses, extending the ubiquitous singular value decomposition beyond squared error loss. While similar approaches have been explored before, we propose an efficient algorithm that is flexible enough to accommodate structural zeros and entry weights. Moreover, we provide theoretical support for these decompositions by (i) showing strong consistency under a generalized linear model setup, (ii) checking the adequacy of a chosen exponential family via a generalized Hosmer-Lemeshow test, and (iii) determining the rank of the decomposition via a maximum eigenvalue gap method. To further support our findings, we conduct simulation studies to assess robustness to decomposition assumptions, as well as extensive case studies using benchmark datasets from face recognition, natural language processing, network analysis, and biomedical studies. Our theoretical and empirical results indicate that the proposed decomposition is more flexible and more general than traditional methods and can provide improved performance.