The aim of reduced rank regression is to relate multiple response variables to multiple predictors. This model is very popular, especially in biostatistics, where multiple measurements on individuals can be reused to predict multiple outputs. Unfortunately, such datasets often contain missing data, making it difficult to use standard estimation tools. In this paper, we study the problem of reduced rank regression where the response matrix is incomplete. We propose a quasi-Bayesian approach to this problem, in the sense that the likelihood is replaced by a quasi-likelihood. We provide a tight oracle inequality, proving that our method is adaptive to the rank of the coefficient matrix. We describe a Langevin Monte Carlo algorithm for the computation of the posterior mean. Numerical comparisons on synthetic and real data show that our method is competitive with state-of-the-art methods in which the rank is chosen by cross-validation, and sometimes leads to an improvement.