We propose a penalized likelihood method to fit the bivariate categorical response regression model. Our method allows practitioners to estimate which predictors are irrelevant, which predictors affect only the marginal distributions of the bivariate response, and which predictors affect both the marginal distributions and the log odds ratios. To compute our estimator, we propose an efficient first-order algorithm, which we extend to settings where some subjects have only one response variable measured, i.e., the semi-supervised setting. We derive an asymptotic error bound that illustrates the performance of our estimator in high-dimensional settings. Generalizations to the multivariate categorical response regression model are proposed. Finally, simulation studies and an application in pan-cancer risk prediction demonstrate the usefulness of our method in terms of interpretability and prediction accuracy. An R package implementing the proposed method is available for download at github.com/ajmolstad/BvCategorical.
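To make the distinction between marginal effects and log odds ratio effects concrete, the following minimal sketch computes local log odds ratios of a joint probability table for two categorical responses. It is an illustration only, not the estimator proposed here: the function names `local_log_odds_ratios` and `marginals` are hypothetical, the tables are made-up numbers, and the local (adjacent-cell) odds ratio is one common parameterization assumed for the example, which may differ from the one used in the paper.

```python
import math


def local_log_odds_ratios(joint):
    """Local log odds ratios of a J x K joint probability table.

    Entry (j, k) compares the 2 x 2 block of adjacent cells:
    log(p[j][k] * p[j+1][k+1] / (p[j][k+1] * p[j+1][k])).
    All cell probabilities are assumed to be strictly positive.
    """
    n_rows, n_cols = len(joint), len(joint[0])
    return [
        [
            math.log(joint[j][k]) + math.log(joint[j + 1][k + 1])
            - math.log(joint[j][k + 1]) - math.log(joint[j + 1][k])
            for k in range(n_cols - 1)
        ]
        for j in range(n_rows - 1)
    ]


def marginals(joint):
    """Row and column marginal distributions of a joint table."""
    row_marginal = [sum(row) for row in joint]
    col_marginal = [sum(col) for col in zip(*joint)]
    return row_marginal, col_marginal


# Two hypothetical joint distributions with identical marginals (0.5, 0.5).
independent = [[0.25, 0.25], [0.25, 0.25]]  # responses are independent
dependent = [[0.40, 0.10], [0.10, 0.40]]    # responses are associated

lor_independent = local_log_odds_ratios(independent)[0][0]
lor_dependent = local_log_odds_ratios(dependent)[0][0]
```

Both tables share the same marginal distributions, yet the log odds ratio is zero only for the independent one. In this simplified picture, a predictor that moves the marginals while leaving the log odds ratios unchanged corresponds to the "marginal only" category described above, while a predictor that changes the log odds ratios alters the association between the two responses.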