We propose a new method for multivariate response regression and covariance estimation when elements of the response vector are of mixed types, for example some continuous and some discrete. Our method is based on a model that assumes the observable mixed-type response vector is connected to a latent multivariate normal response linear regression through a link function. We explore the properties of this model and show that its parameters are identifiable under reasonable conditions. We impose no parametric restrictions on the covariance of the latent normal other than positive definiteness, thereby avoiding assumptions about unobservable variables that can be difficult to verify in practice. To accommodate this generality, we propose a novel algorithm for approximate maximum likelihood estimation that works "off-the-shelf" with many different combinations of response types and scales well in the dimension of the response vector. Our method typically gives better predictions and parameter estimates than fitting separate models for the different response types, and it allows for approximate likelihood ratio testing of relevant hypotheses such as independence of responses. The usefulness of the proposed method is illustrated in simulations and in two data examples, one biomedical and one genomic.
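To make the latent-variable structure concrete, the following is a minimal simulation sketch, not the estimation algorithm of the paper. It assumes one continuous response with an identity link and one count response that is Poisson with a log link given the latent variable. These links, the function name `simulate_mixed_responses`, and the parameter values are illustrative assumptions that do not come from the abstract.

```python
import numpy as np

def simulate_mixed_responses(n, B, Sigma, seed=0):
    """Draw n observations with one continuous and one count response.

    Illustrative assumptions (not taken from the paper): identity link for
    the first response, log link with Poisson noise for the second.
    B has shape (p, 2); Sigma is a 2 x 2 positive definite matrix.
    """
    rng = np.random.default_rng(seed)
    p = B.shape[0]
    X = rng.normal(size=(n, p))  # predictors
    # Latent multivariate normal linear regression: W = X B + E, E ~ N(0, Sigma)
    E = rng.multivariate_normal(np.zeros(2), Sigma, size=n)
    W = X @ B + E
    y_cont = W[:, 0]                        # identity link: latent value observed directly
    y_count = rng.poisson(np.exp(W[:, 1]))  # log link: Poisson with mean exp(W_2)
    return X, np.column_stack([y_cont, y_count]), W

# Hypothetical parameter values chosen only for the example.
B = np.array([[0.5, 0.2], [-0.3, 0.4]])
Sigma = np.array([[1.0, 0.6], [0.6, 1.0]])
X, Y, W = simulate_mixed_responses(500, B, Sigma)
```

In this sketch, dependence between the continuous and the count response, beyond what the shared predictors explain, enters only through the off-diagonal entry of `Sigma`. That is why a hypothesis such as independence of responses can be phrased as a restriction on the latent covariance.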