We propose a new method for multivariate response regression where the elements of the response vector can be of mixed types, for example, some continuous and some discrete. Our method is based on a model which assumes the observable mixed-type response vector is connected to a latent multivariate normal response linear regression through a link function. We explore the properties of this model and show its parameters are identifiable under reasonable conditions. We impose no parametric restrictions on the covariance of the latent normal other than positive definiteness, thereby avoiding assumptions about unobservable variables which can be difficult to verify. To accommodate this generality, we propose a novel algorithm for approximate maximum likelihood estimation that works "off-the-shelf" with many different combinations of response types, and which scales well in the dimension of the response vector. Our method typically gives better predictions and parameter estimates than fitting separate models for the different response types and allows for approximate likelihood ratio testing of relevant hypotheses such as independence of responses. The usefulness of the proposed method is illustrated in simulations and in one biomedical and one genomic data example.
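To make the model structure concrete, the following minimal sketch simulates mixed-type responses from a latent multivariate normal linear regression. It illustrates only the data-generating idea, not the proposed estimation algorithm. The function name `simulate_mixed_responses`, the particular links (identity for the continuous response, logistic for the binary one, log for the count), and all parameter values are assumptions chosen for illustration; they are not taken from the paper.

```python
import numpy as np


def simulate_mixed_responses(X, B, Sigma, rng):
    """Simulate (continuous, binary, count) responses from a latent normal regression.

    Hypothetical helper for illustration only; the link choices are assumptions.
    """
    n = X.shape[0]
    # Latent layer: W_i ~ N(B^T x_i, Sigma). The Cholesky factorization fails
    # (LinAlgError) unless Sigma is positive definite, the only restriction imposed.
    chol = np.linalg.cholesky(Sigma)
    W = X @ B + rng.standard_normal((n, Sigma.shape[0])) @ chol.T
    # Observation layer: each response depends on its own latent coordinate.
    y_cont = W[:, 0]                                          # identity link
    y_bin = rng.binomial(1, 1.0 / (1.0 + np.exp(-W[:, 1])))   # logistic link
    y_count = rng.poisson(np.exp(W[:, 2]))                    # log link
    return np.column_stack([y_cont, y_bin, y_count]), W


rng = np.random.default_rng(0)
n = 500
X = np.column_stack([np.ones(n), rng.standard_normal(n)])    # intercept + one predictor
B = np.array([[0.5, -0.2, 0.3],
              [1.0, 0.8, 0.4]])                               # 2 predictors x 3 responses
Sigma = np.array([[1.0, 0.5, 0.3],
                  [0.5, 1.0, 0.2],
                  [0.3, 0.2, 0.5]])                           # unstructured, positive definite
Y, W = simulate_mixed_responses(X, B, Sigma, rng)
```

Because the off-diagonal entries of `Sigma` are nonzero, the three observed responses are dependent through the latent vector even though each is generated from its own latent coordinate. Setting those entries to zero corresponds to the independence hypothesis mentioned above.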