This work is motivated by the need to accurately model a vector of responses related to pediatric functional status using administrative health data from inpatient rehabilitation visits. The components of the response vector have known, structured interrelationships. To exploit these relationships in modeling, we develop a two-pronged regularization approach that borrows information across the responses. The first component of our approach encourages joint selection of the effects of each variable across possibly overlapping groups of related responses, and the second component encourages the effects for related responses to shrink towards each other. As the responses in our motivating study are not normally distributed, our approach does not rely on an assumption of multivariate normality of the responses. We show that, with an adaptive version of our penalty, our approach yields the same asymptotic distribution of estimates as if we had known in advance which variables had non-zero effects and which variables had the same effects across some outcomes. We demonstrate the performance of our method in extensive numerical studies and in an application to predicting the functional status of pediatric patients using administrative health data from a population of children with neurological injury or illness at a large children's hospital.
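As an illustration only, a two-part penalty of the kind described above could be sketched as follows. The notation is assumed for this sketch and is not taken from the source: $\beta_{jk}$ is the effect of variable $j$ on response $k$, $G_1, \dots, G_M$ are possibly overlapping groups of related responses, $E$ is a set of pairs of related responses, $\lambda_1, \lambda_2 \ge 0$ are tuning parameters, and $w_{jm}, v_{jkl}$ are weights (constant in a non-adaptive version, data-dependent in an adaptive one). The exact form used by the authors may differ.

```latex
% Hypothetical notation; a sketch, not the authors' stated penalty.
% First term: group-wise selection of variable j's effects over each group G_m.
% Second term: fusion of variable j's effects across related response pairs (k, l).
P(\beta)
  = \lambda_1 \sum_{j=1}^{p} \sum_{m=1}^{M} w_{jm}
      \bigl\lVert \beta_{j, G_m} \bigr\rVert_2
  + \lambda_2 \sum_{j=1}^{p} \sum_{(k,l) \in E} v_{jkl}
      \bigl\lvert \beta_{jk} - \beta_{jl} \bigr\rvert
```

In a form like this, the first term can set all of a variable's effects within a group to zero together, while the second term can make the effects for two related responses exactly equal, which matches the two kinds of structure named in the asymptotic result.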