This paper addresses patient heterogeneity in prediction problems arising in biomedical applications. We propose a systematic hypothesis testing approach to determine whether patient subgroup structure exists and, if it does, the number of subgroups in the patient population. A mixture of generalized linear models is used to model the relationship between the disease outcome and patient characteristics and clinical factors, including targeted biomarker profiles. We construct a test statistic based on the expectation-maximization (EM) algorithm and derive its asymptotic distribution under the null hypothesis. An important computational advantage of the test is that the parameter estimates required under the complex alternative hypothesis can be obtained through a small number of EM iterations, rather than by fully optimizing the objective function. We demonstrate the finite-sample performance of the proposed test in terms of type-I error rate and power using extensive simulation studies. The applicability of the proposed method is illustrated through an application to a multi-center prostate cancer study.
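To make the computational idea concrete, the following is a minimal sketch and not the paper's procedure. It assumes the simplest generalized linear model (Gaussian outcome, identity link), two components with a shared variance, and a design matrix whose first column is the intercept. The function names, the perturbation `delta` used to start EM from the null fit, and the default of three iterations are all hypothetical choices made for illustration. The sketch computes a likelihood-ratio-type quantity after a few EM steps. It does not reproduce the paper's test statistic or its asymptotic null distribution.

```python
import numpy as np

def gaussian_loglik(y, mean, sigma2):
    """Pointwise Gaussian log-density of y given mean and variance."""
    return -0.5 * (np.log(2 * np.pi * sigma2) + (y - mean) ** 2 / sigma2)

def fit_null(X, y):
    """Homogeneous (single-component) linear model fitted by least squares."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    sigma2 = np.mean((y - X @ beta) ** 2)
    return beta, sigma2, gaussian_loglik(y, X @ beta, sigma2).sum()

def component_logs(X, y, pi, betas, sigma2):
    """n x K matrix of log(pi_k) + log f_k(y_i | x_i)."""
    return np.stack(
        [np.log(p) + gaussian_loglik(y, X @ b, sigma2) for p, b in zip(pi, betas)],
        axis=1,
    )

def mixture_loglik(X, y, pi, betas, sigma2):
    """Observed-data log-likelihood, computed with a stable log-sum-exp."""
    comp = component_logs(X, y, pi, betas, sigma2)
    m = comp.max(axis=1, keepdims=True)
    return float((m[:, 0] + np.log(np.exp(comp - m).sum(axis=1))).sum())

def em_few_steps(X, y, pi, betas, sigma2, n_iter=3):
    """Run a fixed, small number of EM iterations (no convergence check)."""
    for _ in range(n_iter):
        # E-step: posterior subgroup membership probabilities.
        comp = component_logs(X, y, pi, betas, sigma2)
        resp = np.exp(comp - comp.max(axis=1, keepdims=True))
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: mixing proportions, weighted least squares, shared variance.
        pi = resp.mean(axis=0)
        new_betas, sse = [], 0.0
        for k in range(resp.shape[1]):
            w = resp[:, k]
            Xw = X * w[:, None]
            b = np.linalg.solve(X.T @ Xw, Xw.T @ y)
            new_betas.append(b)
            sse += np.sum(w * (y - X @ b) ** 2)
        betas, sigma2 = new_betas, sse / len(y)
    return pi, betas, sigma2

def em_test_statistic(X, y, n_iter=3, delta=None):
    """Twice the log-likelihood gain of a two-component fit over the null fit."""
    beta0, sigma2_0, ll_null = fit_null(X, y)
    if delta is None:
        # Hypothetical start: shift every non-intercept coefficient by +/- 1.
        delta = np.ones_like(beta0)
        delta[0] = 0.0
    init_betas = [beta0 + delta, beta0 - delta]
    pi, betas, sigma2 = em_few_steps(
        X, y, np.array([0.5, 0.5]), init_betas, sigma2_0, n_iter
    )
    return 2.0 * (mixture_loglik(X, y, pi, betas, sigma2) - ll_null)
```

Because the number of EM iterations is fixed in advance, the cost of evaluating the alternative is a handful of weighted least-squares solves, which is the kind of saving the abstract describes. In practice the value returned by `em_test_statistic` would have to be compared against a null reference distribution, which this sketch leaves out.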