Epidemiologic and genetic studies of chronic obstructive pulmonary disease (COPD) and many other complex diseases suggest subgroup disparities (e.g., by sex). We approach this problem through integrative analysis, which combines information from different views (e.g., genomics, proteomics, clinical data). Existing integrative analysis methods ignore subgroup heterogeneity, while stacking the views and accounting for subgroup heterogeneity fails to model the associations among the views. To address these analytical challenges, we propose a statistical approach for joint association and prediction that leverages the strengths of each view to identify molecular signatures that are shared by, and specific to, males and females and that contribute to the variation in COPD, measured by airway wall thickness. The proposed method, HIP (Heterogeneity in Integration and Prediction), accounts for subgroup heterogeneity, allows for sparsity in variable selection, is applicable to multi-class outcomes and to univariate or multivariate continuous outcomes, and incorporates covariate adjustment. We develop efficient algorithms in PyTorch. Our COPD analysis identified several proteins, genes, and pathways that are common to or specific to males and females; some have previously been implicated in COPD, while others could lead to new insights into sex differences in COPD mechanisms.
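The abstract does not spell out how variables are labeled as shared or subgroup-specific, so the following is only an illustrative sketch and not the HIP algorithm itself. It assumes each subgroup has a loading row per variable, applies a row-wise group soft-threshold (a common device for sparse selection, swapped in here as an assumption), and then sorts the surviving variables into shared and subgroup-specific sets. The variable names, loadings, and the threshold `lam` are all hypothetical.

```python
import math


def group_soft_threshold(row, lam):
    """Shrink a loading row by lam in L2 norm; return all zeros if its norm <= lam."""
    norm = math.sqrt(sum(v * v for v in row))
    if norm <= lam:
        return [0.0] * len(row)
    scale = 1.0 - lam / norm
    return [scale * v for v in row]


def classify_variables(loadings, lam):
    """Split variables into shared and subgroup-specific sets.

    loadings: {subgroup: {variable: [loading, ...]}} (hypothetical layout).
    A variable counts as selected in a subgroup if its thresholded row is nonzero.
    """
    selected = {}
    for subgroup, rows in loadings.items():
        selected[subgroup] = {
            name
            for name, row in rows.items()
            if any(v != 0.0 for v in group_soft_threshold(row, lam))
        }
    # Shared: selected in every subgroup.
    shared = set.intersection(*selected.values())
    # Specific: selected in this subgroup and in no other subgroup.
    specific = {}
    for subgroup, names in selected.items():
        others = set()
        for other, other_names in selected.items():
            if other != subgroup:
                others |= other_names
        specific[subgroup] = names - others
    return shared, specific


# Hypothetical loadings for three variables in two subgroups.
loadings = {
    "male": {"var_a": [0.9, -0.4], "var_b": [0.05, 0.02], "var_c": [0.6, 0.1]},
    "female": {"var_a": [0.7, 0.3], "var_b": [0.5, -0.5], "var_c": [0.01, 0.03]},
}

shared, specific = classify_variables(loadings, lam=0.2)
print("shared:", sorted(shared))
print("male-specific:", sorted(specific["male"]))
print("female-specific:", sorted(specific["female"]))
```

In this toy setting a variable with a large loading norm in both subgroups lands in the shared set, while one that survives thresholding in only one subgroup is reported as specific to it. A full implementation would estimate the loadings jointly with the outcome model rather than take them as given.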