Heterogeneity is a hallmark of many complex diseases. It can be defined in multiple ways; one of them, heterogeneity in genetic regulations, for example the regulation of gene expressions (GEs) by copy number variations (CNVs) and methylation, has been suggested but little investigated. Heterogeneity in genetic regulations can be linked with disease severity, progression, and other traits, and is biologically important. However, its analysis can be very challenging because both sides of the regulation are high-dimensional and the signals are sparse and weak. In this article, we consider the scenario where subjects form unknown subgroups and each subgroup has its own genetic regulation relationships. Further, such heterogeneity is "guided" by a known biomarker. We develop a Multivariate Sparse Fusion (MSF) approach, which innovatively applies the penalized fusion technique to simultaneously determine the number and structure of the subgroups and the regulation relationships within each subgroup. An effective computational algorithm is developed, and extensive simulations are conducted. Analyses of heterogeneity in the GE-CNV regulations in melanoma and the GE-methylation regulations in stomach cancer using TCGA data lead to interesting findings.