A powerful test for differentially expressed gene pathways via graph-informed structural equation modeling

A major task in genetic studies is to identify genes related to human diseases and traits, both to understand the functional characteristics of genetic mutations and to improve patient diagnosis. Beyond marginal analyses of individual genes, identifying gene pathways, i.e., sets of genes with known interactions that collectively contribute to specific biological functions, can provide more biologically meaningful results. Such gene pathway analysis can be formulated as a high-dimensional two-sample testing problem. Because gene expression datasets typically have limited sample sizes, most existing two-sample tests may have compromised power, since they ignore the auxiliary pathway information on gene interactions or incorporate it only inefficiently. We propose T2-DAG, a Hotelling's $T^2$-type test for detecting differentially expressed gene pathways, which efficiently leverages the auxiliary pathway information on gene interactions through a linear structural equation model. We establish the asymptotic distribution of the test statistic under pertinent assumptions. Simulation studies under various scenarios show that T2-DAG outperforms several representative existing methods, with well-controlled type-I error rates and substantially improved power, even with incomplete or inaccurate pathway information or unadjusted confounding effects. We also illustrate the performance of T2-DAG in an application to detect differentially expressed KEGG pathways between different stages of lung cancer.
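For orientation, the classical two-sample Hotelling's $T^2$ statistic that T2-DAG builds on can be sketched as follows. This is a minimal illustration of the textbook statistic only, not the T2-DAG procedure: it assumes equal covariance matrices in the two groups and more samples than genes, and the function name `hotelling_t2` and the simulated data are hypothetical.

```python
import numpy as np

def hotelling_t2(x, y):
    """Classical two-sample Hotelling's T^2 (textbook version, not T2-DAG).

    x, y: arrays of shape (n1, p) and (n2, p); requires n1 + n2 - 2 >= p.
    Returns the T^2 statistic and its F-scaled version.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n1, p = x.shape
    n2 = y.shape[0]
    diff = x.mean(axis=0) - y.mean(axis=0)
    # Pooled sample covariance under the equal-covariance assumption.
    pooled = ((n1 - 1) * np.cov(x, rowvar=False)
              + (n2 - 1) * np.cov(y, rowvar=False)) / (n1 + n2 - 2)
    pooled = np.atleast_2d(pooled)  # np.cov returns a 0-d array when p == 1
    t2 = (n1 * n2) / (n1 + n2) * (diff @ np.linalg.solve(pooled, diff))
    # Under H0 and multivariate normality this follows F(p, n1 + n2 - p - 1).
    f_stat = (n1 + n2 - p - 1) / (p * (n1 + n2 - 2)) * t2
    return float(t2), float(f_stat)

# Hypothetical data: 3 "genes", two groups, with a mean shift in the second group.
rng = np.random.default_rng(0)
group_a = rng.normal(size=(30, 3))
group_b = rng.normal(size=(25, 3)) + 0.8
t2, f_stat = hotelling_t2(group_a, group_b)
print(f"T^2 = {t2:.3f}, F-scaled = {f_stat:.3f}")
```

The pooled covariance must be inverted, which is why the classical test breaks down when the number of genes approaches or exceeds the sample size; this is the high-dimensional setting the abstract refers to.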

Structural Equation Modeling (SEM)

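Structural equation modeling (SEM) is a method for building, estimating, and testing models of causal relationships. A model contains observable (manifest) variables and may also contain latent variables that cannot be observed directly. SEM can serve as an alternative to multiple regression, path analysis, factor analysis, analysis of covariance, and similar methods, and it makes explicit both the effect of each individual indicator on the whole and the relationships among individual indicators.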
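As a small illustration of the linear, fully observed special case, the sketch below computes the covariance matrix implied by a linear SEM on a directed acyclic graph. It assumes no latent variables and independent noise terms; the three-node chain, the edge weights, and the function name `implied_covariance` are hypothetical choices made for the example.

```python
import numpy as np

def implied_covariance(weights, noise_var):
    """Covariance implied by the linear SEM X = W^T X + e.

    weights[k, j] is the direct effect of node k on node j (zero if no edge);
    noise_var[j] is the variance of the independent noise term e_j.
    """
    w = np.asarray(weights, dtype=float)
    p = w.shape[0]
    # Solving for X gives X = (I - W^T)^{-1} e.
    inv = np.linalg.inv(np.eye(p) - w.T)
    return inv @ np.diag(noise_var) @ inv.T

# Hypothetical chain of three genes: 0 -> 1 -> 2.
weights = np.array([
    [0.0, 0.8, 0.0],
    [0.0, 0.0, 0.5],
    [0.0, 0.0, 0.0],
])
noise_var = np.array([1.0, 1.0, 1.0])
sigma = implied_covariance(weights, noise_var)
print(np.round(sigma, 2))
```

Because the graph fixes which entries of `weights` can be nonzero, the covariance has far fewer free parameters than an unrestricted matrix, which is one way known pathway structure can be put to use.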
