Complex diseases are caused by a multitude of factors that may differ between patients even within the same diagnostic category. A few underlying root causes may nevertheless initiate the development of disease within each patient. We therefore focus on identifying patient-specific root causes of disease, which we equate to the sample-specific predictivity of the exogenous error terms in a structural equation model. We generalize from the linear setting to the heteroscedastic noise model where $Y = m(X) + \varepsilon\sigma(X)$ with non-linear functions $m(X)$ and $\sigma(X)$ representing the conditional mean and mean absolute deviation, respectively. This model preserves identifiability but introduces non-trivial challenges that require a customized algorithm called Generalized Root Causal Inference (GRCI) to extract the error terms correctly. GRCI recovers patient-specific root causes more accurately than existing alternatives.
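To make the heteroscedastic noise model concrete, the following is a minimal sketch of how an error term could be extracted for a single variable $Y$ with a single parent $X$. It is not the GRCI algorithm itself: the Nadaraya-Watson smoother, the fixed bandwidth, the simulated functions, and the names `kernel_smooth` and `extract_errors` are all assumptions chosen for illustration. The sketch estimates $m(X)$ by regressing $Y$ on $X$, estimates $\sigma(X)$ by regressing the absolute residuals on $X$ (matching the mean absolute deviation interpretation), and then standardizes the residuals.

```python
import numpy as np

def kernel_smooth(x_train, y_train, x_eval, bandwidth=0.2):
    """Nadaraya-Watson estimate of E[y | x] using a Gaussian kernel."""
    diffs = (x_eval[:, None] - x_train[None, :]) / bandwidth
    weights = np.exp(-0.5 * diffs**2)
    return weights @ y_train / weights.sum(axis=1)

def extract_errors(x, y, bandwidth=0.2):
    """Estimate eps in Y = m(X) + eps * sigma(X) (illustrative only)."""
    # Step 1: conditional mean m(X).
    m_hat = kernel_smooth(x, y, x, bandwidth)
    # Step 2: conditional mean absolute deviation sigma(X),
    # obtained by smoothing the absolute residuals.
    sigma_hat = kernel_smooth(x, np.abs(y - m_hat), x, bandwidth)
    # Step 3: standardized residuals approximate the error term.
    return (y - m_hat) / sigma_hat

# Simulated data; m(x) = sin(x) and sigma(x) = 0.5 + 0.5 x^2 are assumptions.
rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(-2, 2, n)
# Rescale Gaussian noise so that E|eps| = 1, making sigma(X) the
# conditional mean absolute deviation of Y around m(X).
eps = rng.standard_normal(n) / np.sqrt(2 / np.pi)
y = np.sin(x) + eps * (0.5 + 0.5 * x**2)

eps_hat = extract_errors(x, y)
corr = np.corrcoef(eps, eps_hat)[0, 1]
print(f"correlation between true and recovered errors: {corr:.3f}")
```

Dividing by the estimated $\sigma(X)$ is what separates this from the linear, homoscedastic case, where the raw residual alone would serve as the error term. In a full structural equation model the same step would have to be applied variable by variable in an appropriate order, which this single-variable sketch does not attempt.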