Progressive diseases worsen over time and are characterised by monotonic change in features that track disease progression. Here we connect ideas from two formerly separate methodologies -- event-based and hidden Markov modelling -- to derive a new generative model of disease progression. Our model can uniquely infer the most likely group-level sequence and timing of events (natural history) from limited datasets. Moreover, it can infer and predict individual-level trajectories (prognosis) even when data are missing, giving it high clinical utility. We derive the model and provide an inference scheme based on the expectation-maximisation algorithm. We use clinical, imaging and biofluid data from the Alzheimer's Disease Neuroimaging Initiative to demonstrate the validity and utility of our model. First, we train our model to uncover a new group-level sequence of feature changes in Alzheimer's disease over a period of ${\sim}17.3$ years. Next, we demonstrate that our model improves on a continuous-time hidden Markov model, increasing the area under the receiver operating characteristic curve by ${\sim}0.23$. Finally, we demonstrate that our model maintains predictive accuracy with up to $50\%$ missing data. These results support the clinical validity of our model and its broader utility in resource-limited medical applications.