Healthcare providers often divide patient populations into cohorts based on shared clinical factors, such as medical history, to deliver personalized healthcare services. Clinical prediction models have adopted the same idea, which raises a key challenge: capturing both global and cohort-specific patterns while still generalizing to unseen domains. This challenge falls within the scope of domain generalization (DG). However, conventional DG approaches often struggle in clinical settings because explicit domain labels are absent and the approaches carry an inherent gap in medical knowledge. To address this, we propose UdonCare, a hierarchy-guided method that iteratively divides patients into latent domains and decouples domain-invariant (label) information from patient data. The method identifies patient domains by pruning medical ontologies (e.g., the ICD-9-CM hierarchy). On two public datasets, MIMIC-III and MIMIC-IV, UdonCare outperforms eight baselines across four clinical prediction tasks with substantial domain gaps, highlighting the untapped potential of medical knowledge for guiding clinical domain generalization.