Structured variable selection: an application in identifying predictors of major bleeding among hospitalized hypertensive patients using oral anticoagulants for atrial fibrillation

Identifiable · Predictor/decision function · Group · Models · Group Lasso

June 10, 2022

Guanbo Wang, Sylvie Perreault, Robert W. Platt, Rui Wang, Marc Dorais, Mireille E. Schnitzer

From arXiv, 45 pages, 1 figure

Predictor identification is important in medical research because it helps clinicians better understand disease epidemiology and identify patients at higher risk of an outcome. Variable selection is often used to reduce the dimensionality of a prediction model. When conducting variable selection, it is often beneficial to take selection dependencies into account. Selection dependencies can help to improve model interpretability, increase the chance of recovering the true model, and improve the prediction accuracy of the resulting model. The latent overlapping group lasso can incorporate some types of selection dependencies into variable selection by assigning coefficients to different groups of penalties. However, when the selection dependencies are complex, there is no roadmap for how to specify the groups of penalties. Wang et al. (2021) proposed a general framework for structured variable selection and provided a condition to verify whether a penalty grouping respects a set of selection dependencies. Building on this previous work, we construct roadmaps to derive the grouping specification for some common selection dependencies and apply them to the problem of constructing a prediction model for major bleeding among hypertensive patients recently hospitalized for atrial fibrillation and then prescribed oral anticoagulants. In the application, we consider a proxy of adherence to anticoagulant medication and its interactions with dose and with oral anticoagulant type. We also consider drug-drug interactions. Our method allows for algorithmic identification of the grouping specification even under the resulting complex selection dependencies.
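
To make the idea of a grouping that "respects" a selection dependency concrete, here is a minimal sketch. It is not the roadmap or the verification condition from the paper. It only relies on the standard property of the latent overlapping group lasso that the selected variables form a union of the specified groups. The variable names (`adherence`, `dose`, `adherence:dose`) and both helper functions are hypothetical, chosen for illustration. The dependency checked is "the interaction may be selected only if both of its main effects are selected".

```python
from itertools import combinations


def possible_supports(groups):
    """Return every union of a subset of groups, including the empty set.

    Assumption: under the latent overlapping group lasso, the set of
    selected variables is a union of the specified penalty groups.
    """
    supports = set()
    for r in range(len(groups) + 1):
        for subset in combinations(groups, r):
            supports.add(frozenset().union(*subset))
    return supports


def respects_dependency(supports, interaction, parents):
    """Check that any support containing the interaction also contains its parents."""
    return all(parents <= s for s in supports if interaction in s)


# Hypothetical grouping: each main effect alone, plus one group that
# bundles the interaction together with both of its main effects.
good_groups = [
    frozenset({"adherence"}),
    frozenset({"dose"}),
    frozenset({"adherence", "dose", "adherence:dose"}),
]

# Hypothetical grouping that puts the interaction in a group of its own,
# so it could be selected without its main effects.
bad_groups = [
    frozenset({"adherence"}),
    frozenset({"dose"}),
    frozenset({"adherence:dose"}),
]

parents = {"adherence", "dose"}
good_ok = respects_dependency(possible_supports(good_groups), "adherence:dose", parents)
bad_ok = respects_dependency(possible_supports(bad_groups), "adherence:dose", parents)
print(good_ok, bad_ok)
```

Brute-force enumeration like this is only feasible for a handful of groups, which is why a general condition and algorithmic roadmaps are needed once the dependencies become complex.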
