We consider the variable selection problem for the nonlinear Cox regression model. In survival analysis, one main objective is to identify the covariates that are associated with the risk of experiencing the event of interest. The Cox proportional hazards model is used extensively in survival analysis to study the relationship between survival times and covariates; the model assumes that the covariates have a log-linear effect on the hazard function. However, this linearity assumption may not be satisfied in practice. To extract a representative subset of features, various variable selection approaches have been proposed for survival data under the linear Cox model, but there is little literature on variable selection for the nonlinear Cox model. To fill this gap, we extend the recently developed deep learning-based variable selection model LassoNet to survival data. Simulations are provided to demonstrate the validity and effectiveness of the proposed method. Finally, we apply the proposed methodology to analyze a real data set on diffuse large B-cell lymphoma.
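
For reference, the log-linear assumption described above can be written out as follows. The notation is standard textbook notation rather than notation taken from this abstract: $h_0(t)$ is an unspecified baseline hazard, $x$ is the covariate vector, and $\beta$ is the coefficient vector. The second line, in which an unspecified function $f$ replaces the linear predictor, is one common way to state the nonlinear relaxation and is an assumption about the model class meant here.

```latex
% Linear (standard) Cox model: the log hazard is linear in the covariates.
h(t \mid x) = h_0(t)\,\exp\!\left(\beta^{\top} x\right)
\quad\Longleftrightarrow\quad
\log\frac{h(t \mid x)}{h_0(t)} = \beta^{\top} x

% Nonlinear Cox model (assumed form): f is an unknown, possibly nonlinear function.
h(t \mid x) = h_0(t)\,\exp\!\left(f(x)\right)
```

Under the linear form, selecting variables amounts to finding the nonzero entries of $\beta$; under the nonlinear form, it amounts to finding the covariates on which $f$ actually depends.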