In this paper we present SurvLIMEpy, an open-source Python package that implements the SurvLIME algorithm. This method computes local feature importance for machine learning algorithms designed for modelling survival analysis data. Our implementation exploits parallelisation: all computations are performed matrix-wise, which reduces execution time. Additionally, SurvLIMEpy provides visualisation tools that help the user interpret the output of the algorithm. The package supports a wide variety of survival models, from the Cox Proportional Hazards Model to deep learning models such as DeepHit or DeepSurv. Two types of experiments are presented in this paper. First, using simulated data, we study the ability of the algorithm to capture the importance of the features. Second, we use three open-source survival datasets together with a set of survival algorithms to demonstrate how SurvLIMEpy behaves when applied to different models.
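To illustrate what "matrix-wise" computation means in practice, the following is a minimal NumPy sketch of one step common to LIME-style methods: sampling neighbours around a point of interest and weighting them by proximity. It is not the SurvLIMEpy implementation or its API; the function name `neighbour_weights`, its parameters, the Gaussian sampling scheme, and the kernel-width heuristic are all assumptions made for illustration.

```python
import numpy as np


def neighbour_weights(x, num_neighbours=1000, scale=1.0, kernel_width=None, seed=0):
    """Hypothetical helper: sample neighbours around x and weight them by proximity."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)

    # Draw all neighbours at once as a (num_neighbours, p) matrix
    # instead of looping over samples in Python.
    neighbours = x + scale * rng.standard_normal((num_neighbours, x.size))

    # Euclidean distance from every neighbour to x, computed in a single call.
    distances = np.linalg.norm(neighbours - x, axis=1)

    if kernel_width is None:
        # Assumed heuristic for the kernel width; not taken from the paper.
        kernel_width = 0.75 * np.sqrt(x.size)

    # Gaussian kernel: neighbours closer to x receive weights closer to 1.
    weights = np.exp(-(distances ** 2) / kernel_width ** 2)
    return neighbours, weights


point = [0.5, -1.0, 2.0]
neighbours, weights = neighbour_weights(point, num_neighbours=200)
print(neighbours.shape, weights.shape)
```

Because the sampling, the distances, and the kernel are each evaluated on whole arrays, the work is delegated to optimised array routines rather than an interpreted loop, which is the general reason vectorised code tends to run faster.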