In this paper we present SurvLIMEpy, an open-source Python package that implements the SurvLIME algorithm. This method computes local feature importance for machine learning algorithms designed to model survival analysis data. Our implementation exploits parallelisation: all computations are performed in a matrix-wise fashion, which reduces execution time. Additionally, SurvLIMEpy provides visualisation tools that help the user interpret the output of the algorithm. The package supports a wide variety of survival models, from the Cox Proportional Hazards Model to deep learning models such as DeepHit or DeepSurv. Two types of experiments are presented in this paper. First, using simulated data, we study the ability of the algorithm to capture the importance of the features. Second, we apply a set of survival algorithms to three open-source survival datasets in order to demonstrate how SurvLIMEpy behaves when applied to different models.
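To make the idea of matrix-wise local feature importance concrete, the following is a minimal NumPy sketch, not the SurvLIMEpy API. It assumes a simplified form of the local surrogate: neighbours of a point are sampled from a Gaussian, weighted with an exponential kernel, and a Cox-type linear term is fitted to the black-box log cumulative hazard by weighted least squares, with all time points weighted equally. The function name `local_cox_importance`, its parameters, and the toy black box are hypothetical and chosen only for illustration.

```python
import numpy as np


def local_cox_importance(x, log_cum_hazard, log_baseline,
                         n_neighbours=500, sigma=1.0, kernel_width=1.0, seed=0):
    """Illustrative local surrogate (hypothetical helper, not the package API).

    x              : (p,) point to explain
    log_cum_hazard : callable mapping an (n, p) matrix to (n, m) values of
                     ln H(t_j | x_k) produced by the black-box model
    log_baseline   : (m,) values of ln H0(t_j), assumed known here
    """
    rng = np.random.default_rng(seed)

    # Sample all neighbours at once: one (n, p) matrix, no Python loop.
    X = x + sigma * rng.standard_normal((n_neighbours, x.shape[0]))

    # Exponential kernel weights based on squared distance to x (assumed kernel).
    d2 = np.sum((X - x) ** 2, axis=1)
    w = np.exp(-d2 / kernel_width ** 2)

    # Under a Cox model, ln H(t | x) = ln H0(t) + b^T x. With equal time
    # weights, the least-squares target per neighbour is the time average
    # of ln H(t_j | x_k) - ln H0(t_j).
    y = (log_cum_hazard(X) - log_baseline).mean(axis=1)

    # Weighted least squares via the normal equations (X^T W X) b = X^T W y.
    Xw = X * w[:, None]
    return np.linalg.solve(Xw.T @ X, Xw.T @ y)


# Toy check: the "black box" is itself a Cox model with known coefficients,
# so the surrogate should recover them up to numerical error.
b_true = np.array([1.0, -0.5, 0.0])
log_baseline = np.log(np.linspace(0.1, 2.0, 20))


def toy_black_box(X):
    return log_baseline + (X @ b_true)[:, None]


b_hat = local_cox_importance(np.zeros(3), toy_black_box, log_baseline)
```

The sign and magnitude of each entry of `b_hat` play the role of the local importance of the corresponding feature. Because neighbours, weights, and the fit are expressed as array operations, the cost of the Python interpreter does not grow with the number of neighbours.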