Interpretation of machine learning models has become one of the most important research topics, driven by the need to maintain control over these algorithms and to avoid bias. Since new machine learning algorithms are published every day, there is a need for novel model-agnostic interpretation approaches that can be applied to a wide variety of models. One effective way to interpret a machine learning model is to feed it different input data and observe the resulting changes in its predictions. Using such an approach, practitioners can identify relations between data patterns and a model's decisions. This work proposes a model-agnostic interpretation approach based on the visualization of feature perturbations induced by the particle swarm optimization (PSO) algorithm. We validate our approach on publicly available datasets, showing that it enhances the interpretation of different classifiers while yielding very stable results compared with state-of-the-art algorithms.
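To make the perturbation-based idea concrete, the sketch below (not the authors' implementation) uses a plain global-best PSO loop to search for feature perturbations of a single instance that maximally change a trained classifier's predicted probabilities; the resulting per-feature perturbation magnitudes are the quantities one would visualize as an interpretation signal. The dataset, classifier, penalty term, and all hyperparameters are illustrative assumptions.

```python
# Minimal sketch: PSO-driven feature perturbation for model interpretation.
# Assumptions: scikit-learn breast-cancer data, a RandomForestClassifier,
# and a hand-rolled PSO; hyperparameters below are illustrative only.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)

x0 = X_te[0]                       # instance to interpret
p0 = model.predict_proba([x0])[0]  # original class probabilities
scale = X_tr.std(axis=0)           # feature scales, used to size perturbations

def cost(deltas):
    """Negative prediction shift: PSO minimizes, so lower cost = larger change."""
    probs = model.predict_proba(x0 + deltas)             # deltas: (n_particles, n_features)
    shift = np.abs(probs - p0).sum(axis=1)               # total change in probabilities
    penalty = 0.01 * np.abs(deltas / scale).sum(axis=1)  # prefer small perturbations
    return -(shift - penalty)

# Plain global-best PSO over the perturbation space.
n_particles, n_dims, iters = 30, x0.size, 50
w, c1, c2 = 0.7, 1.5, 1.5                                # inertia, cognitive, social weights
pos = rng.normal(0.0, 0.1, (n_particles, n_dims)) * scale
vel = np.zeros_like(pos)
pbest, pbest_cost = pos.copy(), cost(pos)
gbest = pbest[np.argmin(pbest_cost)]

for _ in range(iters):
    r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = pos + vel
    c = cost(pos)
    improved = c < pbest_cost
    pbest[improved], pbest_cost[improved] = pos[improved], c[improved]
    gbest = pbest[np.argmin(pbest_cost)]

# Features whose perturbations most change the prediction; these magnitudes
# are what would be plotted (e.g., as a bar chart) to interpret the model.
importance = np.abs(gbest) / scale
top = np.argsort(importance)[::-1][:5]
print("most influential features:", top, importance[top])
```

Because the procedure only queries `predict_proba`, it treats the classifier as a black box, which is what makes this style of perturbation analysis model-agnostic.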