Over the last few years, convolutional neural networks (CNNs) have dominated computer vision thanks to their ability to extract features and their outstanding performance in classification problems, for example in the automatic analysis of X-rays. Unfortunately, these networks are considered black-box algorithms, i.e. it is not possible to understand how the algorithm reached its final result. To apply these algorithms in different fields and test how the methodology works, we need eXplainable AI techniques. Most work in the medical field focuses on binary or multiclass classification problems. However, in many real-life situations, such as chest X-rays, radiological signs of different diseases can appear at the same time, which gives rise to so-called multilabel classification problems. A difficulty of these tasks is class imbalance, i.e. the labels do not all have the same number of samples. The main contribution of this paper is a deep learning methodology for imbalanced, multilabel chest X-ray datasets. It establishes a baseline for the currently underutilised PadChest dataset and introduces a new eXplainable AI technique based on heatmaps, which also incorporates probabilities and inter-model matching. The results of our system are promising, especially considering the number of labels used. Furthermore, the heatmaps match the expected areas, i.e. they mark the areas that an expert would use to make the decision.