Deep learning models have shown great effectiveness in recognizing findings in medical images. However, they cannot handle the ever-changing clinical environment, which continually brings newly annotated medical data from different sources. To exploit these incoming streams of data, such models would benefit greatly from sequentially learning from new samples without forgetting previously obtained knowledge. In this paper we introduce LifeLonger, a benchmark for continual disease classification on the MedMNIST collection, by applying existing state-of-the-art continual learning methods. In particular, we consider three continual learning scenarios: task incremental learning, class incremental learning, and the newly defined cross-domain incremental learning. Task and class incremental learning of diseases address the problem of classifying new samples without re-training the models from scratch, while cross-domain incremental learning addresses the problem of dealing with datasets originating from different institutions while retaining the previously obtained knowledge. We perform a thorough analysis of the performance and examine how the well-known challenges of continual learning, such as catastrophic forgetting, exhibit themselves in this setting. The encouraging results demonstrate that continual learning has major potential to advance disease classification and to produce a more robust and efficient learning framework for clinical settings. The code repository, data partitions, and baseline results for the complete benchmark will be made publicly available.
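To make the class incremental scenario concrete, the sketch below shows one common way such a benchmark partitions a dataset's disease classes into a sequence of tasks that a model then learns one after another. This is a minimal illustration, not the paper's actual protocol: the function name, the 8-class example, and the 4-task split are assumptions for demonstration.

```python
def make_class_incremental_tasks(classes, num_tasks):
    """Partition a list of class labels into sequential, disjoint tasks.

    Each task introduces a new subset of classes; a continual learner is
    trained on the tasks in order and evaluated on all classes seen so far.
    Any remainder classes are folded into the final task.
    """
    per_task = len(classes) // num_tasks
    tasks = []
    for t in range(num_tasks):
        start = t * per_task
        end = start + per_task if t < num_tasks - 1 else len(classes)
        tasks.append(classes[start:end])
    return tasks


# Hypothetical example: 8 disease classes from one MedMNIST subset,
# split into 4 incremental tasks of 2 classes each.
labels = list(range(8))
tasks = make_class_incremental_tasks(labels, num_tasks=4)
print(tasks)  # [[0, 1], [2, 3], [4, 5], [6, 7]]
```

Cross-domain incremental learning would follow the same sequential pattern, but each step would instead introduce data from a different institution or MedMNIST subset rather than a new set of classes.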