International challenges have become the de facto standard for comparative assessment of image analysis algorithms on a specific task. Segmentation is so far the most widely investigated medical image processing task, but the various segmentation challenges have typically been organized in isolation, such that algorithm development was driven by the need to tackle a single specific clinical problem. We hypothesized that a method capable of performing well on multiple tasks will generalize well to a previously unseen task and potentially outperform a custom-designed solution. To investigate this hypothesis, we organized the Medical Segmentation Decathlon (MSD), a biomedical image analysis challenge in which algorithms compete across a multitude of tasks and modalities. The underlying data set was designed to explore the axes of difficulty typically encountered when dealing with medical images, such as small data sets, unbalanced labels, multi-site data and small objects. The MSD challenge confirmed that algorithms with consistently good performance on one set of tasks preserved their good average performance on a different set of previously unseen tasks. Moreover, by monitoring the MSD winner for two years, we found that this algorithm continued to generalize well to a wide range of other clinical problems, further confirming our hypothesis. Three main conclusions can be drawn from this study: (1) state-of-the-art image segmentation algorithms are mature, accurate, and generalize well when retrained on unseen tasks; (2) consistent algorithmic performance across multiple tasks is a strong surrogate for algorithmic generalizability; (3) the training of accurate AI segmentation models is now commoditized for non-AI experts.