The development of deep learning models for medical image analysis is largely limited by the lack of large, well-annotated datasets. Unsupervised learning requires no labels and is therefore better suited to medical image analysis problems. However, most current unsupervised learning methods still depend on large datasets. To make unsupervised learning work on small datasets, we propose Swin MAE, a masked autoencoder with a Swin Transformer backbone. Even on a dataset of only a few thousand medical images, and without any pre-trained model, Swin MAE can still learn useful semantic features purely from the images themselves. In transfer learning on downstream tasks, it equals or even slightly outperforms the supervised model obtained by training Swin Transformer on ImageNet. The code is publicly available at https://github.com/Zian-Xu/Swin-MAE.
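To make the masked-autoencoder idea concrete, the sketch below shows the core recipe in PyTorch: split an image into patches, hide a large random fraction of them, encode only the visible patches, and reconstruct the full image from the encoded tokens plus learned mask tokens. Everything here (the class name `TinyMAE`, the plain Transformer encoder standing in for the Swin backbone, and all hyperparameters) is an illustrative assumption, not the authors' released implementation at the linked repository.

```python
import torch
import torch.nn as nn

class TinyMAE(nn.Module):
    """Minimal masked-autoencoder sketch (hypothetical, not the paper's code).

    Images are split into patches, a random subset is masked, the visible
    patches are encoded, and a light decoder reconstructs all patches.
    The actual Swin MAE uses a Swin Transformer encoder with shifted
    windows; a plain Transformer encoder stands in here for brevity.
    """

    def __init__(self, img_size=224, patch=16, dim=128, mask_ratio=0.75):
        super().__init__()
        self.num_patches = (img_size // patch) ** 2
        self.mask_ratio = mask_ratio
        self.patch_embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        self.pos = nn.Parameter(torch.zeros(1, self.num_patches, dim))
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True),
            num_layers=4)
        self.mask_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.decoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True),
            num_layers=2)
        self.head = nn.Linear(dim, patch * patch * 3)  # predict raw pixels

    def forward(self, x):
        B = x.size(0)
        tokens = self.patch_embed(x).flatten(2).transpose(1, 2) + self.pos
        # Randomly keep a subset of patches; the rest are masked out.
        n_keep = int(self.num_patches * (1 - self.mask_ratio))
        ids = torch.rand(B, self.num_patches, device=x.device).argsort(dim=1)
        keep = ids[:, :n_keep]
        visible = torch.gather(
            tokens, 1, keep.unsqueeze(-1).expand(-1, -1, tokens.size(-1)))
        latent = self.encoder(visible)
        # Decoder input: encoded visible tokens scattered back to their
        # positions, mask tokens everywhere else, plus position embeddings
        # so the mask tokens know where they sit.
        full = self.mask_token.expand(B, self.num_patches, -1).clone()
        full.scatter_(
            1, keep.unsqueeze(-1).expand(-1, -1, latent.size(-1)), latent)
        return self.head(self.decoder(full + self.pos))
```

In training, the loss would typically be the mean-squared error between predicted and original pixel patches, computed only at the masked positions; the high mask ratio (0.75 in the original MAE) is what forces the encoder to learn semantic structure rather than interpolate from nearby pixels.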