Knee osteoarthritis (KOA) is a prevalent musculoskeletal disorder that reduces mobility in seniors. In the medical field, the high cost of labelling makes sufficient annotated data scarce, which is a persistent challenge for training learning models. At present, deep neural network training depends strongly on data augmentation to improve the model's generalization capability and avoid over-fitting. However, existing data augmentation operations, such as rotation and gamma correction, are designed around the data itself and therefore do not substantially increase data diversity. In this paper, we propose a novel approach based on the Vision Transformer (ViT) model with Selective Shuffled Position Embedding (SSPE) and an ROI-exchange strategy, which produces different input sequences as a form of data augmentation for early detection of KOA (KL-0 vs KL-2). More specifically, we keep the position embeddings of ROI patches fixed and shuffle those of non-ROI patches. Then, for each input image, we randomly select other images from the training set and exchange their ROI patches, thus obtaining different input sequences. Finally, a hybrid loss function is derived by combining different loss functions with optimized weights. Experimental results show that the proposed approach is a valid method of data augmentation, as it significantly improves the model's classification performance.
Selective Shuffled Position Embedding Transformer with an ROI-Exchange Strategy for Early Detection of Knee Osteoarthritis
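The two augmentation steps described in the abstract (keeping ROI position embeddings fixed while shuffling the non-ROI ones, and exchanging ROI patches with another training image) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `selective_shuffle_positions` and `exchange_roi_patches`, the 4x4 patch grid, and the chosen ROI indices are all hypothetical, and patches are represented as plain list items rather than tensors. How the donor image is selected and how labels are handled are not specified in the abstract, so the sketch leaves them out.

```python
import random


def selective_shuffle_positions(num_patches, roi_indices, rng):
    """Return one position index per patch slot.

    ROI slots keep their own index (fixed position embedding);
    the indices of non-ROI slots are permuted among themselves.
    """
    roi = set(roi_indices)
    non_roi = [i for i in range(num_patches) if i not in roi]
    shuffled = non_roi[:]
    rng.shuffle(shuffled)  # permute only the non-ROI positions

    positions = list(range(num_patches))
    for slot, new_pos in zip(non_roi, shuffled):
        positions[slot] = new_pos
    return positions


def exchange_roi_patches(patches, donor_patches, roi_indices):
    """Return a copy of `patches` whose ROI slots come from `donor_patches`."""
    mixed = list(patches)
    for i in roi_indices:
        mixed[i] = donor_patches[i]
    return mixed


# Hypothetical setup: a 4x4 patch grid whose central 2x2 block is the ROI.
rng = random.Random(0)
roi = [5, 6, 9, 10]
positions = selective_shuffle_positions(16, roi, rng)

image_a = [f"a{i}" for i in range(16)]  # stand-ins for patch embeddings
image_b = [f"b{i}" for i in range(16)]  # a randomly chosen training image
augmented = exchange_roi_patches(image_a, image_b, roi)
```

In a ViT, `positions[i]` would select which learned position embedding is added to the patch in slot `i`, so each call yields a different input sequence while the ROI patches keep their original positional information.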