The role of chest X-ray (CXR) imaging has evolved during the COVID-19 pandemic, as CXR is more cost-effective, more widely available, and faster to acquire than CT. To improve the diagnostic performance of CXR imaging, a growing number of studies have investigated whether supervised deep learning methods can provide additional support. However, supervised methods rely on a large number of labeled radiology images, and labeling is a time-consuming, complex procedure requiring expert clinician input. Due to the relative scarcity of COVID-19 patient data and the costly labeling process, self-supervised learning methods have gained momentum and have been shown to achieve results comparable to fully supervised approaches. In this work, we study the effectiveness of self-supervised learning for diagnosing COVID-19 from CXR images. We propose a multi-feature Vision Transformer (ViT) guided architecture that deploys a cross-attention mechanism to learn information from both original CXR images and the corresponding local phase-enhanced CXR images. We demonstrate that the performance of baseline self-supervised learning models can be further improved by leveraging the local phase-based enhanced CXR images. Using 10\% labeled CXR scans, the proposed model achieves 91.10\% and 96.21\% overall accuracy on a test set of 35,483 CXR images comprising healthy (8,851), regular pneumonia (6,045), and COVID-19 (18,159) scans, a significant improvement over state-of-the-art techniques. Code is available at https://github.com/endiqq/Multi-Feature-ViT
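To make the cross-attention idea concrete, the following is a minimal single-head sketch in numpy: queries come from the token stream of one image view (e.g. the original CXR) while keys and values come from the other view (e.g. the local phase-enhanced CXR). All shapes, weight initializations, and function names here are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax over the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(tokens_a, tokens_b, d_k=16, seed=0):
    """Single-head cross-attention sketch (hypothetical weights):
    queries from stream A, keys/values from stream B."""
    rng = np.random.default_rng(seed)
    d = tokens_a.shape[-1]
    # random projection matrices stand in for learned parameters
    w_q = rng.standard_normal((d, d_k)) / np.sqrt(d)
    w_k = rng.standard_normal((d, d_k)) / np.sqrt(d)
    w_v = rng.standard_normal((d, d_k)) / np.sqrt(d)
    q = tokens_a @ w_q          # (n_a, d_k)
    k = tokens_b @ w_k          # (n_b, d_k)
    v = tokens_b @ w_v          # (n_b, d_k)
    attn = softmax(q @ k.T / np.sqrt(d_k))  # (n_a, n_b)
    return attn @ v             # each A-token attends over B-tokens

# toy patch tokens: 4 tokens per view, embedding dim 32
orig_tokens = np.random.default_rng(1).standard_normal((4, 32))
phase_tokens = np.random.default_rng(2).standard_normal((4, 32))
out = cross_attention(orig_tokens, phase_tokens, d_k=16)
print(out.shape)  # (4, 16)
```

In the full architecture this would run inside ViT blocks with learned projections and multiple heads; the sketch only shows how information from the phase-enhanced view is mixed into the original view's tokens.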