Computed tomography (CT) imaging can be very practical for diagnosing various diseases. However, CT data are highly diverse, since the resolution and the number of slices in a CT scan are determined by the machine and its settings. Conventional deep learning models struggle to handle such diverse data because deep neural networks require input data of a consistent shape. In this paper, we propose a novel and effective two-step approach to tackle this issue for COVID-19 symptom classification. First, the semantic feature embedding of each slice of a CT scan is extracted by a conventional backbone network. Then, we propose a long short-term memory (LSTM) and Transformer-based sub-network to learn temporal features across the slices, yielding a spatiotemporal feature representation. In this fashion, the proposed two-step LSTM model can mitigate overfitting while also improving performance. Comprehensive experiments show that the proposed two-step models not only perform well but also complement each other. More specifically, the two-step LSTM model has a lower false-negative rate, while the two-step Swin model has a lower false-positive rate. In summary, we suggest adopting a model ensemble for more stable and promising performance in real-world applications.
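The two-step idea can be illustrated with a minimal sketch in plain Python. It is not the implementation used in this work. It only shows how scans with different resolutions and slice counts can be reduced to a representation of fixed size. All names here (`embed_slice`, `TinyLSTM`, `classify_scan`, `ensemble`) are hypothetical, the chunk-mean embedding is a stand-in for a pretrained backbone, the weights are random rather than trained, and averaging probabilities is only one assumed way of forming an ensemble.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def embed_slice(slice_pixels, dim=4):
    """Step 1 stand-in: map one 2D slice of any resolution to `dim` numbers.

    A real pipeline would use a pretrained backbone network here; this toy
    version averages `dim` equal chunks of the flattened pixels.
    """
    pixels = [p for row in slice_pixels for p in row]
    n = len(pixels)  # assumed to be at least `dim`
    chunks = [pixels[k * n // dim:(k + 1) * n // dim] for k in range(dim)]
    return [sum(c) / len(c) for c in chunks]


class TinyLSTM:
    """Step 2 stand-in: a single-layer LSTM with random (untrained) weights."""

    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        # One weight matrix per gate (input, forget, candidate, output),
        # each acting on the concatenation [x; h].
        self.W = {
            g: [[rng.uniform(-0.5, 0.5) for _ in range(input_dim + hidden_dim)]
                for _ in range(hidden_dim)]
            for g in "ifgo"
        }

    def _gate(self, name, xh, activation):
        return [activation(sum(w * v for w, v in zip(row, xh)))
                for row in self.W[name]]

    def forward(self, sequence):
        """Consume a sequence of any length and return the last hidden state."""
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for x in sequence:
            xh = list(x) + h
            i = self._gate("i", xh, sigmoid)
            f = self._gate("f", xh, sigmoid)
            g = self._gate("g", xh, math.tanh)
            o = self._gate("o", xh, sigmoid)
            c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
            h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        return h


def classify_scan(scan, lstm, head_w, embed_dim=4):
    """Embed every slice, summarize the sequence, return a probability."""
    embeddings = [embed_slice(s, embed_dim) for s in scan]
    h = lstm.forward(embeddings)
    return sigmoid(sum(w * v for w, v in zip(head_w, h)))


def ensemble(probabilities):
    """Assumed ensembling rule: the plain mean of the model probabilities."""
    return sum(probabilities) / len(probabilities)


def make_scan(num_slices, height, width, seed):
    """Synthetic scan: `num_slices` slices of `height` x `width` random pixels."""
    rng = random.Random(seed)
    return [[[rng.random() for _ in range(width)] for _ in range(height)]
            for _ in range(num_slices)]


if __name__ == "__main__":
    lstm = TinyLSTM(input_dim=4, hidden_dim=3, seed=1)
    head_w = [0.3, -0.2, 0.5]
    for scan in (make_scan(3, 5, 7, seed=2), make_scan(9, 8, 8, seed=3)):
        print(len(scan), "slices ->", classify_scan(scan, lstm, head_w))
```

Because the recurrent step consumes one slice embedding at a time, the classifier head always receives a vector of length `hidden_dim`, whatever the number of slices or the slice resolution. In the setting described above, the second probability passed to `ensemble` would come from a separately trained Transformer-based model.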