Pretraining has sparked a groundswell of interest in deep learning workflows that learn from limited data and improve generalization. While pretraining is common for 2D image classification tasks, its application to 3D medical imaging tasks such as chest CT interpretation remains limited. We explore whether pretraining a model on realistic videos, rather than training it from scratch, improves performance on tuberculosis type classification from chest CT scans. To incorporate both spatial and temporal features, we develop a hybrid convolutional neural network (CNN) and recurrent neural network (RNN) model: a CNN extracts features from each axial slice of the CT scan, and this sequence of slice features is input to an RNN, which classifies the scan. Our model, termed ViPTT-Net, was pretrained on over 1300 video clips labeled with human activities and then fine-tuned on chest CT scans labeled with tuberculosis type. We find that pretraining on videos leads to better representations and significantly improves validation performance, raising the kappa score from 0.17 to 0.35, especially for under-represented classes. Our best method achieved 2nd place in the ImageCLEF 2021 Tuberculosis - TBT classification task with a kappa score of 0.20 on the final test set using only image information (without clinical meta-data). All code and models are made available.
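The hybrid architecture described above (per-slice CNN features fed as a sequence to an RNN) can be illustrated with a minimal NumPy sketch. This is a toy forward pass, not the actual ViPTT-Net: the filter count, hidden size, and 5-class output are illustrative assumptions, and the paper's pretrained CNN backbone is replaced here by a single random convolution layer for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d_valid(img, kernel):
    # Naive 2D "valid" convolution: one toy CNN layer over a single slice.
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def cnn_features(slice_img, kernels):
    # Conv -> ReLU -> global average pool: one feature per filter,
    # giving a fixed-length feature vector for each axial slice.
    return np.array([np.maximum(conv2d_valid(slice_img, k), 0).mean()
                     for k in kernels])

def rnn_classify(feature_seq, Wx, Wh, Wo):
    # Vanilla RNN over the slice-feature sequence; the last hidden
    # state summarizes the whole scan and is projected to class logits.
    h = np.zeros(Wh.shape[0])
    for f in feature_seq:
        h = np.tanh(Wx @ f + Wh @ h)
    logits = Wo @ h
    e = np.exp(logits - logits.max())
    return e / e.sum()  # softmax over tuberculosis types

# Toy CT "scan": 8 axial slices of 16x16 (real scans are far larger).
scan = rng.standard_normal((8, 16, 16))
kernels = rng.standard_normal((4, 3, 3)) * 0.5   # 4 toy conv filters (assumed)
Wx = rng.standard_normal((6, 4)) * 0.5           # hidden size 6 (assumed)
Wh = rng.standard_normal((6, 6)) * 0.5
Wo = rng.standard_normal((5, 6)) * 0.5           # 5 output classes (assumed)

feature_seq = [cnn_features(s, kernels) for s in scan]  # CNN per slice
probs = rnn_classify(feature_seq, Wx, Wh, Wo)           # RNN over slices
print(probs.shape)
```

The key design point mirrored here is that the CNN is applied independently to every axial slice, so a video-pretrained 2D backbone can be reused, while the RNN captures inter-slice (the "temporal") structure.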