With the onset of the COVID-19 pandemic, ultrasound has emerged as an effective tool for bedside monitoring of patients. As a result, a large number of lung ultrasound scans have become available, which can be used for AI-based diagnosis and analysis. Several AI-based patient severity scoring models have been proposed that rely on scoring the appearance of the ultrasound scans. These models are trained using ultrasound-appearance severity scores that are manually labeled based on standardized visual features. We address the challenge of labeling every ultrasound frame in the video clips. Our contrastive learning method treats the video clip severity labels as noisy weak severity labels for individual frames, thus requiring only video-level labels. We show that it performs better than conventional training with the cross-entropy loss. We aggregate frame severity predictions to produce video severity predictions, and show that the frame-based model achieves performance comparable to a video-based TSM (Temporal Shift Module) model on a large dataset combining public and private sources.
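To make the weak-labeling idea concrete, below is a minimal sketch of how a supervised-contrastive objective could use video-level severity scores as noisy frame labels, together with one plausible way of aggregating frame predictions into a video prediction. This is an illustration under stated assumptions, not the paper's exact formulation: the function names (`weak_label_supcon_loss`, `video_severity`), the SupCon-style loss, the temperature value, and mean-probability aggregation are all assumptions introduced here.

```python
import torch
import torch.nn.functional as F

def weak_label_supcon_loss(embeddings, video_labels, temperature=0.1):
    """SupCon-style contrastive loss over frame embeddings, treating each
    frame's parent-video severity score as a (noisy) weak frame label.

    embeddings:   (N, D) frame embeddings from the encoder
    video_labels: (N,)   integer severity score of the video each frame came from
    """
    z = F.normalize(embeddings, dim=1)            # unit-norm embeddings
    sim = z @ z.t() / temperature                 # (N, N) scaled cosine similarities
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)

    # Positives: other frames sharing the same weak (video-level) severity label.
    pos_mask = (video_labels.unsqueeze(0) == video_labels.unsqueeze(1)) & ~self_mask

    # Log-softmax over all other frames; the anchor itself is excluded.
    sim = sim.masked_fill(self_mask, float("-inf"))
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)

    # Mean log-probability of positives per anchor; skip anchors with no positive.
    pos_counts = pos_mask.sum(dim=1)
    valid = pos_counts > 0
    sum_log_prob_pos = log_prob.masked_fill(~pos_mask, 0.0).sum(dim=1)
    return -(sum_log_prob_pos[valid] / pos_counts[valid]).mean()

def video_severity(frame_logits):
    """Aggregate per-frame severity logits (T, C) into one video-level
    prediction by averaging frame probabilities (one plausible choice)."""
    return F.softmax(frame_logits, dim=1).mean(dim=0).argmax().item()
```

Averaging frame probabilities is only one aggregation strategy; alternatives such as taking the maximum frame severity would weight isolated severe frames more heavily, and which choice matches the paper's aggregation is an assumption here.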