Modern machine learning pipelines, in particular those based on deep learning (DL) models, require large amounts of labeled data. For classification problems, the most common learning paradigm consists of presenting labeled examples during training, thus providing strong supervision on what constitutes positive and negative samples. This requirement is a major obstacle for the development of DL models in radiology--in particular for cross-sectional imaging (e.g., computed tomography [CT] scans)--where labels must come from manual annotations by expert radiologists at the image or slice level. These differ from examination-level annotations, which are coarser but cheaper, and which could be extracted from radiology reports using natural language processing techniques. This work studies what kind of labels should be collected for intracranial hemorrhage detection in brain CT. We investigate whether image-level annotations should be preferred to examination-level ones. By framing this task as a multiple instance learning problem, and employing modern attention-based DL architectures, we analyze the degree to which different levels of supervision improve detection performance. We find that strong supervision (i.e., learning with local image-level annotations) and weak supervision (i.e., learning with only global examination-level labels) achieve comparable performance in examination-level hemorrhage detection (the task of selecting the images in an examination that show signs of hemorrhage) as well as in image-level hemorrhage detection (highlighting those signs within the selected images). Furthermore, we study this behavior as a function of the number of labels available during training. Our results suggest that local labels may not be necessary at all for these tasks, drastically reducing the time and cost involved in collecting and curating datasets.
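The abstract does not spell out the architecture, so the following is only a minimal sketch of one common form of attention-based multiple instance learning pooling, not the authors' model. Each examination is treated as a bag of per-image feature vectors; a small attention network scores every image, the scores are normalized with a softmax, and the bag representation is the attention-weighted average. Under weak supervision only the bag-level output would be compared with an examination-level label, while the attention weights indicate which images drove the decision. The function name `attention_mil_pool`, the parameters `v` and `w`, and all numbers are hypothetical and chosen for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_mil_pool(instance_features, v, w):
    """Pool a bag of instance features into one bag-level feature vector.

    instance_features: list of feature vectors, one per image in the examination.
    v: list of weight vectors, one per hidden unit of the attention network.
    w: list of weights mapping hidden units to a scalar attention score.

    The score of instance k is sum_j w[j] * tanh(dot(v[j], h_k)); scores are
    normalized with a softmax so the attention weights sum to 1.
    """
    scores = []
    for h in instance_features:
        hidden = [math.tanh(sum(vi * hi for vi, hi in zip(vj, h))) for vj in v]
        scores.append(sum(wj * hj for wj, hj in zip(w, hidden)))
    attn = softmax(scores)

    # Bag representation: attention-weighted average of the instance features.
    dim = len(instance_features[0])
    bag = [sum(a * h[i] for a, h in zip(attn, instance_features)) for i in range(dim)]
    return bag, attn


# Toy examination with three "images", each a 2-D feature vector (made-up numbers).
exam = [[0.1, 0.0], [2.0, 1.0], [0.2, 0.1]]
v = [[1.0, 0.5], [-0.5, 1.0]]  # hypothetical attention parameters
w = [1.0, 0.3]
bag, attn = attention_mil_pool(exam, v, w)
print("attention weights:", attn)
print("bag representation:", bag)
```

In a trained model, `v` and `w` would be learned jointly with the image encoder and a classifier applied to `bag`; here they are fixed constants so that the pooling step can be inspected in isolation.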