Medical image datasets and their annotations are not growing as fast as their counterparts in the general domain. This makes it increasingly difficult and inefficient to translate the newest, more data-intensive methods that have had a large impact on the vision field to medical imaging. In this paper, we propose a new probabilistic latent variable model for disease classification in chest X-ray images. Specifically, we consider chest X-ray datasets that contain global disease labels and, for a smaller subset, object-level expert annotations in the form of eye gaze patterns and disease bounding boxes. We propose a two-stage optimization algorithm that handles these different label granularities within a single training pipeline. In our pipeline, global dataset features are learned in the lower-level layers of the model. The specific details and nuances of the fine-grained, object-level expert annotations are learned in the final layers of the model using a knowledge distillation method inspired by conditional variational inference. Subsequently, model weights are frozen to guide this learning process and prevent overfitting on the smaller, richly annotated data subsets. The proposed method yields consistent classification improvements across different backbones on the common benchmark datasets Chest X-ray14 and MIMIC-CXR. This shows that two-stage learning of labels from coarse to fine-grained, in particular with object-level annotations, is an effective way to make better use of available annotations.
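The coarse-to-fine training schedule described in the abstract can be illustrated with a toy sketch. This is not the authors' model: it leaves out the latent variable model and the knowledge distillation step entirely, and only shows the general pattern of learning lower-level weights on a large coarsely labeled set, then freezing them while a final layer is fitted on a small finely labeled subset. The network shape, the synthetic data, the labeling rules, and all names (`features`, `train`, `coarse`, `fine`) are assumptions made for illustration.

```python
import math
import random


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def features(W, x):
    # Lower-level layer: one tanh unit per row of W.
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W]


def train(W, head, data, lr, epochs, freeze_lower):
    """SGD on a logistic loss. If freeze_lower is True, only `head` is updated."""
    for _ in range(epochs):
        for x, y in data:
            h = features(W, x)
            # Derivative of the logistic loss with respect to the logit.
            err = sigmoid(sum(a * b for a, b in zip(head, h))) - y
            if not freeze_lower:
                for j, row in enumerate(W):
                    g = err * head[j] * (1.0 - h[j] ** 2)  # backprop through tanh
                    for i in range(len(row)):
                        row[i] -= lr * g * x[i]
            for j in range(len(head)):
                head[j] -= lr * err * h[j]
    return W, head


rng = random.Random(0)


def sample(n, label_fn):
    # Hypothetical synthetic data standing in for images and labels.
    pts = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(n)]
    return [(x, label_fn(x)) for x in pts]


# Large set with coarse ("global") labels, small subset with finer labels.
coarse = sample(200, lambda x: 1.0 if x[0] + x[1] > 0 else 0.0)
fine = sample(20, lambda x: 1.0 if x[0] > 0.5 else 0.0)

# Stage 1: learn lower-level weights W (and a coarse head) on the large set.
W = [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(4)]
coarse_head = [0.0] * 4
train(W, coarse_head, coarse, lr=0.1, epochs=20, freeze_lower=False)

# Stage 2: freeze W and fit only a new final layer on the small subset.
W_snapshot = [row[:] for row in W]
fine_head = [0.0] * 4
train(W, fine_head, fine, lr=0.1, epochs=20, freeze_lower=True)
```

Freezing the lower-level weights in the second stage means the small subset can only adjust the final layer, which is one simple way to limit overfitting when few richly annotated samples are available.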