Oversight AI is an emerging concept in radiology in which the AI works in symbiosis with radiologists by continuously supporting their decision-making. Recent advances in vision-language pre-training shed light on the long-standing problems of oversight AI by enabling models to understand both visual and textual concepts and their semantic correspondences. However, vision-language pre-training has had limited success in the medical domain, because current vision-language models and learning strategies, designed for photographic images and captions, are not optimal for medical data, which are usually limited in both amount and diversity. To address this, we present medical X-VL, a self-supervised model tailored for efficient vision-language pre-training that exploits cross-attention in the common feature space of radiological images and reports in a symmetric manner. We experimentally demonstrate that the pre-trained medical X-VL model outperforms current state-of-the-art models on various vision-language tasks in the medical domain. Finally, we demonstrate practical clinical uses of our oversight AI in monitoring human errors and in diagnosing newly emerging diseases, which suggests the potential of oversight AI models for wide applicability across medical applications.