Medical Visual Question Answering (VQA) combines medical artificial intelligence with the popular VQA challenges. Given a medical image and a clinically relevant question in natural language, a medical VQA system is expected to predict a plausible and convincing answer. Although general-domain VQA has been studied extensively, medical VQA still requires dedicated investigation because of its task-specific features. In the first part of this survey, we cover the publicly available medical VQA datasets to date and discuss their data sources, data quantity, and task features. In the second part, we review the approaches used in medical VQA tasks. In the last part, we analyze several medical-specific challenges for the field and discuss future research directions.