Medical Visual Question Answering (Medical-VQA) aims to answer clinical questions about radiology images, providing doctors with decision-making support. Nevertheless, current Medical-VQA models learn cross-modal representations by keeping the vision and text encoders in two separate spaces, which leads to indirect semantic alignment. In this paper, we propose UnICLAM, a Unified and Interpretable Medical-VQA model trained through Contrastive Representation Learning with Adversarial Masking. Specifically, to learn an aligned image-text representation, we first establish a unified dual-stream pre-training structure with a gradually soft parameter-sharing strategy. Technically, the proposed strategy constrains the vision and text encoders to be close in the same space, and this constraint is gradually loosened as the layer depth increases. Moreover, to better capture semantic representations, we extend adversarial masking as a unified data-augmentation strategy for the contrastive representation learning of vision and text, alleviating the meaninglessness of the commonly used random mask. Concretely, while the encoder is trained to minimize the distance between the original feature and the masked feature, the adversarial masking model is trained adversarially to maximize this distance. Furthermore, we take a closer look at the unified adversarial masking strategy and show that it improves the potential ante-hoc interpretability while retaining remarkable performance and efficiency. Experimental results on the VQA-RAD and SLAKE public benchmarks demonstrate that UnICLAM outperforms 11 existing state-of-the-art Medical-VQA models. More importantly, we additionally discuss the performance of UnICLAM in diagnosing heart failure, verifying that UnICLAM exhibits superior few-shot adaptation performance in practical disease diagnosis.
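To make the parameter-sharing idea concrete, the following is a minimal PyTorch sketch of a gradually soft parameter-sharing penalty between two layer-aligned encoders of the same architecture. The geometric decay schedule, the L2 form of the penalty, and the names `soft_sharing_penalty`, `base_weight`, and `decay` are illustrative assumptions, not the paper's exact formulation.

```python
# A minimal sketch of gradually soft parameter sharing, assuming the
# vision and text encoders are transformer stacks whose layers are
# index-aligned and shape-compatible. The penalty form and the decay
# schedule are illustrative assumptions.
import torch
import torch.nn as nn


def soft_sharing_penalty(vision_layers: nn.ModuleList,
                         text_layers: nn.ModuleList,
                         base_weight: float = 1.0,
                         decay: float = 0.5):
    """L2 penalty tying corresponding layers of the two encoders.

    The weight shrinks geometrically with depth, so lower layers are
    pulled toward a shared space while higher layers stay free to
    specialize per modality.
    """
    penalty = 0.0
    for depth, (v_layer, t_layer) in enumerate(zip(vision_layers, text_layers)):
        layer_weight = base_weight * (decay ** depth)  # loosen with depth
        for v_param, t_param in zip(v_layer.parameters(), t_layer.parameters()):
            penalty = penalty + layer_weight * (v_param - t_param).pow(2).sum()
    return penalty
```

During pre-training, such a penalty would simply be added to the main objective, e.g. `loss = vqa_loss + lam * soft_sharing_penalty(...)` (with `lam` a hypothetical trade-off weight), so that early layers remain nearly tied while deeper layers diverge per modality.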
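As a companion sketch, the alternating update below illustrates the min-max game between the encoder and the adversarial masking model: one step pulls the masked feature toward the original feature, and a second step updates the masking model to push them apart. The Gumbel-softmax relaxation, the cosine-distance form, the keep-ratio budget, and all names (`encoder`, `mask_model`, `enc_opt`, `mask_opt`, `keep_ratio`) are hedged assumptions for illustration, not the paper's exact training procedure.

```python
# A minimal sketch of the adversarial-masking objective, assuming a
# `mask_model` that scores each token and a Gumbel-softmax relaxation
# so the binary keep/mask decision stays differentiable.
import torch
import torch.nn.functional as F


def adversarial_masking_step(encoder, mask_model, tokens,
                             enc_opt, mask_opt, keep_ratio=0.75):
    # tokens: (batch, seq_len, dim) patch or word embeddings
    logits = mask_model(tokens)                              # (batch, seq_len)
    mask = F.gumbel_softmax(torch.stack([logits, -logits], dim=-1),
                            hard=True)[..., 0]               # 1 = keep, 0 = mask

    z_full = encoder(tokens)                                 # original feature
    z_masked = encoder(tokens * mask.unsqueeze(-1))          # masked feature
    dist = (1 - F.cosine_similarity(z_full, z_masked, dim=-1)).mean()

    # Encoder step: minimize the original-vs-masked feature distance.
    enc_opt.zero_grad()
    dist.backward(retain_graph=True)
    enc_opt.step()

    # Masking-model step: maximize the same distance, under a soft
    # budget that keeps roughly `keep_ratio` of the tokens visible so
    # the adversary cannot simply mask everything. Gradients that this
    # pass deposits on the encoder are discarded by the next zero_grad.
    mask_opt.zero_grad()
    budget = (mask.mean() - keep_ratio).pow(2)
    (budget - dist).backward()
    mask_opt.step()
    return dist.item()
```

The design intent of this alternating update matches the adversarial game described in the abstract: because the masking model learns to hide the tokens the encoder relies on most, the encoder is driven to be robust to semantically meaningful occlusions rather than to the uninformative random masks of standard masked augmentation.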