EPVT: Environment-aware Prompt Vision Transformer for Domain Generalization in Skin Lesion Recognition

Skin lesion recognition using deep learning has made remarkable progress, and there is an increasing need to deploy these systems in real-world scenarios. However, recent research has revealed that deep neural networks for skin lesion recognition may depend too heavily on disease-irrelevant image artifacts (e.g., dark corners, dense hairs), leading to poor generalization in unseen environments. To address this issue, we propose a novel domain generalization method called EPVT, which embeds prompts into the vision transformer to collaboratively learn knowledge from diverse domains. Concretely, EPVT leverages a set of domain prompts, each of which acts as a domain expert, to capture domain-specific knowledge, and a shared prompt to capture general knowledge over the entire dataset. To facilitate knowledge sharing and interaction among the prompts, we introduce a domain prompt generator that enables low-rank multiplicative updates between the domain prompts and the shared prompt. A domain mixup strategy is additionally devised to reduce the co-occurring artifacts in each domain, which allows for more flexible decision margins and mitigates the issue of incorrectly assigned domain labels. Experiments on four out-of-distribution datasets and six different biased ISIC datasets demonstrate the superior generalization ability of EPVT in skin lesion recognition across various environments. Our code and dataset will be released at https://github.com/SiyuanYan1/EPVT.
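
The abstract names two mechanisms without giving formulas: a low-rank multiplicative update between the shared prompt and each domain prompt, and a domain mixup. The NumPy sketch below is one plausible reading of both, not the authors' implementation. It assumes each domain prompt is the element-wise product of the shared prompt with a rank-r matrix `u @ v`, and that mixup is a convex combination of two samples from different domains with a Beta-distributed weight. The function names, shapes, and the `alpha` value are all hypothetical choices made for illustration.

```python
import numpy as np


def generate_domain_prompt(shared_prompt, u, v):
    """Assumed form of a low-rank multiplicative update.

    shared_prompt: (prompt_len, dim) prompt shared by all domains.
    u: (prompt_len, rank) and v: (rank, dim) are per-domain factors.
    Returns a (prompt_len, dim) domain prompt.
    """
    low_rank = u @ v                 # (prompt_len, dim), rank at most `rank`
    return shared_prompt * low_rank  # element-wise (Hadamard) product


def domain_mixup(x_a, y_a, x_b, y_b, alpha=0.4, rng=None):
    """Assumed form of domain mixup: blend two samples from different domains.

    x_a, x_b: images of identical shape; y_a, y_b: one-hot label vectors.
    alpha is a hypothetical Beta concentration, not a value from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)           # mixing weight in [0, 1]
    x_mix = lam * x_a + (1.0 - lam) * x_b  # blended image
    y_mix = lam * y_a + (1.0 - lam) * y_b  # blended soft label
    return x_mix, y_mix, lam


# Toy setup: 3 domains, prompts of 4 tokens with 8 dimensions, rank 2.
rng = np.random.default_rng(0)
num_domains, prompt_len, dim, rank = 3, 4, 8, 2
shared = rng.normal(size=(prompt_len, dim))
us = rng.normal(size=(num_domains, prompt_len, rank))
vs = rng.normal(size=(num_domains, rank, dim))

# One domain prompt per domain, all derived from the same shared prompt.
domain_prompts = np.stack(
    [generate_domain_prompt(shared, us[d], vs[d]) for d in range(num_domains)]
)
```

Under these assumptions, each domain adds only `rank * (prompt_len + dim)` parameters instead of a full `prompt_len * dim` prompt, and every domain prompt stays tied to the shared prompt through the product, which is one way the knowledge sharing described above could be realized.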
