This paper investigates the potential use of large text-to-image (LTI) models for the automated diagnosis of a few skin conditions that are rare or that seriously lack annotated datasets. As input to the LTI model, we provide a targeted instantiation of a generic but succinct prompt structure, designed from careful observation of how standard medical textbooks describe these conditions. In this way, we pave a path toward using accessible textbook descriptions, through the lens of LTI models, for the automated diagnosis of conditions that suffer from data scarcity. Experiments show the efficacy of the proposed framework, including much better localization of the infected regions. Moreover, the framework has strong potential to generalize across medical sub-domains, both to mitigate data scarcity and to debias automated diagnostics from pervasive racial biases.