SkinCon: A skin disease dataset densely annotated by domain experts for fine-grained model debugging and analysis

Deploying artificial intelligence (AI) in high-risk settings such as healthcare requires methods that provide interpretability/explainability or allow fine-grained error analysis. Many recent methods for interpretability/explainability and fine-grained error analysis use concepts, which are meta-labels that are semantically meaningful to humans. However, only a few datasets include concept-level meta-labels, and most of these meta-labels apply to natural images that do not require domain expertise. Densely annotated datasets in medicine have focused on meta-labels that are relevant to a single disease, such as melanoma. In dermatology, skin disease is described using an established clinical lexicon that allows clinicians to describe physical exam findings to one another. To provide a medical dataset densely annotated by domain experts, with annotations useful across multiple disease processes, we developed SkinCon: a skin disease dataset densely annotated by dermatologists. SkinCon includes 3230 images from the Fitzpatrick 17k dataset, densely annotated with 48 clinical concepts, 22 of which have at least 50 images representing the concept. The concepts were chosen by two dermatologists based on the clinical descriptor terms used to describe skin lesions. Examples include "plaque", "scale", and "erosion". The same concepts were also used to label 656 skin disease images from the Diverse Dermatology Images dataset, providing an additional external dataset with diverse skin tone representation. We review the potential applications of the SkinCon dataset, such as probing models, concept-based explanations, and concept bottlenecks. Furthermore, we use SkinCon to demonstrate two of these use cases: debugging mistakes of an existing dermatology AI model with concepts, and developing interpretable models with post-hoc concept bottleneck models.
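As an illustration of the "at least 50 images" criterion mentioned above, the following is a minimal sketch of how one might select well-represented concepts from per-image annotations. The function name `frequent_concepts`, the in-memory dictionary layout, and the toy image ids are assumptions made for this example; they are not the dataset's actual file format or the authors' code.

```python
from collections import Counter


def frequent_concepts(annotations, min_images=50):
    """Return concepts annotated on at least `min_images` images.

    `annotations` maps an image id to the collection of concept names
    marked present on that image (a hypothetical layout for illustration).
    """
    counts = Counter()
    for concepts in annotations.values():
        counts.update(set(concepts))  # count each concept once per image
    return sorted(c for c, n in counts.items() if n >= min_images)


# Toy data only: three made-up images, threshold lowered to 2.
toy = {
    "img_001": {"plaque", "scale"},
    "img_002": {"plaque", "erosion"},
    "img_003": {"plaque", "scale"},
}
print(frequent_concepts(toy, min_images=2))  # ['plaque', 'scale']
```

With the default threshold of 50, the same call on real annotations would keep only concepts that appear on enough images to support per-concept analysis.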
