We present ENHANCE, an open dataset with multiple annotations that complements the existing ISIC and PH2 skin lesion classification datasets. The dataset contains annotations of visual ABC (asymmetry, border, colour) features from non-expert annotation sources: undergraduate students, crowd workers from Amazon MTurk, and classic image processing algorithms. In this paper we first analyse the correlations between the annotations and the diagnostic label of the lesion, and study the agreement between different annotation sources. Overall, we find weak correlations between non-expert annotations and the diagnostic label, and low agreement between different annotation sources. We then study multi-task learning (MTL) with the annotations as additional labels, and show that non-expert annotations can improve (ensembles of) state-of-the-art convolutional neural networks via MTL. We hope that our dataset can be used in further research into multiple annotations and/or MTL. All data and models are available on GitHub: https://github.com/raumannsr/ENHANCE.
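To make the idea of using annotations as additional labels concrete, the following is a minimal sketch of a multi-task objective for a single lesion image. It is an illustration under stated assumptions, not the training code from the repository: the function name `multitask_loss`, the weighting parameter `ann_weight`, and the choice of binary cross-entropy for the diagnosis plus mean squared error for the ABC annotation scores are all hypothetical.

```python
import math


def multitask_loss(p_diag, y_diag, ann_pred, ann_true, ann_weight=0.5):
    """Hypothetical multi-task loss for one image.

    p_diag     -- predicted probability of the positive diagnosis (0..1)
    y_diag     -- diagnostic label, 0 or 1
    ann_pred   -- predicted annotation scores, e.g. [asymmetry, border, colour]
    ann_true   -- non-expert annotation scores in the same order
    ann_weight -- assumed trade-off between the two tasks (0 = diagnosis only)
    """
    # Clamp the probability so the logarithms stay finite.
    eps = 1e-12
    p = min(max(p_diag, eps), 1.0 - eps)

    # Main task: binary cross-entropy on the diagnostic label.
    bce = -(y_diag * math.log(p) + (1 - y_diag) * math.log(1.0 - p))

    # Auxiliary task: mean squared error on the annotation scores.
    mse = sum((a - b) ** 2 for a, b in zip(ann_pred, ann_true)) / len(ann_true)

    # Weighted sum of the two task losses.
    return (1.0 - ann_weight) * bce + ann_weight * mse
```

Setting `ann_weight=0.0` recovers a plain single-task classification loss, which gives a natural baseline for judging whether the annotations help.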