An accurate and unbiased examination of skin lesions is critical for the early diagnosis and treatment of skin cancers. The visual features of skin lesions vary significantly because skin images are collected from patients with different skin colours using a variety of devices. Recent studies have developed ensembles of convolutional neural networks (CNNs) to classify these images for early diagnosis. However, the practical use of CNNs is limited because their network structures are heavyweight and neglect contextual information. Vision transformers (ViTs) learn global features through self-attention mechanisms, but they also have comparatively large model sizes (more than 100M parameters). To address these limitations, we introduce HierAttn, a lightweight and effective neural network with hierarchical and self-attention. HierAttn applies a novel strategy that learns local and global features through a multi-stage, hierarchical network. The efficacy of HierAttn was evaluated on the dermoscopy image dataset ISIC2019 and the smartphone photo dataset PAD-UFES-20. The experimental results show that HierAttn achieves the best top-1 accuracy and AUC among state-of-the-art mobile networks, including MobileNetV3 and MobileViT. The code is available at https://github.com/anthonyweidai/HierAttn.