Early diagnosis plays a key role in the prevention and treatment of skin cancer. Several machine learning techniques for accurate classification of skin cancer from medical images have been reported. Many of these techniques are based on pre-trained convolutional neural networks (CNNs), which allow models to be trained on limited amounts of training data. However, the classification accuracy of these models still tends to be severely limited by the scarcity of representative images of malignant tumours. We propose a novel ensemble-based CNN architecture in which multiple CNN models, some pre-trained and some trained only on the data at hand, are combined with patient information (meta-data) using a meta-learner. The proposed approach improves the model's ability to handle scarce, imbalanced data. We demonstrate the benefits of the proposed technique using a dataset of 33126 dermoscopic images from 2000 patients. We evaluate the performance of the proposed technique in terms of the F1-measure, the area under the ROC curve (AUC-ROC), and the area under the PR curve (AUC-PR), and compare it with that of seven benchmark methods, including two recent CNN-based techniques. The proposed technique achieves superior performance on all three evaluation metrics (F1-measure $0.53$, AUC-PR $0.58$, AUC-ROC $0.97$).
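The ensemble idea described above can be illustrated with a minimal stacking sketch. It is not the authors' implementation: the abstract does not state the type of meta-learner, so a class-weighted logistic regression is assumed here, and the two "CNN" score columns and the single meta-data feature are synthetic stand-ins generated at random. All function names (`fit_meta_learner`, `predict_proba`, `fake_score`, `f1_score`) are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Clamp the input so math.exp cannot overflow.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(model, x):
    """Probability of the positive (malignant) class for one feature row."""
    weights, bias = model
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def fit_meta_learner(features, labels, pos_weight=1.0, lr=0.5, epochs=300):
    """Assumed meta-learner: logistic regression fitted by batch gradient
    descent. Errors on positive samples are scaled by pos_weight to
    counter class imbalance."""
    n, d = len(features), len(features[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(features, labels):
            err = predict_proba((weights, bias), x) - y
            if y == 1:
                err *= pos_weight
            grad_b += err
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def fake_score(y, rng):
    # Synthetic stand-in for one base CNN's malignancy score in [0, 1].
    return min(1.0, max(0.0, rng.gauss(0.7 if y else 0.3, 0.2)))


# Synthetic, imbalanced data: roughly 10% positives (an assumption).
rng = random.Random(0)
labels = [1 if rng.random() < 0.1 else 0 for _ in range(300)]
# Each row: [score of base model A, score of base model B, meta-data feature].
features = [[fake_score(y, rng), fake_score(y, rng), rng.random()]
            for y in labels]

train_x, train_y = features[:200], labels[:200]
test_x, test_y = features[200:], labels[200:]

n_pos = max(1, sum(train_y))
pos_weight = (len(train_y) - sum(train_y)) / n_pos

model = fit_meta_learner(train_x, train_y, pos_weight=pos_weight)
preds = [1 if predict_proba(model, x) >= 0.5 else 0 for x in test_x]
print("held-out F1 on synthetic data:", f1_score(test_y, preds))
```

In a real stacking setup the meta-learner would normally be fitted on out-of-fold predictions of the base models, so that it does not see scores produced on the base models' own training images. The simple train/held-out split here only keeps the sketch short.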