Several machine learning techniques for accurate detection of skin cancer from medical images have been reported. Many of these techniques are based on pre-trained convolutional neural networks (CNNs), which allow models to be trained on limited amounts of training data. However, the classification accuracy of these models still tends to be severely limited by the scarcity of representative images of malignant tumours. We propose a novel ensemble-based CNN architecture in which multiple CNN models, some pre-trained and some trained only on the data at hand, are combined with auxiliary data, in the form of metadata associated with the input images, using a meta-learner. The proposed approach improves the model's ability to handle limited and imbalanced data. We demonstrate the benefits of the proposed technique using a dataset of 33,126 dermoscopic images from 2,056 patients. We evaluate the performance of the proposed technique in terms of the F1-measure, the area under the ROC curve (AUC-ROC), and the area under the PR curve (AUC-PR), and compare it with that of seven benchmark methods, including two recent CNN-based techniques. The proposed technique compares favourably on all of the evaluation metrics.
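The combination step described above can be illustrated with a minimal stacking sketch. It is not the architecture from the paper: the abstract does not say which meta-learner is used, so a plain logistic regression fitted by gradient descent is assumed here, and the function names (`stack_features`, `fit_meta_learner`, `predict_proba`, `f1_score`), the two base-model probability columns, and the single metadata feature are all hypothetical toy values chosen for illustration.

```python
import math

def stack_features(base_probs, metadata):
    """Concatenate each sample's base-model probabilities with its metadata features."""
    return [list(p) + list(m) for p, m in zip(base_probs, metadata)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_meta_learner(X, y, lr=0.5, epochs=2000):
    """Fit a logistic-regression meta-learner (an assumption) with batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        # Step along the negative mean gradient of the log-loss.
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(model, X):
    """Return the meta-learner's probability of the positive (malignant) class."""
    w, b = model
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]

def f1_score(y_true, y_pred):
    """F1-measure for binary labels, with 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical toy data: outputs of two base CNNs and one scaled metadata value.
base_probs = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.2], [0.2, 0.3], [0.8, 0.7], [0.7, 0.9]]
metadata = [[0.3], [0.4], [0.2], [0.5], [0.7], [0.8]]
labels = [0, 0, 0, 0, 1, 1]  # imbalanced: two malignant cases out of six

X = stack_features(base_probs, metadata)
model = fit_meta_learner(X, labels)
preds = [1 if p >= 0.5 else 0 for p in predict_proba(model, X)]
print("F1 on the toy training data:", f1_score(labels, preds))
```

The score here is computed on the same toy samples used for fitting, purely to show the mechanics. In a real stacking setup the meta-learner is normally fitted on out-of-fold predictions of the base models so that it does not simply learn their training-set overconfidence, and the metrics are reported on held-out data.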