Non-small cell lung cancer (NSCLC) is a serious disease with a high recurrence rate after surgery. Many machine learning methods have recently been proposed for recurrence prediction. Methods that use gene data achieve high prediction accuracy but are costly. Radiomics signatures that use only computed tomography (CT) images are inexpensive, but their accuracy is relatively low. In this paper, we propose a genotype-guided radiomics (GGR) method that aims for high prediction accuracy at low cost. We use a public radiogenomics dataset of NSCLC, which includes CT images and gene data. The proposed method has two steps, each with its own model. The first is a gene estimation model, which estimates gene expression from radiomics features and deep features extracted from CT images. The second model predicts recurrence from the estimated gene expression data. The GGR method is therefore built on hybrid features that combine handcrafted features with deep learning-based features. Experiments show that the proposed GGR method improves prediction accuracy from 78.61% (existing radiomics method) and 79.14% (deep learning method) to 83.28%.
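The two-step structure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the linear gene estimator, the logistic recurrence classifier, the function names, and the synthetic data (four image features, three genes) are all assumptions chosen only to show how image features are first mapped to estimated gene expression, which then drives the recurrence prediction, so that gene data are needed only during training.

```python
import math
import random

random.seed(0)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def fit_gene_estimator(X, G, lr=0.05, epochs=500):
    """Step 1 (assumed linear): map image features to gene expression."""
    d, g = len(X[0]), len(G[0])
    W = [[0.0] * d for _ in range(g)]  # one weight vector per gene
    for _ in range(epochs):
        for j in range(g):
            errs = [dot(W[j], x) - row[j] for x, row in zip(X, G)]
            for i in range(d):
                W[j][i] -= lr * sum(e * x[i] for e, x in zip(errs, X)) / len(X)
    return W

def estimate_genes(W, X):
    return [[dot(w, x) for w in W] for x in X]

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def fit_recurrence_model(Z, y, lr=0.5, epochs=300):
    """Step 2 (assumed logistic): predict recurrence from estimated genes."""
    w, b = [0.0] * len(Z[0]), 0.0
    for _ in range(epochs):
        errs = [sigmoid(dot(w, z) + b) - t for z, t in zip(Z, y)]
        for i in range(len(w)):
            w[i] -= lr * sum(e * z[i] for e, z in zip(errs, Z)) / len(Z)
        b -= lr * sum(errs) / len(Z)
    return w, b

def predict_recurrence(w, b, Z):
    return [1 if dot(w, z) + b > 0 else 0 for z in Z]

# Synthetic stand-in data (hypothetical): genes depend linearly on image
# features, and recurrence depends on two of the genes.
A = [[1.0, 0.2, -0.6, 0.4], [-0.5, 0.8, 0.1, -0.7], [0.3, -0.4, 0.9, 0.2]]

def make_patient():
    x = [random.gauss(0, 1) for _ in range(4)]
    genes = [dot(a, x) + random.gauss(0, 0.1) for a in A]
    label = 1 if genes[0] - genes[1] > 0 else 0
    return x, genes, label

train = [make_patient() for _ in range(80)]
test = [make_patient() for _ in range(40)]
X_train, G_train, y_train = (list(col) for col in zip(*train))
X_test, _, y_test = (list(col) for col in zip(*test))

# Training uses measured genes; prediction needs image features only.
W = fit_gene_estimator(X_train, G_train)
w, b = fit_recurrence_model(estimate_genes(W, X_train), y_train)

est_genes_test = estimate_genes(W, X_test)
y_pred = predict_recurrence(w, b, est_genes_test)
accuracy = sum(p == t for p, t in zip(y_pred, y_test)) / len(y_test)
print(f"accuracy on synthetic test patients: {accuracy:.2f}")
```

In this sketch the second model is trained on the estimated gene expression rather than the measured values, so that it sees the same kind of input at training and prediction time. Whether the paper trains its second model this way is not stated in the abstract.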