In this study, we use two serum SELDI (surface-enhanced laser desorption and ionization) mass spectra (MS) datasets to select features that distinguish cancerous serum samples from normal ones. We apply both feature selection and classification techniques. For feature selection, we evaluate PCA (Principal Component Analysis) and GA (Genetic Algorithm); for classification, we use LDA (Linear Discriminant Analysis) and neural networks to assess how well the selected features identify cancerous patterns. Results were obtained for two combinations of feature selection and classification techniques: PCA with a t-test for feature selection followed by LDA yielded an accuracy of 93.0233%, while GA followed by a neural network yielded an accuracy of 100%. We therefore conclude that GA is more effective than PCA for feature selection, and hence for detecting cancerous patterns.
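The abstract does not specify how the t-test enters the feature selection step. As a minimal sketch, assuming it serves as a per-feature filter that ranks m/z intensities by how strongly they separate the two classes, the idea can be illustrated with Welch's t statistic on synthetic data. The function names (`welch_t`, `rank_features`, `make_synthetic`), the synthetic spectra, and the choice of Welch's variant are all assumptions made for illustration, not the authors' implementation.

```python
import math
import random
import statistics


def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    var_a = statistics.variance(a)
    var_b = statistics.variance(b)
    se = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se


def rank_features(cancer, normal, top_k):
    """Return the indices of the top_k features with the largest |t|.

    cancer, normal: lists of spectra, each a list of intensities
    with the same number of features.
    """
    n_features = len(cancer[0])
    scores = []
    for j in range(n_features):
        t = welch_t([s[j] for s in cancer], [s[j] for s in normal])
        scores.append((abs(t), j))
    scores.sort(reverse=True)
    return [j for _, j in scores[:top_k]]


def make_synthetic(n_per_class=30, n_features=50, informative=(3, 17), seed=0):
    """Hypothetical stand-in for real spectra: Gaussian noise, with a mean
    shift added to the informative features of the 'cancer' class."""
    rng = random.Random(seed)
    normal = [[rng.gauss(0.0, 1.0) for _ in range(n_features)]
              for _ in range(n_per_class)]
    cancer = []
    for _ in range(n_per_class):
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] += 3.0  # assumed class difference, for illustration only
        cancer.append(row)
    return cancer, normal


cancer, normal = make_synthetic()
selected = rank_features(cancer, normal, top_k=2)
print(sorted(selected))
```

In a pipeline like the one described, the retained features (or principal components) would then be passed to a classifier such as LDA; that step is omitted here to keep the sketch dependency-free.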