Disruptive technologies provide unparalleled opportunities to contribute to many aspects of pervasive healthcare, from the adoption of the Internet of Things through to Machine Learning (ML) techniques. As a powerful tool, ML has been widely applied in patient-centric healthcare solutions. To further improve the quality of patient care, Electronic Health Records (EHRs) are commonly adopted in healthcare facilities for analysis. Applying AI and ML to analyse EHRs for prediction and diagnostics is a crucial task, complicated by their highly unstructured, unbalanced, incomplete, and high-dimensional nature. Dimensionality reduction is a common data preprocessing technique for coping with high-dimensional EHR data; it aims to reduce the number of features in the EHR representation while improving the performance of subsequent data analysis, e.g. classification. In this work, an efficient filter-based feature selection method, namely Curvature-based Feature Selection (CFS), is presented. The proposed CFS applies the concept of Menger Curvature to rank the weights of all features in a given data set. The performance of the proposed CFS has been evaluated on four well-known EHR data sets: Cervical Cancer Risk Factors (CCRFDS), Breast Cancer Coimbra (BCCDS), Breast Tissue (BTDS), and Diabetic Retinopathy Debrecen (DRDDS). The experimental results show that the proposed CFS achieved state-of-the-art performance on these data sets against conventional PCA and other recent approaches. The source code of the proposed approach is publicly available at https://github.com/zhemingzuo/CFS.
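For readers unfamiliar with the Menger curvature mentioned above, the following is a minimal Python sketch of the quantity itself: for three points it equals four times the triangle area divided by the product of the three side lengths, i.e. the reciprocal of the circumscribed circle's radius. The helper names `menger_curvature`, `mean_curvature`, and `rank_features` are hypothetical, and scoring a feature by the mean curvature over consecutive point triples is only an illustrative assumption; this is not the authors' implementation, for which the linked repository is the reference.

```python
import math


def menger_curvature(p, q, r):
    """Menger curvature of three 2D points: 4 * area / (|pq| * |qr| * |rp|)."""
    (x1, y1), (x2, y2), (x3, y3) = p, q, r
    # The cross product gives twice the signed triangle area.
    cross = (x2 - x1) * (y3 - y1) - (y2 - y1) * (x3 - x1)
    a = math.dist(p, q)
    b = math.dist(q, r)
    c = math.dist(r, p)
    if a == 0 or b == 0 or c == 0:
        return 0.0  # coincident points: treat curvature as zero
    # 4 * area = 2 * |cross|
    return 2.0 * abs(cross) / (a * b * c)


def mean_curvature(points):
    """Average Menger curvature over consecutive triples (illustrative assumption)."""
    if len(points) < 3:
        return 0.0
    values = [
        menger_curvature(points[i], points[i + 1], points[i + 2])
        for i in range(len(points) - 2)
    ]
    return sum(values) / len(values)


def rank_features(scores):
    """Return feature indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


if __name__ == "__main__":
    # Three points on the unit circle have curvature 1 (radius 1).
    print(menger_curvature((1, 0), (0, 1), (-1, 0)))
    # Collinear points have curvature 0.
    print(menger_curvature((0, 0), (1, 1), (2, 2)))
    # Hypothetical per-feature scores ranked in descending order.
    print(rank_features([0.1, 0.9, 0.5]))
```

In a filter-based setting, scores of this kind would be computed once per feature, independently of any classifier, and the top-ranked features retained; how CFS actually constructs the points for each feature is not described in this abstract.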