Robust self-healing prediction model for high dimensional data

Data mining techniques offer increased accuracy and the potential to detect unseen patterns, and they have therefore been widely adopted for standard classification problems. In the medical field they are often used for high-precision disease prediction, and several hybrid prediction models capable of achieving high accuracy have been proposed. Even so, most previous models fail to address efficiently the recurring issue of poor data quality, which plagues most high dimensional data and proves especially troublesome in highly sensitive medical data. This work proposes a robust self-healing (RSH) hybrid prediction model that uses the data in its entirety, removing errors and inconsistencies from it rather than discarding any records. Initial processing consists of data preparation followed by cleansing (scrubbing) through context-dependent attribute correction, which ensures that no significant amount of relevant information is lost before the feature selection and prediction phases. The prediction model is built from an ensemble of heterogeneous classifiers subjected to local boosting. A genetic-algorithm-based wrapper feature selection technique, wrapped around each respective classifier, selects the corresponding optimal feature set, which yields higher accuracy. The proposed method is compared with several existing high-performing models, and the results are analyzed.
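To make the wrapper feature selection step more concrete, the following is a minimal sketch of a genetic algorithm that searches over binary feature masks and scores each mask by the validation accuracy of a classifier trained on the selected features. It is not the authors' implementation: the nearest-centroid classifier stands in for the paper's locally boosted ensemble of heterogeneous classifiers, and every name and parameter here (`ga_select`, `make_toy_data`, the per-feature penalty of 0.01, the population size, the mutation rate, the synthetic data) is an assumption chosen for illustration.

```python
import random


def centroid_accuracy(X_train, y_train, X_val, y_val, mask):
    """Validation accuracy of a nearest-centroid classifier on the masked features."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature set cannot classify anything
    sums, counts = {}, {}
    for row, label in zip(X_train, y_train):
        acc = sums.setdefault(label, [0.0] * len(cols))
        for k, j in enumerate(cols):
            acc[k] += row[j]
        counts[label] = counts.get(label, 0) + 1
    centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}

    def sq_dist(row, centre):
        return sum((row[j] - centre[k]) ** 2 for k, j in enumerate(cols))

    correct = 0
    for row, label in zip(X_val, y_val):
        predicted = min(centroids, key=lambda c: sq_dist(row, centroids[c]))
        correct += predicted == label
    return correct / len(y_val)


def ga_select(X_train, y_train, X_val, y_val,
              pop_size=20, generations=15, mutation_rate=0.1, seed=0):
    """Return a 0/1 feature mask found by a simple GA (needs at least 2 features)."""
    rng = random.Random(seed)
    n = len(X_train[0])

    def fitness(mask):
        # Assumed fitness: accuracy minus a small penalty per selected feature.
        acc = centroid_accuracy(X_train, y_train, X_val, y_val, mask)
        return acc - 0.01 * sum(mask)

    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]      # truncation selection
        children = [ranked[0][:]]              # elitism: keep the current best
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, n - 1)        # one-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = children
        best = max(population + [best], key=fitness)
    return best


def make_toy_data(n_rows, seed):
    """Synthetic data: features 0 and 1 carry the class signal, features 2..5 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_rows):
        label = i % 2
        informative = [rng.gauss(3.0 * label, 1.0), rng.gauss(-3.0 * label, 1.0)]
        noise = [rng.gauss(0.0, 5.0) for _ in range(4)]
        X.append(informative + noise)
        y.append(label)
    return X, y


if __name__ == "__main__":
    X_train, y_train = make_toy_data(120, seed=1)
    X_val, y_val = make_toy_data(80, seed=2)
    mask = ga_select(X_train, y_train, X_val, y_val)
    print("selected mask:", mask)
    print("validation accuracy:", centroid_accuracy(X_train, y_train, X_val, y_val, mask))
```

In a wrapper scheme like this, the fitness function is where the classifier of interest is plugged in, so running the search once per base classifier would give each one its own feature subset, in the spirit of the per-classifier selection described above.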

