Heart disease has become one of the most serious diseases affecting human life. Over the last decade it has emerged as one of the leading causes of mortality worldwide. Timely and accurate diagnosis is essential to prevent further harm to patients. Recently, non-invasive approaches such as artificial intelligence-based techniques have come into use in the medical field. Machine learning in particular offers several widely used algorithms and techniques that can help diagnose heart disease accurately and in less time. However, predicting heart disease is not an easy task. The increasing size of medical datasets makes it difficult for practitioners to understand complex feature relations and make disease predictions. Accordingly, the aim of this research is to identify the most important risk factors from a high-dimensional dataset, which helps classify heart disease accurately and with fewer complications. For a broader analysis, we used two heart disease datasets with various medical features. The classification results of the benchmarked models show that relevant features have a high impact on classification accuracy. Even with a reduced number of features, the performance of the classification models improved significantly, with reduced training time, compared with models trained on the full feature set.