Machine learning and artificial intelligence can be applied to the diagnosis of chronic diseases so that precautionary treatment can begin within the critical window. Diabetes mellitus, one of the major chronic diseases, can be readily detected by several machine learning algorithms, and early-stage diagnosis is crucial to preventing dangerous consequences. In this paper we present a comparative analysis of several machine learning algorithms, namely Random Forest, Decision Tree, Artificial Neural Network, K-Nearest Neighbors, Support Vector Machine, and XGBoost, together with feature attribution using SHAP to identify the most important feature for predicting diabetes, on a dataset collected from Sylhet Hospital. In our experiments, the Random Forest algorithm outperformed all the other algorithms, reaching an accuracy of 99 percent on this particular dataset.
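To make the idea behind the feature attribution concrete, the following is a minimal sketch of the exact Shapley value computation that SHAP approximates. It is not the SHAP library and not the pipeline used in this paper: the function `shapley_values`, the scoring function `toy_model`, the feature names, and the background rows are all hypothetical and chosen only for illustration. The sketch assumes that a "missing" feature is simulated by substituting values from a small background set, which is one common convention.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, instance, background):
    """Exact Shapley attributions for one instance (illustrative helper).

    value(S) is the mean model output when the features in S take the
    instance's values and every other feature is taken from a background row.
    """
    n = len(instance)

    def value(subset):
        total = 0.0
        for row in background:
            mixed = [instance[j] if j in subset else row[j] for j in range(n)]
            total += model(mixed)
        return total / len(background)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical risk score over three binary symptoms (not a trained model).
FEATURES = ["polyuria", "polydipsia", "itching"]


def toy_model(x):
    polyuria, polydipsia, _itching = x  # the score ignores the third feature
    return 0.5 * polyuria + 0.3 * polydipsia + 0.2 * polyuria * polydipsia


# Made-up background rows standing in for a reference sample of patients.
background = [[0, 0, 0], [1, 0, 1], [0, 1, 0], [1, 1, 1]]
instance = [1, 1, 0]

phi = shapley_values(toy_model, instance, background)
baseline = sum(toy_model(row) for row in background) / len(background)

for name, contribution in zip(FEATURES, phi):
    print(f"{name}: {contribution:+.3f}")
print(f"baseline + sum(phi) = {baseline + sum(phi):.3f}")
print(f"model(instance)     = {toy_model(instance):.3f}")
```

The attributions sum to the difference between the prediction for the instance and the average prediction over the background rows, and a feature the model ignores receives zero attribution. Exact enumeration grows exponentially with the number of features, which is why approximations are used in practice for trained models such as Random Forest or XGBoost.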