Application of Machine Learning algorithms to the medical domain is an emerging trend that helps advance medical knowledge. At the same time, there is a significant lack of explainable studies that promote informed, transparent, and interpretable use of Machine Learning algorithms. In this paper, we present an explainable multi-class classification of Covid-19 mental health data. In this Machine Learning study, we aim to identify potential factors that influence personal mental health during the Covid-19 pandemic. We found that Random Forest (RF) and Gradient Boosting (GB) achieved the highest accuracies of 68.08% and 68.19%, respectively, with LIME prediction accuracies of 65.5% for RF and 61.8% for GB. We then compare a post-hoc system (Local Interpretable Model-Agnostic Explanations, or LIME) and an ante-hoc system (Gini importance) in their ability to explain the obtained Machine Learning results. To the best of the authors' knowledge, our study is the first explainable Machine Learning study of mental health data collected during the Covid-19 pandemic.