In this study, we aim to identify the most effective machine learning model for accurately classifying Myers-Briggs Type Indicator (MBTI) types from Reddit posts and a Kaggle data set. We apply multi-label classification using the Binary Relevance method. We use Explainable Artificial Intelligence (XAI) approach to highlight the transparency and understandability of the process and result. To achieve this, we experiment with glass-box learning models, i.e. models designed for simplicity, transparency, and interpretability. We selected k-Nearest Neighbour, Multinomial Naive Bayes, and Logistic Regression for the glass-box models. We show that Multinomial Naive Bayes and k-Nearest Neighbour perform better if classes with Observer (S) traits are excluded, whereas Logistic Regression obtains its best results when all classes have > 550 entries.
翻译:暂无翻译