Threat actors continue to exploit geopolitical and global public events launch aggressive campaigns propagating disinformation over the Internet. In this paper we extend our prior research in detecting disinformation using psycholinguistic and computational linguistic processes linked to deception and cybercrime to gain an understanding of the features impact the predictive outcome of machine learning models. In this paper we attempt to determine patterns of deception in disinformation in hybrid models trained on disinformation and scams, fake positive and negative online reviews, or fraud using the eXtreme Gradient Boosting machine learning algorithm. Four hybrid models are generated which are models trained on disinformation and fraud (DIS+EN), disinformation and scams (DIS+FB), disinformation and favorable fake reviews (DIS+POS) and disinformation and unfavorable fake reviews (DIS+NEG). The four hybrid models detected deception and disinformation with predictive accuracies ranging from 75% to 85%. The outcome of the models was evaluated with SHAP to determine the impact of the features.
翻译:暂无翻译