Typhoon Rai (known locally as Typhoon Odette) was covered widely by news sources, including fake news outlets. This study homes in on that problem and builds a model that can distinguish between legitimate and illegitimate news articles. To that end, we chose the following machine learning algorithms: Logistic Regression, Random Forest, and Multinomial Naive Bayes. Bag of Words (BOW), TF-IDF, and lemmatization were implemented in the model. The models were trained and tested on 160 datasets gathered from legitimate and illegitimate sources. By combining all the machine learning techniques, the Combined BOW model reached an accuracy of 91.07%, a precision of 88.33%, a recall of 94.64%, and an F1 score of 91.38%, and the Combined TF-IDF model reached an accuracy of 91.18%, a precision of 86.89%, a recall of 94.64%, and an F1 score of 90.60%.
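The reported F1 scores can be cross-checked against the reported precision and recall. The sketch below assumes the standard definition of F1 as the harmonic mean of precision and recall; the abstract does not state the formula, and the function name `f1_from_precision_recall` is illustrative only.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return F1 as the harmonic mean of precision and recall.

    Inputs and output share the same scale (percentages here).
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall (in percent) as reported in the abstract.
reported = {
    "Combined BOW": (88.33, 94.64),
    "Combined TF-IDF": (86.89, 94.64),
}

for model_name, (precision, recall) in reported.items():
    f1 = f1_from_precision_recall(precision, recall)
    print(f"{model_name}: F1 = {f1:.2f}%")
# Combined BOW: F1 = 91.38%
# Combined TF-IDF: F1 = 90.60%
```

Under this definition, both recomputed values agree with the F1 scores stated above to two decimal places.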