In early 2020, the coronavirus disease 2019 (COVID-19) epidemic swept the world, with severe consequences in China. Online rumors during the epidemic heightened people's panic about public health and social stability, so understanding and curbing the spread of online rumors is an urgent task. We therefore analyze the rumor-spreading mechanism and propose quantifying a rumor's influence by the rate at which new insiders appear. We use the search frequency of a rumor as an observable proxy for new insiders. We compute a peak coefficient and an attenuation coefficient for the search frequency, which conforms to an exponential distribution. We then design several rumor features and use these two coefficients as prediction labels. Five-fold cross-validation with mean squared error (MSE) as the loss function shows that a decision tree is well suited to predicting the peak coefficient, while a linear regression model is well suited to predicting the attenuation coefficient. Our feature analysis shows that precursor features are the most important for the peak (outbreak) coefficient, while location information and rumor entity information are the most important for the attenuation coefficient. Features that favor the outbreak usually hinder the continued spread of a rumor. Anxiety is also a crucial rumor-causing factor. Finally, we discuss how deep learning, specifically the BERT model, could be used to reduce prediction loss.
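The abstract does not give the exact definitions of the two coefficients, so the following is only a minimal sketch under stated assumptions: the search frequency is taken from the day of its peak onward and modeled as f(t) = A * exp(-lambda * t), with A read as the peak coefficient and lambda as the attenuation coefficient. The function name `fit_exponential_decay` and the sample series are hypothetical and are not taken from the paper.

```python
import math


def fit_exponential_decay(freqs):
    """Fit f(t) = peak * exp(-decay * t) by least squares on log(f).

    Assumption (not from the paper): freqs[0] is the observation at the
    peak, and t counts equally spaced steps (e.g. days) after it.
    """
    if len(freqs) < 2 or any(f <= 0 for f in freqs):
        raise ValueError("need at least two strictly positive observations")

    ts = list(range(len(freqs)))
    ys = [math.log(f) for f in freqs]  # log-transform makes the model linear
    n = len(ts)
    t_mean = sum(ts) / n
    y_mean = sum(ys) / n

    # Ordinary least squares for log(f) = intercept + slope * t
    sxx = sum((t - t_mean) ** 2 for t in ts)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean

    peak = math.exp(intercept)  # A in the assumed model
    decay = -slope              # lambda in the assumed model
    return peak, decay


# Hypothetical, noise-free search counts generated from the assumed model.
series = [100.0 * math.exp(-0.5 * t) for t in range(8)]
peak, decay = fit_exponential_decay(series)
```

Fitting in log space keeps the sketch dependency-free, but it weights small late-stage counts as heavily as the peak. A nonlinear fit on the raw counts may be preferable for noisy real data.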