With the widespread use of social networks, detecting the topics discussed on them has become a significant challenge. Existing work is mainly based on frequent pattern mining or semantic relations and does not consider language structure. Language-structure methods aim to discover the relationships between words and how humans understand them. This paper therefore proposes a topic detection framework for social networks that builds on the concept of imitating the mental ability of word association, using the Human Word Association method. The method is evaluated on the FA-CUP dataset, a benchmark dataset in the field of topic detection. The results show that the proposed method improves on other methods in terms of topic recall and keyword F1 measure. In addition, most previous work on topic detection is limited to English, while Persian, especially in microblogs, is considered a low-resource language. A dataset of Persian-language Telegram posts has therefore been collected. Applying the proposed method to this dataset shows that it also outperforms other topic detection methods there.