Twitter can be viewed as a data source for Natural Language Processing (NLP) tasks. The continuously updating data streams on Twitter make it challenging to trace real-time topic evolution. In this paper, we propose a framework for modeling fuzzy transitions of topic clusters. We extend our previous work on crisp cluster transitions by incorporating fuzzy logic to enrich the underlying structures identified by the framework. We apply the methodology to both computer-generated clusters of nouns from tweets and human tweet annotations. The resulting fuzzy transitions are compared with the crisp transitions on both the computer-generated clusters and the human-labeled topic sets.