Extracting meaningful topics from texts requires properly accounting for the structure of those texts. In this paper, we analyze structured time-series documents, such as collections of news articles and series of scientific papers, in which each topic evolves over time depending on multiple past topics and is also related to the other topics at the same point in time. To this end, we propose a dynamic and static topic model that simultaneously considers the dynamic structure of temporal topic evolution and the static structure of the topic hierarchy at each point in time. In experiments on collections of scientific papers, the proposed method outperformed conventional models. Moreover, we show an example of the extracted topic structures, which we found helpful for analyzing research activities.