As the amount of data keeps growing, analyzing it has become a challenge. For textual data, natural language processing offers many models and methods to address this challenge, and topic modeling is one of them. Topic modeling makes it possible to determine the semantic structure of a text document. Latent Dirichlet Allocation (LDA) is the most common topic modeling method. This article explains in detail the proposed n-stage LDA method, which enables LDA to be used more effectively. Studies on English and Turkish data have demonstrated the positive effect of the method. Because the method works by reducing the number of words in the dictionary, it can be applied independently of language. The open-source code of the method and an example are available at https://github.com/anil1055/n-stage_LDA
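
The idea of shrinking the dictionary between LDA runs can be illustrated with a minimal sketch. The abstract does not state the pruning rule, so the per-topic mean-weight threshold used here is an assumption, as are the names `reduce_vocabulary`, `n_stage_filter`, and `toy_fit`. The topic model is passed in as a function so that any LDA implementation could be plugged in; `toy_fit` is only a frequency-based stand-in, not LDA. The linked repository remains the authoritative implementation.

```python
from collections import Counter
from statistics import mean
from typing import Callable, Dict, List, Set

# Hypothetical types for this sketch: a topic maps each word to its weight.
Topic = Dict[str, float]
FitFn = Callable[[List[List[str]]], List[Topic]]


def reduce_vocabulary(topics: List[Topic]) -> Set[str]:
    """Keep every word whose weight reaches the mean weight of some topic.

    The mean-weight threshold is an assumption made for illustration.
    """
    kept: Set[str] = set()
    for topic in topics:
        if not topic:
            continue
        threshold = mean(topic.values())
        kept.update(w for w, weight in topic.items() if weight >= threshold)
    return kept


def n_stage_filter(
    docs: List[List[str]], fit_topics: FitFn, prune_rounds: int
) -> List[List[str]]:
    """Repeat fit-then-prune `prune_rounds` times and return the pruned docs."""
    for _ in range(prune_rounds):
        topics = fit_topics(docs)          # fit a topic model on current docs
        vocab = reduce_vocabulary(topics)  # words that survive this round
        docs = [[w for w in doc if w in vocab] for doc in docs]
    return docs


def toy_fit(docs: List[List[str]]) -> List[Topic]:
    """Stand-in for an LDA fit: one 'topic' of relative word frequencies."""
    counts = Counter(w for doc in docs for w in doc)
    total = sum(counts.values())
    return [{w: c / total for w, c in counts.items()}]


docs = [["topic", "model", "the"], ["topic", "model", "a"], ["topic", "lda"]]
pruned = n_stage_filter(docs, toy_fit, prune_rounds=1)
print(pruned)
```

A final topic model would then be fitted on the pruned documents. Because the pruning step only looks at word weights, nothing in it depends on the language of the text.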