Automatic speech summarization methods fall into two groups: supervised and unsupervised. Supervised methods are based on a set of features, while unsupervised methods perform summarization based on a set of rules. Latent Semantic Analysis (LSA) and Maximal Marginal Relevance (MMR) are considered the most important and best-known unsupervised methods in automatic speech summarization. This study investigates the performance of these two unsupervised methods in summarizing transcriptions of Persian broadcast news. The results show that LSA outperforms MMR in generic summarization, whereas MMR outperforms LSA in query-based summarization.
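For readers unfamiliar with MMR, the following is a minimal sketch of the greedy selection rule in its commonly cited form, not the configuration used in this study. It assumes whitespace tokenization, raw term-count vectors, and cosine similarity; the function names `cosine` and `mmr_summarize` and the parameters `k` and `lam` are hypothetical, and real Persian transcripts would need a proper tokenizer and term weighting.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_summarize(sentences, query, k=2, lam=0.5):
    """Greedily pick k sentences, trading query relevance against redundancy.

    Each candidate is scored as
        lam * sim(sentence, query) - (1 - lam) * max sim(sentence, selected)
    where lam is an illustrative trade-off weight (an assumption here).
    """
    vecs = [Counter(s.lower().split()) for s in sentences]
    q = Counter(query.lower().split())
    selected = []
    candidates = list(range(len(sentences)))

    while candidates and len(selected) < k:
        def score(i):
            # Redundancy is the highest similarity to any already chosen sentence.
            redundancy = max(
                (cosine(vecs[i], vecs[j]) for j in selected), default=0.0
            )
            return lam * cosine(vecs[i], q) - (1 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)

    # Return the chosen sentences in their original document order.
    return [sentences[i] for i in sorted(selected)]
```

In the query-based setting, `query` would be the user's query. For generic summarization, one common choice (again an assumption, not a detail taken from this study) is to pass the full document text as the query, so that relevance is measured against the document as a whole.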