The volume of information is growing rapidly with the development of the Internet and electronic information services. Time constraints make it impossible to read all of it, and even analyzing the textual data of a single field requires considerable effort. Text summarization helps to address these problems. This article presents an experiment on text summarization for the Uzbek language, using a method based on text abstracting with the TF-IDF algorithm. The TF-IDF weights are used to extract the semantically important parts of the text, and the summary is produced by applying the n-gram method to these parts. The authors evaluate the proposed method on a specially handcrafted corpus called "School corpus". The results show that the approach is effective at extracting summaries from Uzbek-language text and could be used in applications such as information retrieval and natural language processing. Overall, this research contributes to the growing body of work on text summarization in under-resourced languages.
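As a rough illustration of the kind of pipeline the abstract describes, the sketch below scores each sentence by the TF-IDF weight of its n-grams and keeps the highest-scoring sentences in their original order. It is not the authors' implementation: the abstract does not say how TF-IDF and the n-gram method are combined, so treating each sentence as a "document" and averaging n-gram weights is an assumption, and the names `split_sentences`, `ngrams`, and `summarize` and the parameters `n` and `k` are hypothetical. The tokenizer is deliberately naive and would need adapting for Uzbek orthography (for example, words written with an ASCII apostrophe).

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def ngrams(sentence, n):
    """Lowercased word n-grams of a sentence, as tuples."""
    tokens = re.findall(r"\w+", sentence.lower())
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def summarize(text, n=1, k=2):
    """Return the k highest-scoring sentences, kept in document order.

    Assumption: each sentence is treated as one document for IDF, and a
    sentence's score is the sum of tf * idf over its distinct n-grams.
    """
    sentences = split_sentences(text)
    grams = [ngrams(s, n) for s in sentences]
    total = len(sentences)

    # Document frequency: number of sentences containing each n-gram.
    df = Counter(g for sent in grams for g in set(sent))

    scores = []
    for sent in grams:
        if not sent:  # sentence shorter than n words
            scores.append(0.0)
            continue
        tf = Counter(sent)
        score = sum(
            (count / len(sent)) * math.log(total / df[g])
            for g, count in tf.items()
        )
        scores.append(score)

    # Pick the top-k sentence indices, then restore original order.
    top = sorted(range(total), key=lambda i: scores[i], reverse=True)[:k]
    return " ".join(sentences[i] for i in sorted(top))
```

For instance, `summarize("Cats sleep. Cats sleep. Dogs chase red balls.", n=1, k=1)` returns `"Dogs chase red balls."`, because the words of that sentence occur in only one sentence and therefore carry the highest IDF. Averaging by sentence length keeps long sentences from winning purely on word count, which is one of several reasonable design choices here.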