Automatic summarization is the process of shortening a set of textual data computationally, to create a subset (a summary) that represents the most important pieces of information in the original text. Existing summarization methods can be roughly divided into two types: extractive and abstractive. An extractive summarizer explicitly selects text snippets (words, phrases, sentences, etc.) from the source document, while an abstractive summarizer generates novel text snippets to convey the most salient concepts prevalent in the source.
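To make the extractive case concrete, the following is a minimal sketch of one simple heuristic: score each sentence by the average frequency of its words across the document and return the top-scoring sentences verbatim. The word-frequency scoring, the tiny `STOPWORDS` list, and the function names are illustrative assumptions rather than a method described in the text. An abstractive summarizer is not sketched here, since generating novel text generally requires a trained language model.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list for illustration only.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "are",
             "it", "on", "for", "that", "this", "with", "as"}


def split_sentences(text):
    # Naive splitter: break on whitespace that follows ., ! or ?
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercase word tokens with stopwords removed.
    return [w for w in re.findall(r"[a-z']+", sentence.lower())
            if w not in STOPWORDS]


def extractive_summary(text, k=1):
    """Return the k highest-scoring sentences, in their original order."""
    sentences = split_sentences(text)
    # Document-level word frequencies serve as a crude salience signal.
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        # Average frequency, so long sentences are not favored by length alone.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]),
                    reverse=True)
    chosen = sorted(ranked[:k])  # restore source order
    return " ".join(sentences[i] for i in chosen)


doc = ("Cats sleep a lot. Cats and dogs are popular pets. "
       "The weather was mild yesterday.")
summary = extractive_summary(doc, k=1)
```

Because the summary is assembled only from sentences selected out of `doc`, every snippet in it appears verbatim in the source, which is the defining property of the extractive approach.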