Most current work in NLP relies on deep learning, which requires large amounts of training data and computational power. This paper investigates the strengths of Genetic Algorithms (GAs) for extractive summarization; we hypothesized that GAs could construct more efficient solutions for the summarization task because they are more customizable than deep learning models. To test this, we build a vocabulary set whose words are represented as an array of weights, and optimize that set of weights with the GA. The weights are combined into an overall weight for each sentence, which is then compared against a threshold to decide whether the sentence is extracted. Our results showed that the GA was able to learn a weight representation that could filter out excessive vocabulary and thus dictate sentence importance based on common English words.
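The pipeline described above (one weight per vocabulary word, a sentence-level weight, a threshold, and a GA over the weight array) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the sentence weight is the mean of its word weights, the threshold is fixed at 0.5, fitness is the F1 overlap between extracted sentence indices and a hypothetical list of reference indices, and the GA uses elitism, tournament selection, uniform crossover, and Gaussian mutation. All function names are hypothetical.

```python
import random


def tokenize(sentence):
    # Lowercase and strip basic punctuation (assumed preprocessing).
    tokens = (w.strip(".,;:!?").lower() for w in sentence.split())
    return [t for t in tokens if t]


def sentence_score(sentence, weights, vocab):
    # Assumption: a sentence's weight is the mean weight of its tokens,
    # with out-of-vocabulary tokens contributing zero.
    tokens = tokenize(sentence)
    if not tokens:
        return 0.0
    return sum(weights[vocab[t]] for t in tokens if t in vocab) / len(tokens)


def extract(sentences, weights, vocab, threshold=0.5):
    # Keep the indices of sentences whose weight reaches the threshold.
    return [i for i, s in enumerate(sentences)
            if sentence_score(s, weights, vocab) >= threshold]


def fitness(weights, sentences, vocab, reference, threshold=0.5):
    # Assumption: fitness is the F1 overlap with reference sentence indices.
    picked, ref = set(extract(sentences, weights, vocab, threshold)), set(reference)
    overlap = len(picked & ref)
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(picked), overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def evolve(sentences, vocab, reference, pop_size=30, generations=40,
           mutation_rate=0.1, seed=0):
    rng = random.Random(seed)
    n = len(vocab)

    def score(w):
        return fitness(w, sentences, vocab, reference)

    # Each individual is one candidate weight array over the vocabulary.
    population = [[rng.random() for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        children = [list(w) for w in population[:2]]  # elitism: keep the best two
        while len(children) < pop_size:
            # Tournament selection of two parents.
            a = max(rng.sample(population, 3), key=score)
            b = max(rng.sample(population, 3), key=score)
            # Uniform crossover followed by clipped Gaussian mutation.
            child = [x if rng.random() < 0.5 else y for x, y in zip(a, b)]
            child = [min(1.0, max(0.0, g + rng.gauss(0, 0.2)))
                     if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = children
    return max(population, key=score)
```

Under these assumptions, `evolve` returns a single weight array that can be passed to `extract` to select sentences from a document. Other fitness functions or sentence-weighting rules slot into the same structure.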