This research presents ORUGA, a method for automatically optimizing the readability of any English text. The core idea is that readability depends on several factors, some of which are quantifiable (number of words, number of syllables, presence or absence of adverbs, and so on). Because these factors can be measured, we can apply a genetic learning strategy that replaces selected words with their most suitable synonyms to improve readability. In addition, the method uses multi-objective optimization techniques to preserve both the content and the form of the original text, so that neither its syntactic structure nor the semantic content of the original message is significantly distorted. An exhaustive study on a large and diverse set of texts confirms that our method improved readability in all cases without significantly altering form or meaning. The source code of this approach is available at https://github.com/jorge-martinez-gil/oruga.
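
To make the idea concrete, the following is a minimal sketch of readability-driven synonym replacement with a genetic search. It is not the ORUGA implementation. It rests on several assumptions that do not come from the abstract: the Flesch Reading Ease formula stands in for the readability measure, syllables are counted with a crude vowel-group heuristic, the `SYNONYMS` table is a hypothetical toy lexicon, and the two objectives (readability and fidelity to the original) are collapsed into a weighted sum through a `change_penalty` parameter rather than handled by a true multi-objective algorithm.

```python
import random
import re

# Hypothetical toy lexicon, for illustration only. A real system would
# draw candidates from a lexical resource or a language model.
SYNONYMS = {
    "utilize": ["use"],
    "demonstrate": ["show"],
    "approximately": ["about", "roughly"],
    "numerous": ["many"],
}


def count_syllables(word):
    # Crude heuristic: one syllable per run of vowels, never fewer than one.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_reading_ease(text):
    # Flesch Reading Ease: higher scores indicate easier text.
    words = re.findall(r"[A-Za-z']+", text)
    if not words:
        return 0.0
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * len(words) / sentences - 84.6 * syllables / len(words)


def optimize(text, generations=30, pop_size=20, change_penalty=0.5, seed=0):
    rng = random.Random(seed)
    # Alternating word and non-word tokens; joining them restores the text.
    tokens = re.findall(r"\w+|\W+", text)
    slots = [i for i, t in enumerate(tokens) if t.lower() in SYNONYMS]
    if not slots:
        return text

    def options(slot):
        return SYNONYMS[tokens[slot].lower()]

    def decode(genes):
        # Gene 0 keeps the original word; gene k > 0 picks synonym k - 1.
        # Capitalization of replaced words is not preserved in this sketch.
        out = list(tokens)
        for slot, gene in zip(slots, genes):
            if gene > 0:
                out[slot] = options(slot)[gene - 1]
        return "".join(out)

    def fitness(genes):
        # Weighted sum: reward readability, penalize each replaced word.
        changes = sum(1 for g in genes if g > 0)
        return flesch_reading_ease(decode(genes)) - change_penalty * changes

    def random_genes():
        return [rng.randint(0, len(options(s))) for s in slots]

    # Seed the population with the unchanged text so the result is never worse.
    population = [[0] * len(slots)] + [random_genes() for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # elitism: the best half survives
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            k = rng.randrange(len(slots))  # mutate one randomly chosen gene
            child[k] = rng.randint(0, len(options(slots[k])))
            children.append(child)
        population = parents + children
    return decode(max(population, key=fitness))


if __name__ == "__main__":
    original = "We utilize numerous tests to demonstrate the idea."
    rewritten = optimize(original)
    print(rewritten)
    print(flesch_reading_ease(original), flesch_reading_ease(rewritten))
```

Raising `change_penalty` makes the search more conservative about replacing words, which is one simple way to trade readability gains against fidelity to the original wording.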