Much of modern-day text simplification research focuses on sentence-level simplification, transforming original, more complex sentences into simplified versions. However, adding content can often be useful when difficult concepts and reasoning need to be explained. In this work, we present the first data-driven study of content addition in text simplification, which we call elaborative simplification. We introduce a new annotated dataset of 1.3K instances of elaborative simplification in the Newsela corpus, and analyze how entities, ideas, and concepts are elaborated through the lens of contextual specificity. We establish baselines for elaboration generation using large-scale pre-trained language models, and demonstrate that considering contextual specificity during generation can improve performance. Our results illustrate the complexities of elaborative simplification, suggesting many interesting directions for future work.