Text Simplification (TS) aims to reduce the linguistic complexity of content so that it is easier to understand. TS has attracted keen research interest, especially as approaches have shifted from manual, hand-crafted rules to automated simplification. This survey provides a comprehensive overview of TS, including a brief description of earlier approaches, a discussion of the main aspects of simplification (lexical, semantic, and syntactic), and the latest techniques used in the field. We note that research has clearly shifted towards deep learning techniques for TS, with a specific focus on solutions that combat the lack of data available for simplification. We also discuss commonly used datasets and evaluation metrics, along with related fields within Natural Language Processing (NLP), such as semantic similarity.