Sentence simplification research tends to focus on generic simplification, making sentences more readable and easier to understand. This paper provides a dataset for training models that perform subject-aware sentence simplification rather than simplifying sentences as a whole. We also evaluate models on this dataset that are inspired by architectures used in abstractive summarization. We hand-generated portions of the data and augmented the dataset by further manipulating those handwritten simplifications. Our results show that data augmentation, data masking, and model architecture choices used in summarization provide a solid baseline for comparison on subject-aware simplification.