State-of-the-art text simplification (TS) systems adopt end-to-end neural network models that directly generate a simplified version of the input text and usually function as a black box. Moreover, TS is usually treated as an all-purpose generic task under the assumption of homogeneity, where the same simplification is suitable for all. In recent years, however, there has been increasing recognition of the need to adapt simplification techniques to the specific needs of different target groups. In this work, we aim to advance current research on explainable and controllable TS in two ways. First, building on recently proposed work to increase the transparency of TS systems, we use a large set of (psycho-)linguistic features in combination with pre-trained language models to improve explainable complexity prediction. Second, based on the results of this preliminary task, we extend a state-of-the-art Seq2Seq TS model, ACCESS, to enable explicit control of ten attributes. Experimental results show (1) that our approach improves on the performance of state-of-the-art models for explainable complexity prediction and (2) that explicitly conditioning the Seq2Seq model on ten attributes leads to a significant improvement in performance in both within-domain and out-of-domain settings.
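To illustrate what explicit conditioning on attributes can look like in practice, the following is a minimal sketch of ACCESS-style control tokens, in which each target attribute value is discretized and prepended to the source sentence as a special token before it is passed to the Seq2Seq model. The function names, the bucket width of 0.05, and the two attribute names shown are assumptions made for illustration; the abstract does not list the ten attributes or specify how they are encoded.

```python
from typing import Dict


def bucket(value: float, step: float = 0.05) -> float:
    """Round a target ratio to the nearest multiple of `step`.

    Discretizing keeps the number of distinct control tokens small.
    The step size is an assumption for this sketch.
    """
    return round(round(value / step) * step, 2)


def add_control_tokens(source: str, attributes: Dict[str, float]) -> str:
    """Prepend one control token per attribute to the source sentence.

    `attributes` maps a (hypothetical) attribute name to the desired
    target/source ratio, e.g. 0.8 for "80% of the original length".
    """
    tokens = [f"<{name}_{bucket(ratio):.2f}>" for name, ratio in attributes.items()]
    return " ".join(tokens + [source])


if __name__ == "__main__":
    # Illustrative attribute names only; not the ten attributes of this work.
    attrs = {"NbChars": 0.8, "WordRank": 0.75}
    print(add_control_tokens("The cat perched upon the mat.", attrs))
    # <NbChars_0.80> <WordRank_0.75> The cat perched upon the mat.
```

At training time such ratios would typically be computed from each source and reference pair, while at inference time they are set by the user to steer the output toward a target audience.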