Text Simplification (TS) is the task of converting a text into a form that is easier to read while maintaining the meaning of the original text. A sub-task of TS is Cognitive Simplification (CS): converting text into a form that is readily understood by people with cognitive disabilities, without rendering it childish or simplistic. This sub-task has yet to be explored with neural methods in NLP, and resources for it are scarce. In this paper, we present a method for incorporating knowledge from the cognitive accessibility domain into a TS model by introducing an inductive bias regarding which simplification operations to use. We show that adding this inductive bias to a TS-trained model enables it to adapt better to CS without ever seeing CS data, and to outperform a baseline model on a traditional TS benchmark. In addition, we provide a novel test dataset for CS, and analyze how CS corpora differ from existing TS corpora in the way simplification operations are applied.