Discontinuous constituent parsers have always lagged behind continuous approaches in accuracy and speed, because constituents with discontinuous yields add extra complexity to the task. However, a discontinuous tree can be converted into a continuous variant by reordering its tokens. Building on this observation, we propose to reduce discontinuous parsing to a continuous problem, which can then be solved directly by any off-the-shelf continuous parser. To that end, we develop a Pointer Network capable of accurately generating the continuous token arrangement for a given input sentence, and we define a bijective function to recover the original order. Experiments on the main benchmarks with two continuous parsers show that our approach is on par in accuracy with purely discontinuous state-of-the-art algorithms, while being considerably faster.
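
The reordering step and its inverse can be illustrated with a plain permutation. The sketch below is only an illustration under stated assumptions: the example sentence, the hand-written `order` list, and the helper names `reorder`, `invert`, and `restore` are hypothetical and do not come from the paper. In the actual approach the arrangement is predicted by the Pointer Network, and the abstract does not spell out the bijective function; an invertible permutation is simply the most direct way to show why the original order is always recoverable.

```python
from typing import List, Sequence


def reorder(tokens: Sequence[str], order: Sequence[int]) -> List[str]:
    """Place tokens in a new arrangement.

    order[i] is the ORIGINAL index of the token that ends up at position i.
    (Hypothetical helper: here the order is given by hand, not predicted.)
    """
    if sorted(order) != list(range(len(tokens))):
        raise ValueError("order must be a permutation of the token indices")
    return [tokens[j] for j in order]


def invert(order: Sequence[int]) -> List[int]:
    """Return the inverse permutation: inverse[j] is the new position of
    the token whose original index was j."""
    inverse = [0] * len(order)
    for new_pos, orig_idx in enumerate(order):
        inverse[orig_idx] = new_pos
    return inverse


def restore(reordered: Sequence[str], order: Sequence[int]) -> List[str]:
    """Undo reorder(): recover the original token sequence."""
    inverse = invert(order)
    return [reordered[inverse[j]] for j in range(len(reordered))]


# Illustrative sentence: "What" and "do" belong to the same constituent but
# are separated by "should I". Moving "What" next to "do" makes that
# constituent's yield contiguous.
tokens = ["What", "should", "I", "do", "?"]
order = [1, 2, 0, 3, 4]  # assumed arrangement, written by hand

continuous = reorder(tokens, order)    # ['should', 'I', 'What', 'do', '?']
original = restore(continuous, order)  # ['What', 'should', 'I', 'do', '?']
```

Because `order` is a permutation, `restore(reorder(tokens, order), order)` returns `tokens` for any input, which is the property that lets the output of a continuous parser be mapped back onto the original sentence.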