Many natural language processing tasks, such as coreference resolution and semantic role labeling, require selecting text spans and making decisions about them. A typical approach to such tasks is to score all possible spans and greedily select a subset of them for task-specific downstream processing. This approach, however, does not incorporate any inductive bias about which spans ought to be selected, e.g., that selected spans tend to be syntactic constituents. In this paper, we propose a novel grammar-based structured span selection model that learns to make use of the partial span-level annotation provided for such problems. Compared to previous approaches, our approach eliminates the heuristic greedy span selection scheme, allowing us to model the downstream task on an optimal set of spans. We evaluate our model on two popular span prediction tasks, coreference resolution and semantic role labeling, and show empirical improvements on both.
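For readers unfamiliar with the baseline described above, the following is a minimal sketch of "score all possible spans and greedily select". It illustrates the heuristic scheme that the abstract argues against, not the proposed grammar-based model. The names `enumerate_spans`, `crosses`, `greedy_select`, and `toy_score`, the width limit, the `top_k` budget, and the hand-written scores are all hypothetical. The rule that rejects spans crossing an already-kept span is likewise an assumption added for illustration, since the abstract does not specify how the greedy step treats overlaps.

```python
from typing import Callable, List, Sequence, Tuple

Span = Tuple[int, int]  # token offsets: inclusive start, exclusive end


def enumerate_spans(n_tokens: int, max_width: int) -> List[Span]:
    """List every span [start, end) that covers 1..max_width tokens."""
    return [
        (start, end)
        for start in range(n_tokens)
        for end in range(start + 1, min(start + max_width, n_tokens) + 1)
    ]


def crosses(a: Span, b: Span) -> bool:
    """True if the spans partially overlap and neither contains the other."""
    return a[0] < b[0] < a[1] < b[1] or b[0] < a[0] < b[1] < a[1]


def greedy_select(
    tokens: Sequence[str],
    score: Callable[[Sequence[str], Span], float],
    max_width: int = 4,
    top_k: int = 3,
) -> List[Span]:
    """Keep the highest-scoring spans, one at a time, up to top_k.

    Assumption: a candidate is skipped if it crosses a span that was
    already kept. Each decision is local and never revisited, which is
    why the result need not be the best-scoring set overall.
    """
    candidates = enumerate_spans(len(tokens), max_width)
    ranked = sorted(candidates, key=lambda s: score(tokens, s), reverse=True)
    selected: List[Span] = []
    for span in ranked:
        if len(selected) == top_k:
            break
        if not any(crosses(span, kept) for kept in selected):
            selected.append(span)
    return sorted(selected)


def toy_score(tokens: Sequence[str], span: Span) -> float:
    """Hypothetical stand-in for a learned span scorer (hand-written values)."""
    table = {(0, 2): 3.0, (4, 6): 2.5, (1, 3): 2.0, (2, 3): 1.5}
    return table.get(span, 0.0)


tokens = ["the", "cat", "sat", "on", "the", "mat"]
print(greedy_select(tokens, toy_score))
```

In this toy run the span `(1, 3)` is discarded only because the higher-scoring `(0, 2)` was kept first. Nothing in the procedure prefers spans that form syntactic constituents, which is the missing inductive bias the abstract points to.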