Although recent advances in neural network models for coreference resolution have led to substantial improvements on benchmark datasets, transferring those models to new target domains containing many out-of-vocabulary spans and requiring differing annotation schemes remains a challenge. Typical approaches to domain adaptation involve continued training on coreference annotations in the target domain, but obtaining those annotations is costly and time-consuming. In this work, we show that adapting mention detection, rather than antecedent linking, is the key component of successful domain adaptation for coreference models. Through timed annotation experiments, we also show that annotating mentions alone is nearly twice as fast as annotating full coreference chains. Based on these insights, we propose a method for effectively adapting coreference models that requires only mention annotations in the target domain: an auxiliary mention detection objective, trained on mention examples from the target domain, which yields higher mention precision. In extensive evaluations across three English coreference datasets (CoNLL-2012 for news/conversation, i2b2/VA for medical case notes, and a dataset of child welfare case notes), we demonstrate that our approach enables sample- and time-efficient transfer to new annotation schemes and lexicons. We show that annotating mentions yields a 7-14% improvement in average F1 over annotating full coreference chains for an equivalent amount of time.
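The auxiliary mention detection objective described above can be sketched as a binary loss over candidate spans, added with a weight to the usual coreference loss. This is a minimal illustrative sketch, not the paper's implementation: the function names, the per-span logistic loss, and the `lam` weighting are all assumptions introduced here for clarity.

```python
import math

def mention_detection_loss(logits, labels):
    # Hypothetical auxiliary loss: binary cross-entropy over candidate
    # spans, where a label of 1 marks a gold mention in the target
    # domain and 0 marks a non-mention span.
    total = 0.0
    for z, y in zip(logits, labels):
        p = 1.0 / (1.0 + math.exp(-z))  # sigmoid of the span score
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(logits)

def joint_loss(coref_loss, logits, labels, lam=1.0):
    # The coreference loss (computed on source-domain annotations) is
    # left unchanged; the mention-detection term, weighted by lam,
    # supplies the only target-domain supervision.
    return coref_loss + lam * mention_detection_loss(logits, labels)
```

With `lam = 0` this reduces to ordinary coreference training; increasing `lam` trades source-domain linking signal for target-domain mention precision.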