In this paper, we present Linguistics Informed Multi-Task BERT (LIMIT-BERT), which learns language representations across multiple linguistic tasks through Multi-Task Learning (MTL). LIMIT-BERT covers five key syntactic and semantic tasks: Part-Of-Speech (POS) tagging, constituent and dependency syntactic parsing, and span and dependency semantic role labeling (SRL). In addition, LIMIT-BERT adopts a linguistically informed masking strategy, Syntactic and Semantic Phrase Masking, which masks all of the tokens corresponding to a syntactic or semantic phrase. Unlike the recent Multi-Task Deep Neural Network (MT-DNN) (Liu et al., 2019), LIMIT-BERT is linguistically motivated and is trained in a semi-supervised fashion, which provides large amounts of linguistic-task data over the same corpus used for BERT pre-training. As a result, LIMIT-BERT not only improves performance on the linguistic tasks themselves but also benefits from a regularization effect and from linguistic information that yields more general representations, helping it adapt to new tasks and domains. LIMIT-BERT obtains new state-of-the-art or competitive results on both span and dependency semantic parsing on the Propbank benchmarks and on both dependency and constituent syntactic parsing on the Penn Treebank.
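To make the phrase-masking idea concrete, the following is a minimal sketch (not the authors' implementation) of how whole syntactic or semantic phrases, rather than individual tokens, could be selected for masking; the function name `mask_phrases` and its arguments are hypothetical illustrations, with phrase spans assumed to come from a constituency parse or SRL argument spans.

```python
import random

def mask_phrases(tokens, phrase_spans, mask_token="[MASK]", mask_prob=0.15):
    """Mask whole syntactic/semantic phrases instead of independent tokens.

    tokens: list of token strings.
    phrase_spans: list of (start, end) indices (end exclusive), e.g. taken
        from constituency-parse constituents or SRL argument spans.
    Returns the masked token sequence and the prediction targets.
    """
    masked = list(tokens)
    labels = [None] * len(tokens)  # original tokens at masked positions
    for start, end in phrase_spans:
        # Decide per phrase, so all tokens in the phrase are masked together.
        if random.random() < mask_prob:
            for i in range(start, end):
                labels[i] = masked[i]
                masked[i] = mask_token
    return masked, labels

# Example: mask the noun phrase "the red car" as a single unit.
tokens = ["she", "bought", "the", "red", "car", "yesterday"]
phrase_spans = [(2, 5)]  # span of the NP from a parse
print(mask_phrases(tokens, phrase_spans, mask_prob=1.0))
```

In this sketch the masking decision is made at the phrase level, so the model must recover an entire constituent or semantic argument from context, rather than isolated word pieces.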