The Natural Language Processing task of determining "who did what to whom" is called Semantic Role Labeling. For English, recent methods based on Transformer models have brought major improvements over the previous state of the art. For low-resource languages such as Portuguese, however, currently available semantic role labeling models are hindered by scarce training data. In this paper, we explore a model architecture consisting only of a pre-trained Transformer-based model, a linear layer, a softmax, and Viterbi decoding. We substantially improve the state-of-the-art performance in Portuguese, by over 15 F1 points. Additionally, we improve semantic role labeling results on Portuguese corpora by exploiting cross-lingual transfer learning with multilingual pre-trained models and transfer learning from dependency parsing in Portuguese, and we evaluate the proposed approaches empirically.
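The decoding stage of such an architecture can be illustrated with a minimal sketch. It assumes that the linear layer emits one logit per label for each token and that Viterbi decoding enforces hard BIO constraints (an I-X tag may only follow B-X or I-X, and a sequence may not start with an I- tag). The label set, the logits, and the helper names `allowed`, `log_softmax`, and `viterbi_decode` are hypothetical and chosen for illustration only; the abstract does not specify the exact constraints or the tag inventory.

```python
import math

def allowed(prev, curr):
    """Assumed BIO constraint: I-X may only follow B-X or I-X."""
    if curr.startswith("I-"):
        return prev in ("B-" + curr[2:], "I-" + curr[2:])
    return True

def log_softmax(logits):
    """Numerically stable log-softmax over one token's logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def viterbi_decode(logits, labels):
    """Return the most probable label sequence that satisfies the constraints."""
    if not logits:
        return []
    log_probs = [log_softmax(row) for row in logits]
    n = len(labels)
    neg_inf = float("-inf")
    # A sequence may not start with an I- tag.
    score = [neg_inf if labels[j].startswith("I-") else log_probs[0][j]
             for j in range(n)]
    back = []  # back[t-1][j] = best predecessor of label j at position t
    for t in range(1, len(log_probs)):
        new_score, pointers = [], []
        for j in range(n):
            best_i, best = 0, neg_inf
            for i in range(n):
                if allowed(labels[i], labels[j]) and score[i] > best:
                    best_i, best = i, score[i]
            new_score.append(best + log_probs[t][j])
            pointers.append(best_i)
        score = new_score
        back.append(pointers)
    # Follow the back-pointers from the best final label.
    j = max(range(n), key=lambda k: score[k])
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return [labels[k] for k in path]

# Hypothetical label set and per-token logits (illustrative values only).
labels = ["O", "B-ARG0", "I-ARG0", "B-V"]
logits = [
    [0.1, 2.0, 2.5, 0.0],  # greedy choice I-ARG0 is invalid at the start
    [0.0, 0.5, 3.0, 0.2],
    [0.2, 0.0, 0.1, 4.0],
    [1.0, 0.0, 2.0, 0.5],  # greedy choice I-ARG0 is invalid after B-V
]
decoded = viterbi_decode(logits, labels)
# decoded == ["B-ARG0", "I-ARG0", "B-V", "O"]
```

A per-token argmax over these logits would yield the ill-formed sequence I-ARG0, I-ARG0, B-V, I-ARG0; the constrained search instead returns the best well-formed sequence, which is the role Viterbi decoding plays on top of the softmax in this sketch.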