Existing work on probing of pretrained language models (LMs) has predominantly focused on sentence-level syntactic tasks. In this paper, we introduce document-level discourse probing to evaluate the ability of pretrained LMs to capture document-level relations. We experiment with 7 pretrained LMs, 4 languages, and 7 discourse probing tasks, and find BART to be overall the best model at capturing discourse -- but only in its encoder, with BERT performing surprisingly well as the baseline model. Across the different models, there are substantial differences in which layers best capture discourse information, and large disparities between models.
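For readers unfamiliar with the probing paradigm referenced above, the following is a minimal sketch of layer-wise probing: freeze a pretrained LM, extract representations from each layer, and train a light classifier per layer on a task's labels. The model name, mean pooling, and logistic-regression probe here are illustrative assumptions, not the paper's exact configuration.

```python
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression

# Assumption: any encoder LM with exposed hidden states works for this sketch.
model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name, output_hidden_states=True)
model.eval()  # the LM stays frozen; only the probe is trained

def layer_features(texts, layer):
    """Mean-pooled representation of each text from one frozen layer."""
    feats = []
    with torch.no_grad():
        for text in texts:
            enc = tokenizer(text, return_tensors="pt", truncation=True)
            hidden = model(**enc).hidden_states[layer]  # (1, seq_len, dim)
            feats.append(hidden.mean(dim=1).squeeze(0).numpy())
    return feats

def probe_layer(train_texts, train_labels, test_texts, test_labels, layer):
    """Fit a linear probe on one layer and report its test accuracy."""
    clf = LogisticRegression(max_iter=1000)
    clf.fit(layer_features(train_texts, layer), train_labels)
    return clf.score(layer_features(test_texts, layer), test_labels)

# Example usage (hypothetical data): score every layer, including embeddings.
# for layer in range(model.config.num_hidden_layers + 1):
#     print(layer, probe_layer(train_x, train_y, test_x, test_y, layer))
```

Comparing the per-layer scores is what reveals which layers best capture a given kind of information; the paper applies this idea to document-level discourse tasks rather than the toy setup shown here.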