To promote and further develop RST-style discourse parsing models, we need a strong baseline that can serve as a reference for reporting reliable experimental results. This paper explores such a baseline by integrating existing simple parsing strategies, top-down and bottom-up, with various transformer-based pre-trained language models. Experimental results obtained from two benchmark datasets demonstrate that parsing performance depends far more on the pre-trained language model than on the parsing strategy. In particular, the bottom-up parser achieves large performance gains over the current best parser when employing DeBERTa. Our analysis of intra- and multi-sentential parsing and of nuclearity prediction further reveals that language models pre-trained with a span-masking scheme especially boost parsing performance.