Aiming to better integrate data-driven and linguistically inspired approaches, we explore whether nuclearity in Rhetorical Structure Theory (RST), which assigns a binary assessment of importance between text segments, can be replaced by automatically generated, real-valued scores, in what we call a Weighted-RST framework. In particular, we find that weighted discourse trees derived from auxiliary tasks can benefit key downstream NLP applications compared to nuclearity-centered approaches. We further show that the real-valued importance distributions partially align with the assessments and uncertainty of human annotators.
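
To make the contrast between binary nuclearity and real-valued importance concrete, the following is a minimal illustrative sketch, not the authors' implementation. The `Node` class, the `left_weight` field, the `margin` threshold, and the multiplicative propagation of scores down the tree are all assumptions introduced here purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    """A binary discourse-tree node (hypothetical structure for illustration)."""
    text: Optional[str] = None        # set only for leaves (elementary discourse units)
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    # Assumed convention: share of importance assigned to the left child, in [0, 1].
    left_weight: float = 0.5


def to_nuclearity(weight: float, margin: float = 0.1) -> str:
    """Collapse a real-valued weight into a nuclearity-style label.

    "NS" = left child is the nucleus, "SN" = right child is the nucleus,
    "NN" = both roughly equally important. The margin is an arbitrary choice.
    """
    if weight > 0.5 + margin:
        return "NS"
    if weight < 0.5 - margin:
        return "SN"
    return "NN"


def leaf_importance(node: Node, score: float = 1.0) -> List[Tuple[Optional[str], float]]:
    """Propagate importance from the root to the leaves by multiplying weights."""
    if node.left is None or node.right is None:
        return [(node.text, score)]
    return (
        leaf_importance(node.left, score * node.left_weight)
        + leaf_importance(node.right, score * (1.0 - node.left_weight))
    )


# Toy tree: the span (A, B) is weighted 0.75 against C; A and B are balanced.
tree = Node(
    left=Node(left=Node(text="A"), right=Node(text="B"), left_weight=0.5),
    right=Node(text="C"),
    left_weight=0.75,
)
scores = leaf_importance(tree)
labels = [to_nuclearity(tree.left_weight), to_nuclearity(tree.left.left_weight)]
```

In this toy setup, the binary labels only record which side is more important, whereas the weights also record by how much, which is the extra information a weighted tree can pass to a downstream task.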