In implicit discourse relation classification, we want to predict the relation between adjacent sentences in the absence of any overt discourse connectives. This is challenging even for humans, leading to a shortage of annotated data, a fact that makes the task even more difficult for supervised machine learning approaches. In the current study, we perform implicit discourse relation classification without relying on any labeled implicit relations. We sidestep the lack of data by explicitating implicit relations, reducing the task to two sub-problems: language modeling and explicit discourse relation classification, a much easier problem. Our experimental results show that this method can even marginally outperform the state of the art, despite being much simpler than alternative models of comparable performance. Moreover, we show that the achieved performance is robust across domains, as suggested by zero-shot experiments on a completely different domain. This indicates that recent advances in language modeling have made language models sufficiently good at capturing inter-sentence relations without the help of explicit discourse markers.
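The explicitation idea described above can be sketched as follows: insert candidate discourse connectives between the two sentences, score each candidate with a language model, and map the best-scoring connective to its explicit discourse relation. This is a minimal illustrative sketch, not the paper's implementation; the connective inventory, the relation labels, and the `toy_lm_score` placeholder are all assumptions (a real system would use an actual LM log-probability).

```python
# Sketch of "explicitation": reduce implicit relation classification to
# (1) language modeling over connective insertions and
# (2) a connective -> explicit relation mapping.
# Everything below is illustrative; toy_lm_score stands in for a real LM.

# A tiny illustrative subset of PDTB-style connective-to-sense mappings.
CONNECTIVE_TO_RELATION = {
    "because": "Contingency.Cause",
    "however": "Comparison.Contrast",
    "then": "Temporal.Asynchronous",
    "for example": "Expansion.Instantiation",
}

def toy_lm_score(text: str) -> float:
    """Placeholder for a real language-model log-probability.

    A crude length penalty so the sketch runs end to end; a real LM
    would score the fluency of the sentence pair with the connective
    inserted between them.
    """
    return -float(len(text))

def classify_implicit_relation(arg1: str, arg2: str, lm_score=toy_lm_score) -> str:
    """Pick the connective whose insertion the LM prefers; return its relation."""
    best_connective = max(
        CONNECTIVE_TO_RELATION,
        key=lambda c: lm_score(f"{arg1} {c} {arg2}"),
    )
    return CONNECTIVE_TO_RELATION[best_connective]
```

With the toy scorer, the shortest connective always wins; substituting a genuine language model makes the choice reflect fluency, which is the signal the abstract argues modern LMs now capture well.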