This paper tackles the problem of automatically labelling sentiment-bearing topics with descriptive sentence labels. We propose two approaches to the problem, one extractive and the other abstractive. Both approaches rely on a novel mechanism to automatically learn the relevance of each sentence in a corpus to sentiment-bearing topics extracted from that corpus. The extractive approach uses a sentence ranking algorithm for label selection which for the first time jointly optimises topic--sentence relevance as well as aspect--sentiment co-coverage. The abstractive approach instead addresses aspect--sentiment co-coverage by using sentence fusion to generate a sentential label that includes relevant content from multiple sentences. To our knowledge, we are the first to study the problem of labelling sentiment-bearing topics. Our experimental results on three real-world datasets show that both the extractive and abstractive approaches outperform four strong baselines in terms of facilitating topic understanding and interpretation. In addition, when comparing extractive and abstractive labels, our evaluation shows that our best performing abstractive method is able to provide more topic information coverage in fewer words, at the cost of generating less grammatical labels than the extractive method. We conclude that abstractive methods can effectively synthesise the rich information contained in sentiment-bearing topics.
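To make the extractive idea concrete, the following is a minimal sketch of a ranking that trades off topic--sentence relevance against aspect--sentiment co-coverage. It is not the method proposed in the paper: the linear combination, the weight `alpha`, the product-based `co_coverage` score, the `Candidate` type, and the whitespace tokenisation are all illustrative assumptions, and the relevance values are taken as given rather than learned.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A candidate label sentence (hypothetical type, for illustration only)."""
    text: str
    relevance: float  # assumed topic-sentence relevance in [0, 1], supplied externally


def co_coverage(tokens, aspect_words, sentiment_words):
    """Assumed co-coverage score: the product of the fraction of aspect words
    and the fraction of sentiment words that appear in the sentence, so a
    sentence scores above zero only if it covers both kinds of word."""
    if not aspect_words or not sentiment_words:
        return 0.0
    aspect_hit = len(tokens & aspect_words) / len(aspect_words)
    sentiment_hit = len(tokens & sentiment_words) / len(sentiment_words)
    return aspect_hit * sentiment_hit


def rank_labels(candidates, aspect_words, sentiment_words, alpha=0.5):
    """Return (score, text) pairs sorted from best to worst.

    score = alpha * relevance + (1 - alpha) * co_coverage
    The weighting scheme is an assumption, not taken from the paper.
    """
    scored = []
    for cand in candidates:
        # Naive lowercase whitespace tokenisation (assumption).
        tokens = set(cand.text.lower().split())
        coverage = co_coverage(tokens, aspect_words, sentiment_words)
        score = alpha * cand.relevance + (1 - alpha) * coverage
        scored.append((score, cand.text))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Under these assumptions, a sentence with slightly lower relevance can still be ranked first when it mentions more of the topic's aspect and sentiment words together, which is the intuition behind optimising the two criteria jointly.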