We introduce the task of microblog opinion summarisation (MOS) and share a dataset of 3100 gold-standard opinion summaries to facilitate research in this domain. The dataset contains summaries of tweets spanning a 2-year period and covers more topics than any other public Twitter summarisation dataset. The summaries are abstractive and were written by journalists skilled in summarising news articles, following a template that separates factual information (the main story) from author opinions. Our method differs from previous work on generating gold-standard summaries from social media, which usually involves selecting representative posts and thus favours extractive summarisation models. To showcase the dataset's utility and challenges, we benchmark a range of state-of-the-art abstractive and extractive summarisation models and achieve good performance, with abstractive models outperforming extractive ones. We also show that fine-tuning is necessary to improve performance, and we investigate the benefits of using different sample sizes.