Opinion summarisation synthesises the opinions expressed in a group of documents discussing the same topic to produce a single summary. Recent work has looked at opinion summarisation of clusters of social media posts. Such posts are noisy and have unpredictable structure, which poses additional challenges for constructing the summary distribution and preserving meaning compared to online reviews, which have so far been the focus of opinion summarisation. To address these challenges we present \textit{WassOS}, an unsupervised abstractive summarisation model which makes use of the Wasserstein distance. A Variational Autoencoder is used to obtain the distribution of each document/post, and the distributions are disentangled into separate semantic and syntactic spaces. The summary distribution is obtained as the Wasserstein barycenter of the semantic and syntactic distributions. A latent variable sampled from the summary distribution is fed into a GRU decoder with a transformer layer to produce the final summary. Our experiments on multiple datasets, including Twitter clusters, Reddit threads, and reviews, show that WassOS almost always outperforms the state-of-the-art on ROUGE metrics and consistently produces the best summaries with respect to meaning preservation according to human evaluations.
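To make the barycenter step concrete, the following is a minimal sketch and not the WassOS implementation. It assumes each post's latent distribution is a Gaussian with diagonal covariance and that all posts are weighted equally; under those assumptions the 2-Wasserstein barycenter has a closed form in which both the means and the standard deviations are averaged. The function names (`gaussian_w2_barycenter`, `w2_squared`, `sample_latent`) and the toy numbers are hypothetical, and the disentangling into semantic and syntactic spaces, the VAE encoder, and the GRU decoder are all omitted.

```python
import math
import random


def gaussian_w2_barycenter(means, stds, weights=None):
    """2-Wasserstein barycenter of diagonal Gaussians N(means[i], diag(stds[i]**2)).

    Assumption: diagonal covariances, so the barycenter is again a diagonal
    Gaussian whose mean and standard deviation are the weighted averages of
    the input means and standard deviations.
    """
    n = len(means)
    if weights is None:
        weights = [1.0 / n] * n  # assumption: every post counts equally
    dim = len(means[0])
    bary_mean = [sum(w * m[d] for w, m in zip(weights, means)) for d in range(dim)]
    bary_std = [sum(w * s[d] for w, s in zip(weights, stds)) for d in range(dim)]
    return bary_mean, bary_std


def w2_squared(mean_a, std_a, mean_b, std_b):
    """Squared 2-Wasserstein distance between two diagonal Gaussians."""
    mean_term = sum((a - b) ** 2 for a, b in zip(mean_a, mean_b))
    std_term = sum((a - b) ** 2 for a, b in zip(std_a, std_b))
    return mean_term + std_term


def sample_latent(mean, std, rng):
    """Draw one latent vector z = mean + std * eps with eps ~ N(0, I)."""
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mean, std)]


# Toy posteriors for three posts in a 2-dimensional latent space (made-up values).
post_means = [[0.0, 1.0], [2.0, 3.0], [4.0, -1.0]]
post_stds = [[1.0, 0.5], [2.0, 0.5], [3.0, 2.0]]

summary_mean, summary_std = gaussian_w2_barycenter(post_means, post_stds)
z = sample_latent(summary_mean, summary_std, random.Random(0))
```

In a full model, a vector like `z` would be what the decoder conditions on. The closed form used here relies on the diagonal-covariance assumption; general covariances require a fixed-point iteration instead.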