Recent approaches to unsupervised opinion summarization have predominantly used the review-reconstruction training paradigm: an encoder-decoder model is trained to reconstruct single reviews and thereby learns a latent review encoding space. At summarization time, the unweighted average of the latent review vectors is decoded into a summary. In this paper, we challenge the convention of simply averaging the latent vector set, and argue that this simplistic approach fails to account for variations in the quality of input reviews or for the idiosyncrasies of the decoder. We propose Coop, a convex vector aggregation framework for opinion summarization that searches for better combinations of input reviews. Coop requires no further supervision and uses a simple word-overlap objective to help the model generate summaries that are more consistent with the input reviews. Experimental results show that extending opinion summarizers with Coop achieves state-of-the-art performance, with ROUGE-1 improvements of 3.7% and 2.9% on the Yelp and Amazon benchmark datasets, respectively.
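The search described above can be sketched in code. The following is a minimal illustration, not the paper's implementation: it assumes a simplified search space (unweighted means of non-empty review subsets, one family of convex combinations), a Jaccard-style word-overlap score as a stand-in for the paper's input-output overlap objective, and a toy nearest-neighbor "decoder" in place of a trained encoder-decoder model. All function names here are hypothetical.

```python
from itertools import combinations

def word_overlap(summary, reviews):
    # Simplified stand-in for Coop's word-overlap objective:
    # Jaccard similarity between the summary's tokens and the
    # union of all input-review tokens.
    s = set(summary.split())
    u = set(tok for r in reviews for tok in r.split())
    return len(s & u) / len(s | u) if s | u else 0.0

def coop_search(latents, reviews, decode):
    """Search convex combinations of latent review vectors (here:
    unweighted means of non-empty subsets) and keep the combination
    whose decoded summary best overlaps the input reviews."""
    best_score, best_summary = -1.0, None
    for k in range(1, len(latents) + 1):
        for idx in combinations(range(len(latents)), k):
            # Mean of the chosen subset = one convex combination.
            z = [sum(latents[i][d] for i in idx) / len(idx)
                 for d in range(len(latents[0]))]
            summary = decode(z)
            score = word_overlap(summary, reviews)
            if score > best_score:
                best_score, best_summary = score, summary
    return best_summary, best_score

# Toy example (illustrative data, not from the paper).
latents = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
reviews = ["great food", "bad service", "great food bad service"]

def decode(z):
    # Placeholder decoder: return the review whose latent vector
    # is closest to z (a real system decodes z with a trained model).
    dists = [sum((a - b) ** 2 for a, b in zip(z, l)) for l in latents]
    return reviews[dists.index(min(dists))]

summary, score = coop_search(latents, reviews, decode)
```

In this toy setup, plain averaging of all three latent vectors is just one point in the search space; the subset search can prefer a combination whose decoded output covers more of the input vocabulary, which is the intuition behind replacing the unweighted average.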