Opinion summarization is the task of automatically generating summaries for a set of reviews about a specific target (e.g., a movie or a product). Since the number of reviews for each target can be prohibitively large, neural network-based methods follow a two-stage approach: an extractive step first pre-selects a subset of salient opinions, and an abstractive step then generates the summary conditioned on the extracted subset. However, the extractive step discards information that may be useful depending on user needs. In this paper, we propose a summarization framework that removes the need to rely only on pre-selected content, and thus avoids wasting potentially useful information, especially when customizing summaries. The framework enables the use of all input reviews by first condensing them into multiple dense vectors, which serve as input to an abstractive model. We showcase an effective instantiation of our framework that produces more informative summaries and also allows user preferences to be taken into account through our zero-shot customization technique. Experimental results demonstrate that our model improves the state of the art on the Rotten Tomatoes dataset and generates customized summaries effectively.