Relevance in summarization is typically defined based on textual information alone, without incorporating insights about a particular decision. As a result, to support risk analysis of pancreatic cancer, summaries of medical notes may include irrelevant information such as a knee injury. We propose a novel problem, decision-focused summarization, where the goal is to summarize the information relevant to a decision. We leverage a predictive model that makes the decision based on the full text to provide valuable insights on how a decision can be inferred from text. To build a summary, we then select representative sentences that lead the model to decisions similar to those it makes from the full text, while accounting for textual non-redundancy. To evaluate our method (DecSum), we build a testbed where the task is to summarize the first ten reviews of a restaurant in support of predicting its future rating on Yelp. DecSum substantially outperforms text-only summarization methods and model-based explanation methods in decision faithfulness and representativeness. We further demonstrate that DecSum is the only method that enables humans to outperform random chance in predicting which restaurant will be better rated in the future.
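The selection idea described above (pick sentences whose model decision stays close to the full-text decision, while discouraging redundant picks) can be illustrated with a small greedy sketch. This is not the DecSum algorithm itself, whose exact objective and search procedure are not given here. The greedy loop, the word-overlap redundancy measure, the `redundancy_weight` parameter, and the `toy_predict` scorer are all assumptions introduced purely for illustration; in practice `predict` would be a trained model, such as a rating predictor.

```python
from typing import Callable, List, Sequence


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity in [0, 1]; a crude, assumed redundancy measure."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


def greedy_decision_summary(
    sentences: Sequence[str],
    predict: Callable[[str], float],
    k: int = 3,
    redundancy_weight: float = 0.5,  # hypothetical trade-off knob
) -> List[str]:
    """Greedily pick up to k sentences whose prediction tracks the full text."""
    target = predict(" ".join(sentences))  # model decision on the full text
    chosen: List[int] = []
    while len(chosen) < min(k, len(sentences)):
        best_idx, best_cost = -1, float("inf")
        for i, sent in enumerate(sentences):
            if i in chosen:
                continue
            candidate = [sentences[j] for j in chosen] + [sent]
            # How far the summary's decision drifts from the full-text decision.
            decision_gap = abs(predict(" ".join(candidate)) - target)
            # Penalize overlap with sentences that are already selected.
            redundancy = max(
                (jaccard(sent, sentences[j]) for j in chosen), default=0.0
            )
            cost = decision_gap + redundancy_weight * redundancy
            if cost < best_cost:
                best_idx, best_cost = i, cost
        chosen.append(best_idx)
    # Return the selected sentences in their original order.
    return [sentences[i] for i in sorted(chosen)]


def toy_predict(text: str) -> float:
    """Hypothetical stand-in for a rating model: keyword counts around 3.0."""
    words = [w.strip(".,!?") for w in text.lower().split()]
    pos, neg = words.count("great"), words.count("bad")
    if pos + neg == 0:
        return 3.0
    return 3.0 + (pos - neg) / (pos + neg)


if __name__ == "__main__":
    reviews = [
        "The pasta was great and the staff was great.",
        "Parking was bad.",
        "I hurt my knee last week.",
        "The pasta was great.",
    ]
    for line in greedy_decision_summary(reviews, toy_predict, k=2):
        print(line)
```

In this toy setting, the off-topic knee sentence and the near-duplicate pasta sentence are both passed over, because the first does not move the summary's prediction toward the full-text prediction and the second incurs a redundancy penalty. A real system would likely replace the keyword scorer with a learned model and the word-overlap measure with a semantic similarity.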