Probing of Quantitative Values in Abstractive Summarization Models

Abstractive text summarization has recently become a popular approach, but data hallucination remains a serious problem, including for quantitative data. We propose a set of probing tests to evaluate how well abstractive summarization models represent quantitative values found in the input text. Our results show that in most cases, the encoders of recent SOTA-performing models struggle to provide embeddings that adequately represent quantitative values in the input compared to baselines, and in particular, they outperform random representations in some, but surprisingly not all, cases. Under our assumptions, this suggests that the encoder's performance contributes to the quantity hallucination problem. One model type in particular, DistilBART-CDM, was observed to underperform randomly initialized representations in several experiments, and its performance versus BERT suggests that standard pretraining and fine-tuning approaches for the summarization task may play a role in the underperformance of some encoders.
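The abstract does not spell out the individual probing tasks, so the following is only a minimal sketch of the general idea, under stated assumptions: a simple linear probe is trained on frozen embeddings to predict a numeric value, and its held-out error is compared against the same probe trained on random representations. The embeddings here are synthetic stand-ins rather than outputs of any summarization encoder, and all function names (`fit_linear_probe`, `probe_mse`, `make_toy_data`, `compare_probes`) are hypothetical.

```python
import numpy as np


def fit_linear_probe(train_x, train_y):
    """Fit a least-squares linear probe (with a bias term) on frozen embeddings."""
    design = np.hstack([train_x, np.ones((train_x.shape[0], 1))])
    weights, *_ = np.linalg.lstsq(design, train_y, rcond=None)
    return weights


def probe_mse(weights, x, y):
    """Mean squared error of the probe's predictions on held-out data."""
    design = np.hstack([x, np.ones((x.shape[0], 1))])
    predictions = design @ weights
    return float(np.mean((predictions - y) ** 2))


def make_toy_data(seed=0, n=400, dim=16):
    """Build synthetic data (an assumption, not the paper's data).

    `informative` embeddings encode the value linearly along one direction
    plus small noise; `random_emb` carries no information about the value.
    """
    rng = np.random.default_rng(seed)
    values = rng.uniform(0.0, 100.0, size=n)
    direction = rng.normal(size=dim)
    informative = np.outer(values, direction) / 100.0
    informative += 0.01 * rng.normal(size=(n, dim))
    random_emb = rng.normal(size=(n, dim))
    return values, informative, random_emb


def compare_probes(seed=0):
    """Train one probe per representation and report held-out MSE for each."""
    values, informative, random_emb = make_toy_data(seed)
    split = len(values) * 3 // 4  # 75% train, 25% held out
    results = {}
    for name, emb in [("informative", informative), ("random", random_emb)]:
        weights = fit_linear_probe(emb[:split], values[:split])
        results[name] = probe_mse(weights, emb[split:], values[split:])
    return results


if __name__ == "__main__":
    for name, mse in compare_probes().items():
        print(f"{name}: held-out MSE = {mse:.3f}")
```

In a real experiment, the synthetic `informative` array would be replaced by embeddings extracted from a frozen encoder. A probe that fails to beat the random baseline would be evidence, under the probe's own assumptions, that the representation does not expose the quantity in an easily decodable form.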

