Measuring Sentence-Level and Aspect-Level (Un)certainty in Science Communications

Certainty and uncertainty are fundamental to science communication. Hedges have been widely used as proxies for uncertainty. However, certainty is a complex construct: authors express not only the degree but also the type and aspects of uncertainty in order to give the reader a certain impression of what is known. Here, we introduce a new study of certainty that models both the level and the aspects of certainty in scientific findings. Using a new dataset of 2167 annotated scientific findings, we demonstrate that hedges alone only partially explain certainty. We show that both the overall certainty and individual aspects can be predicted with pre-trained language models, providing a more complete picture of the author's intended communication. Downstream analyses on 431K scientific findings from news and scientific abstracts demonstrate that modeling sentence-level and aspect-level certainty is meaningful for areas like science communication. Both the model and datasets used in this paper are released at https://blablablab.si.umich.edu/projects/certainty/.
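To make the hedge-as-proxy idea concrete, here is a minimal sketch of a lexicon-based hedge score. It is only an illustration of the baseline the abstract argues is incomplete: the `HEDGES` word list and the `hedge_ratio` function are hypothetical names introduced here, not the paper's lexicon, model, or released code.

```python
import re

# Hypothetical, deliberately tiny hedge lexicon for illustration only;
# it is NOT the lexicon or the model used in the paper.
HEDGES = {"may", "might", "could", "suggest", "suggests",
          "possibly", "likely", "appears"}


def hedge_ratio(sentence: str) -> float:
    """Return the fraction of alphabetic tokens that are hedge words."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    if not tokens:
        return 0.0
    return sum(token in HEDGES for token in tokens) / len(tokens)


hedged = "The drug may possibly reduce symptoms."
plain = "The drug reduces symptoms in 10% of patients."

print(hedge_ratio(hedged))
print(hedge_ratio(plain))
```

A score like this sees only lexical cues, which is consistent with the abstract's point that hedges alone explain certainty only partially.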
