DR.BENCH: Diagnostic Reasoning Benchmark for Clinical Natural Language Processing

The meaningful use of electronic health records (EHR) continues to progress in the digital era, with clinical decision support systems augmented by artificial intelligence. A priority in improving provider experience is to overcome information overload and reduce cognitive burden so that fewer medical errors and cognitive biases are introduced during patient care. One major type of medical error is diagnostic error, which arises from systematic or predictable errors in judgment that rely on heuristics. Clinical natural language processing (cNLP) could model human diagnostic reasoning, using forward reasoning from data to diagnosis, and could thereby reduce cognitive burden and medical error, but this potential has not been investigated. Existing tasks to advance the science in cNLP have largely focused on information extraction and named entity recognition through classification tasks. We introduce a novel suite of tasks called Diagnostic Reasoning Benchmarks, DR.BENCH, as a new benchmark for developing and evaluating cNLP models with clinical diagnostic reasoning ability. The suite includes six tasks from ten publicly available datasets addressing clinical text understanding, medical knowledge reasoning, and diagnosis generation. DR.BENCH is the first clinical suite of tasks designed as a natural language generation framework for evaluating pre-trained language models. Experiments with state-of-the-art pre-trained generative language models, covering both large general-domain models and models that were continually trained on a medical corpus, show opportunities for improvement when evaluated on DR.BENCH. We share DR.BENCH as a publicly available GitLab repository with a systematic approach to loading and evaluating models for the cNLP community.

