Adverse Drug Event (ADE) extraction models can rapidly examine large collections of social media texts, detecting mentions of drug-related adverse reactions and triggering medical investigations. However, despite recent advances in NLP, it is currently unknown whether such models are robust in the face of negation, which is pervasive across language varieties. In this paper we evaluate three state-of-the-art systems, showing their fragility against negation, and then introduce two possible strategies to increase the robustness of these models: a pipeline approach, relying on a dedicated component for negation detection, and an augmentation of an ADE extraction dataset that artificially creates negated samples for further training. We show that both strategies bring significant increases in performance, lowering the number of spurious entities predicted by the models. Our dataset and code will be publicly released to encourage research on the topic.
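The augmentation strategy mentioned above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the cue substitutions, the sample schema (`text` plus character-offset `ade_spans`), and the helper name `negate_sample` are all assumptions. The idea is that a negated variant keeps the surface mention of the reaction but carries an empty gold annotation, so a model trained on it learns not to predict an entity under negation.

```python
# Illustrative sketch of negation-based data augmentation for ADE
# extraction (cue list and sample format are assumptions, not the
# paper's actual method).

# Hypothetical affirmative-to-negated cue substitutions.
NEGATION_CUES = {
    "gave me": "did not give me",
    "caused": "did not cause",
    "made me": "did not make me",
}


def negate_sample(sample):
    """Return a negated copy of an ADE sample with its gold spans cleared,
    or None if no known cue can be negated in the text."""
    text = sample["text"]
    for affirmative, negated in NEGATION_CUES.items():
        if affirmative in text:
            return {
                # Negate only the first matching cue.
                "text": text.replace(affirmative, negated, 1),
                # The reaction is negated, so no ADE entity should be gold.
                "ade_spans": [],
            }
    return None


original = {
    "text": "This drug gave me a terrible headache",
    "ade_spans": [(20, 37)],  # character span of "terrible headache"
}
augmented = negate_sample(original)
# augmented["text"]      -> "This drug did not give me a terrible headache"
# augmented["ade_spans"] -> []
```

Such synthetic negated samples can then be mixed into the original training set, giving the model explicit evidence that a negated mention should not yield an entity prediction.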