The medical domain is receiving increasing attention in applications of Artificial Intelligence. In their daily work, clinicians have to deal with an enormous amount of unstructured textual data to reach conclusions about patients' health. Argument mining helps to give structure to such data by detecting argumentative components in the text and classifying the relations between them. However, as is the case for many tasks in Natural Language Processing in general and in medical text processing in particular, the large majority of the work on computational argumentation has been done only for English. This is also true of the only dataset available for argumentation in the medical domain, namely the annotated abstracts of Randomized Controlled Trials (RCT) from the MEDLINE database. To mitigate the lack of annotated data for other languages, we empirically investigate several strategies to perform argument mining and classification in medical texts for a language for which no annotated data is available. This project shows that automatically translating and projecting annotations from English to a target language (Spanish) is an effective way to generate annotated data without manual intervention. Furthermore, our experiments demonstrate that the translation and projection approach outperforms zero-shot cross-lingual approaches using a large masked multilingual language model. Finally, we show how the automatically generated data in Spanish can also be used to improve results in the original English evaluation setting.
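To make the idea of annotation projection concrete, the following is a minimal sketch, not the method used in this work. It assumes that word alignments between an English sentence and its Spanish translation are already available as (source index, target index) pairs. The function name `project_spans`, the example sentences, the alignment pairs, and the label "Claim" are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# A labelled span: (first token index, last token index inclusive, label).
Span = Tuple[int, int, str]


def project_spans(
    spans: List[Span],
    alignment: List[Tuple[int, int]],
    target_len: int,
) -> List[Span]:
    """Project labelled source-token spans onto target tokens.

    Assumption: `alignment` holds (source_idx, target_idx) pairs produced
    by some external word aligner. Each source span is mapped to the
    smallest contiguous target span covering all aligned target tokens.
    Spans with no aligned target token are dropped.
    """
    projected: List[Span] = []
    for start, end, label in spans:
        targets = sorted(
            t for s, t in alignment
            if start <= s <= end and 0 <= t < target_len
        )
        if not targets:
            continue  # nothing to project for this span
        projected.append((targets[0], targets[-1], label))
    return projected


# Hypothetical example sentence pair and alignment.
english = ["Treatment", "A", "reduced", "pain", "significantly"]
spanish = ["El", "tratamiento", "A", "redujo", "significativamente", "el", "dolor"]
alignment = [(0, 1), (1, 2), (2, 3), (3, 6), (4, 4)]

source_spans = [(0, 4, "Claim")]  # illustrative label
target_spans = project_spans(source_spans, alignment, len(spanish))
```

In this sketch the projected span runs from the first to the last aligned target token, so unaligned target words inside that range (here the second "el") are included, while an unaligned word before it (the initial "El") is not. A real pipeline would need further heuristics for such boundary cases.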