Assessing the trustworthiness of artificial intelligence systems requires knowledge from many different disciplines. These disciplines do not necessarily share concepts and might use different words for the same meaning, or even use the same words with different meanings. Additionally, experts from one discipline might not be aware of specialized terms readily used in another. Therefore, a core challenge of the assessment process is to identify when experts from different disciplines talk about the same problem but use different terminology. In other words, the task is to group problem descriptions (a.k.a. issues) that share the same semantic meaning but are described using slightly different terminology. In this work, we show how we employed recent advances in natural language processing, namely sentence embeddings and semantic textual similarity, to support this identification process and to bridge communication gaps in interdisciplinary teams of experts assessing the trustworthiness of an artificial intelligence system used in healthcare.
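As a rough illustration of the grouping task described above, the following minimal sketch clusters issues whose embedding vectors are close under cosine similarity. It is not the procedure used in this work: the function names `cosine_similarity` and `group_issues`, the union-find grouping, and the threshold of 0.8 are all assumptions chosen for illustration, and the embeddings are assumed to come from some sentence-embedding model that is not shown here.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def group_issues(
    embeddings: Sequence[Sequence[float]],
    threshold: float = 0.8,  # hypothetical cut-off, not a value from the paper
) -> List[List[int]]:
    """Group issue indices whose embeddings are linked by similarity >= threshold.

    Two issues end up in the same group if they are connected through a chain
    of sufficiently similar pairs (single-link grouping via union-find).
    """
    n = len(embeddings)
    parent = list(range(n))

    def find(i: int) -> int:
        # Follow parent pointers to the group representative, halving paths.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if cosine_similarity(embeddings[i], embeddings[j]) >= threshold:
                parent[find(j)] = find(i)

    groups: Dict[int, List[int]] = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())
```

In practice, each issue description would first be mapped to a vector by a sentence-embedding model, and the resulting groups would be offered to the experts as candidates for review rather than treated as final.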