This article details the creation of a novel domain ontology at the intersection of epidemiology, medicine, statistics, and computer science. Using the terminology defined by current legislation, it outlines a systematic approach to anonymizing hospital data in preparation for use in Artificial Intelligence (AI) applications in healthcare. The development process consisted of seven pragmatic steps, including defining the scope, selecting knowledge, reviewing important terms, and constructing classes. These classes describe the designs used in epidemiological studies, machine learning paradigms, types of data and attributes, risks to which anonymized data may be exposed, privacy attacks, techniques to mitigate re-identification, privacy models, and metrics for measuring the effects of anonymization. The article concludes by demonstrating the practical implementation of this ontology in hospital settings for the development and validation of AI.