Risk of Bias in Chest X-ray Foundation Models

Foundation models are considered a breakthrough across applications of AI, promising robust and reusable mechanisms for feature extraction and alleviating the need for large amounts of high-quality annotated training data for task-specific prediction models. However, foundation models may encode and even reinforce biases present in historic datasets. Given the limited ability to scrutinize foundation models, it remains unclear whether the opportunities outweigh the risks in safety-critical applications such as clinical decision making. In our statistical bias analysis of a recently published and publicly accessible chest X-ray foundation model, we found reasons for concern, as the model seems to encode protected characteristics including biological sex and racial identity. When the foundation model was used for the downstream application of disease detection, we observed substantial degradation of performance compared to a standard model, with specific disparities in protected subgroups. While research into foundation models for healthcare applications is at an early stage, we hope to raise awareness of the risks by highlighting the importance of conducting thorough bias and subgroup performance analyses.
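To make "subgroup performance analysis" concrete, the following is a minimal sketch, not the authors' method: it computes the true positive rate and false positive rate of a binary disease classifier separately for each protected subgroup. The function name `subgroup_rates`, the group labels, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def subgroup_rates(y_true, y_pred, groups):
    """Return {group: (tpr, fpr)} for binary labels and binary predictions.

    Hypothetical helper for illustration; not taken from the paper.
    """
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "fp": 0, "tn": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        if t == 1:
            key = "tp" if p == 1 else "fn"
        else:
            key = "fp" if p == 1 else "tn"
        counts[g][key] += 1

    rates = {}
    for g, c in counts.items():
        pos = c["tp"] + c["fn"]  # cases with the disease in this subgroup
        neg = c["fp"] + c["tn"]  # cases without the disease in this subgroup
        tpr = c["tp"] / pos if pos else float("nan")
        fpr = c["fp"] / neg if neg else float("nan")
        rates[g] = (tpr, fpr)
    return rates


# Toy data (made up): four cases in subgroup "A", four in subgroup "B".
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 1, 0, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

rates = subgroup_rates(y_true, y_pred, groups)
for group, (tpr, fpr) in sorted(rates.items()):
    print(f"{group}: TPR={tpr:.2f} FPR={fpr:.2f}")
```

A gap in these rates between subgroups at the same decision threshold is one simple signal of the kind of disparity the abstract describes. A real analysis would also need confidence intervals and adequate sample sizes per subgroup.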

