Foundation models are considered a breakthrough across AI applications, promising robust and reusable mechanisms for feature extraction and alleviating the need for large amounts of high-quality training data for task-specific prediction models. However, foundation models may encode and even reinforce existing biases present in historical datasets. Given the limited ability to scrutinize foundation models, it remains unclear whether the opportunities outweigh the risks in safety-critical applications such as clinical decision making. In our statistical bias analysis of a recently published, publicly available chest X-ray foundation model, we found reason for concern: the model appears to encode protected characteristics, including biological sex and racial identity, which may lead to disparate performance across subgroups in downstream applications. While research into foundation models for healthcare applications is at an early stage, we believe it is important to make the community aware of these risks in order to avoid harm.