Medical AI has tremendous potential to advance healthcare by supporting the evidence-based practice of medicine, personalizing patient treatment, reducing costs, and improving provider and patient experience. We argue that unlocking this potential requires a systematic way to measure the performance of medical AI models on large-scale heterogeneous data. To meet this need, we are building MedPerf, an open framework for benchmarking machine learning in the medical domain. MedPerf will enable federated evaluation in which models are securely distributed to different facilities for evaluation, thereby empowering healthcare organizations to assess and verify the performance of AI models in an efficient and human-supervised process, while prioritizing privacy. We describe the current challenges healthcare and AI communities face, the need for an open platform, the design philosophy of MedPerf, its current implementation status, and our roadmap. We call for researchers and organizations to join us in creating the MedPerf open benchmarking platform.