Healthcare AI has the potential to increase patient safety, augment efficiency, and improve patient outcomes, yet research is often limited by data access, cohort curation, and analysis tooling. Collecting and translating electronic health record data, live data, and real-time high-resolution device data can be challenging and time-consuming. Developing real-world AI tools requires overcoming challenges in data acquisition, scarce hospital resources, and stringent data governance requirements. These bottlenecks can make the research and development of AI systems resource-intensive and subject to long delays. We present a system and methodology to accelerate data acquisition, dataset development and analysis, and AI model development. We created an interactive platform built on a scalable microservice backend. The system can ingest 15,000 patient records per hour, where each record comprises thousands of multimodal measurements, text notes, and high-resolution data. Collectively, these records can approach a terabyte of data. The system can also perform cohort generation and preliminary dataset analysis in 2-5 minutes. As a result, multiple users can collaborate simultaneously to iterate on datasets and models in real time. We anticipate that this approach will drive real-world AI model development and, in the long run, meaningfully improve healthcare delivery.