Handling Non-ignorably Missing Features in Electronic Health Records Data Using Importance-Weighted Autoencoders

Electronic Health Records (EHRs) are commonly used to investigate relationships between patient health information and outcomes. Deep learning methods are emerging as powerful tools for learning such relationships, given the characteristically high dimension and large sample size of EHR datasets. The Physionet 2012 Challenge involves an EHR dataset pertaining to 12,000 ICU patients, in which researchers investigated the relationships between clinical measurements and in-hospital mortality. However, the prevalence and complexity of missing data in the Physionet data present significant challenges for the application of deep learning methods such as Variational Autoencoders (VAEs). Although a rich literature exists on the treatment of missing data in traditional statistical models, it is unclear how this extends to deep learning architectures. To address these issues, we propose a novel extension of Importance-Weighted Autoencoders (IWAEs), a variant of VAEs, to flexibly handle Missing Not At Random (MNAR) patterns in the Physionet data. Our proposed method models the missingness mechanism using an embedded neural network, eliminating the need to specify the exact form of the missingness mechanism a priori. We show that our method leads to more realistic imputed values relative to state-of-the-art methods, as well as significant differences in fitted downstream models for mortality.
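As a rough illustration of the importance-weighted objective that IWAEs are built on, the sketch below computes the bound from a matrix of log importance weights and compares it with the ordinary single-sample ELBO. It is a minimal NumPy sketch under stated assumptions, not the method proposed in the abstract: the function names (`log_mean_exp`, `iwae_bound`, `elbo`) are hypothetical, and the log weights are passed in directly rather than produced by encoder, decoder, or missingness networks. In a missing-data setting one might expect each log weight to combine the log-likelihood of the observed features, the log-probability of the missingness mask, and the prior, minus the proposal density, but that decomposition is our assumption and is not spelled out in the text.

```python
import numpy as np

def log_mean_exp(log_w, axis=0):
    """Numerically stable log(mean(exp(log_w))) along `axis`."""
    m = np.max(log_w, axis=axis, keepdims=True)  # subtract the max to avoid overflow
    return np.squeeze(m, axis=axis) + np.log(np.mean(np.exp(log_w - m), axis=axis))

def iwae_bound(log_w):
    """Importance-weighted bound, averaged over observations.

    log_w has shape (K, N): K importance samples for each of N observations.
    """
    return float(np.mean(log_mean_exp(log_w, axis=0)))

def elbo(log_w):
    """Standard ELBO estimate from the same log weights (mean of the logs)."""
    return float(np.mean(log_w))

# Toy log weights (assumed values, for illustration only).
toy_log_w = np.log(np.array([[1.0, 2.0],
                             [3.0, 2.0]]))
print("IWAE bound:", iwae_bound(toy_log_w))
print("ELBO:      ", elbo(toy_log_w))
```

By Jensen's inequality the importance-weighted bound is never smaller than the ELBO computed from the same weights, and the two coincide when K = 1 or when all weights for an observation are equal.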


