Unsupervised Imputation of Non-ignorably Missing Data Using Importance-Weighted Autoencoders - 专知论文


Learning · Autoencoder · Unsupervised · Composite data · Variational autoencoder

June 17, 2022

Unsupervised Imputation of Non-ignorably Missing Data Using Importance-Weighted Autoencoders


David K. Lim, Naim U. Rashid, Junier B. Oliva, Joseph G. Ibrahim

from arXiv, 31 pages, 4 figures, 2 tables, under review (Biometrics Methodology)

Deep Learning (DL) methods have dramatically increased in popularity in recent years. While their initial success was demonstrated in the classification and manipulation of image data, there has been significant growth in the application of DL methods to problems in the biomedical sciences. However, the greater prevalence and complexity of missing data in biomedical datasets present significant challenges for DL methods. Here, we provide a formal treatment of missing data in the context of Variational Autoencoders (VAEs), a popular unsupervised DL architecture commonly used for dimension reduction, imputation, and learning latent representations of complex data. We propose a new VAE architecture, NIMIWAE, that is one of the first to flexibly account for both ignorable and non-ignorable patterns of missingness in input features at training time. Following training, samples drawn from the approximate posterior distribution of the missing data can be used for multiple imputation, facilitating downstream analyses on high-dimensional incomplete datasets. We demonstrate through statistical simulation that our method outperforms existing approaches on unsupervised learning tasks and in imputation accuracy. We conclude with a case study of an electronic health record (EHR) dataset pertaining to 12,000 ICU patients containing a large number of diagnostic measurements and clinical outcomes, where many features are only partially observed.
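The distinction between ignorable and non-ignorable missingness mentioned in the abstract can be illustrated with a small simulation. The sketch below is not the NIMIWAE method and is not taken from the paper. It only shows, under assumed settings, why a missingness mechanism that depends on the unobserved values themselves cannot be safely ignored. The function names (`observed_mean`, `simulate`), the standard normal data, the 50% drop rate, and the logistic missingness model are all illustrative assumptions.

```python
import math
import random


def observed_mean(values, missing_prob, rng):
    """Mean of the entries that survive a per-value missingness mechanism.

    missing_prob(v) returns the probability that value v is masked out.
    """
    kept = [v for v in values if rng.random() >= missing_prob(v)]
    return sum(kept) / len(kept)


def simulate(n=20000, seed=0):
    """Compare observed means under ignorable and non-ignorable missingness."""
    rng = random.Random(seed)
    # Assumed data-generating process: standard normal, true mean 0.
    values = [rng.gauss(0.0, 1.0) for _ in range(n)]

    # Ignorable case (missing completely at random): every value has the
    # same 50% chance of being dropped, regardless of its magnitude.
    mcar = observed_mean(values, lambda v: 0.5, rng)

    # Non-ignorable case (missing not at random): the chance of being
    # dropped grows with the value itself via an assumed logistic model.
    mnar = observed_mean(values, lambda v: 1.0 / (1.0 + math.exp(-2.0 * v)), rng)
    return mcar, mnar


mcar_mean, mnar_mean = simulate()
print(f"Observed mean, ignorable missingness:     {mcar_mean:+.3f}")
print(f"Observed mean, non-ignorable missingness: {mnar_mean:+.3f}")
```

Under these assumptions, the observed mean stays close to the true mean of zero in the ignorable case, while in the non-ignorable case it is pulled toward negative values because large values are preferentially masked. A model that ignores the missingness mechanism would inherit that bias, which is the motivation for modeling the mechanism at training time.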

