Real-world datasets often contain missing values arising from complex generative processes, where the cause of the missingness may not be fully observed. This is known as missing not at random (MNAR) data. However, many imputation methods do not account for the missingness mechanism, resulting in biased imputation values when MNAR data are present. Although a few methods do consider the MNAR scenario, their models' identifiability under MNAR is generally not guaranteed. That is, the model parameters cannot be uniquely determined even with infinitely many data samples, so the imputation results given by such models can still be biased. This issue is especially overlooked by many modern deep generative models. In this work, we fill this gap by systematically analyzing the identifiability of generative models under MNAR. Furthermore, we propose a practical deep generative model that can provide identifiability guarantees under mild assumptions for a wide range of MNAR mechanisms. Our method demonstrates a clear advantage on tasks with both synthetic data and multiple real-world MNAR scenarios.
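As a minimal illustration of the bias claim (this toy simulation is not the proposed model; the self-masking mechanism and all parameters are assumptions made purely for demonstration), consider a variable whose larger values are more likely to be missing. Any imputation that ignores this mechanism, such as mean imputation, inherits the bias of the observed values:

```python
# Toy MNAR simulation: P(missing) depends on the value of x itself (self-masking),
# so statistics computed from the observed entries are biased, and mechanism-agnostic
# imputation (here: mean imputation) cannot correct that bias.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

x = rng.normal(loc=0.0, scale=1.0, size=n)      # complete data, true mean = 0
p_missing = 1.0 / (1.0 + np.exp(-2.0 * x))      # MNAR: larger x -> more likely missing
missing = rng.random(n) < p_missing

observed = x[~missing]
print(f"true mean of x:             {x.mean():+.3f}")         # approximately 0
print(f"mean of observed values:    {observed.mean():+.3f}")  # clearly negative (biased)

# Mean imputation ignores the missingness mechanism, so the "completed"
# dataset remains biased toward the preferentially observed (small) values.
x_imputed = np.where(missing, observed.mean(), x)
print(f"mean after mean imputation: {x_imputed.mean():+.3f}")
```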