缺失增强: 改进生成插值模型的通用方法 (Missingness Augmentation: A General Approach for Improving Generative Imputation Models) - 专知论文

会员服务 ·

0

数据增广 · 损失 · 基础问题 · 样本 · 缺失数据 ·

2023 年 4 月 6 日

Missingness Augmentation: A General Approach for Improving Generative Imputation Models

翻译：缺失增强: 改进生成插值模型的通用方法

Yufeng Wang,Dan Li,Cong Xu,Min Yang

from arxiv, 20 pages

Missing data imputation is a fundamental problem in data analysis, and many studies have been conducted to improve its performance by exploring model structures and learning procedures. However, data augmentation, as a simple yet effective method, has not received enough attention in this area. In this paper, we propose a novel data augmentation method called Missingness Augmentation (MisA) for generative imputation models. Our approach dynamically produces incomplete samples at each epoch by utilizing the generator's output, constraining the augmented samples using a simple reconstruction loss, and combining this loss with the original loss to form the final optimization objective. As a general augmentation technique, MisA can be easily integrated into generative imputation frameworks, providing a simple yet effective way to enhance their performance. Experimental results demonstrate that MisA significantly improves the performance of many recently proposed generative imputation models on a variety of tabular and image datasets. The code is available at \url{https://github.com/WYu-Feng/Missingness-Augmentation}.

翻译：缺失数据插值是数据分析中的一项基础问题。过去的研究通过探索模型结构和学习过程来提高插值性能，但是数据增广作为一种简单但有效的方法，在该领域中却没有得到足够的关注。本文提出了一种新的数据增广方法——缺失增强（MisA），用于用于处理生成模型的插值问题。该方法利用生成器的输出动态产生不完整的样本，使用简单的重构损失约束增广样本，并将此损失与原始损失合并，形成最终的优化目标。作为一种通用的增广技术，MisA 可以轻松地集成到生成插值框架中，为提高性能提供了一种简单有效的方法。实验结果表明，MisA 显著提高了许多最近提出的生成插值模型在多个表格和图像数据集上的性能。该项目的代码可在 \url{https://github.com/WYu-Feng/Missingness-Augmentation} 上找到。

0

相关内容

数据增广

【干货书】数据分析优化，Optimization for Modern Data Analysis，117页pdf

【干货书】数据分析优化，Optimization for Modern Data Analysis，117页pdf

专知会员服务

63+阅读 · 2023年2月15日

【NeurIPS2022】隐空间变换解决GAN生成分布的非连续性问题

【NeurIPS2022】隐空间变换解决GAN生成分布的非连续性问题

专知会员服务

26+阅读 · 2022年11月30日

【CVPR 2022-UCSD&英伟达】GroupViT:从文本监督中产生语义分割，Semantic Segmentation Emerges from Text Supervision

【CVPR 2022-UCSD&英伟达】GroupViT:从文本监督中产生语义分割，Semantic Segmentation Emerges from Text Supervision

专知会员服务

12+阅读 · 2022年3月9日

【KDD2020】基于知识图谱的语义融合改进会话推荐系统，Improving Conversational Recommender Systems via Knowledge Graph based Semantic Fusion

【KDD2020】基于知识图谱的语义融合改进会话推荐系统，Improving Conversational Recommender Systems via Knowledge Graph based Semantic Fusion

专知会员服务

90+阅读 · 2020年7月9日

【AAAI 2020】InteractE: 通过增加特征交互来改进基于卷积的知识图谱嵌入， InteractE: Improving Convolution-based Knowledge Graph Embeddings by Increasing Feature Interactions

【AAAI 2020】InteractE: 通过增加特征交互来改进基于卷积的知识图谱嵌入， InteractE: Improving Convolution-based Knowledge Graph Embeddings by Increasing Feature Interactions

专知会员服务

53+阅读 · 2020年6月7日

【SIGIR2020】一个统一的双视图模型，用于具有不一致性损失的评论总结和情绪分类，A Unified Dual-view Model for Review Summarization and Sentiment Classification with Inconsistency Loss

【SIGIR2020】一个统一的双视图模型，用于具有不一致性损失的评论总结和情绪分类，A Unified Dual-view Model for Review Summarization and Sentiment Classification with Inconsistency Loss

专知会员服务

22+阅读 · 2020年6月3日

【ACL2020】对抗性文本生成，Improving Adversarial Text Generation

专知会员服务

52+阅读 · 2020年5月5日

【MIT】时间序列GAN，Subadditivity of Probability Divergences

专知会员服务

63+阅读 · 2020年3月4日

【清华大学】诊断和增强VAE模型，Diagnosing and Enhancing VAE Models

【清华大学】诊断和增强VAE模型，Diagnosing and Enhancing VAE Models

专知会员服务

37+阅读 · 2020年2月27日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

【NeurIPS2022】隐空间变换解决GAN生成分布的非连续性问题

【NeurIPS2022】隐空间变换解决GAN生成分布的非连续性问题

专知

0+阅读 · 2022年11月30日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

vae 相关论文表示学习 1

vae 相关论文表示学习 1

CreateAMind

12+阅读 · 2018年9月6日

【论文推荐】最新八篇生成对抗网络相关论文—BRE、图像合成、多模态图像生成、非配对多域图、注意力、对抗特征增强、深度对抗性训练

【论文推荐】最新八篇生成对抗网络相关论文—BRE、图像合成、多模态图像生成、非配对多域图、注意力、对抗特征增强、深度对抗性训练

专知

16+阅读 · 2018年5月14日

【论文推荐】最新5篇图像分割（Image Segmentation）相关论文—多重假设、超像素分割、自监督、图、生成对抗网络

【论文推荐】最新5篇图像分割（Image Segmentation）相关论文—多重假设、超像素分割、自监督、图、生成对抗网络

专知

27+阅读 · 2018年2月7日

Poisson流形上的修正Hamilton方法

国家自然科学基金

0+阅读 · 2014年12月31日

基于连续循环平移理论的Shearlet域稀疏表示SAR图像去噪算法研究

国家自然科学基金

0+阅读 · 2013年12月31日

超大三维模型的快速离散测地线算法研究

国家自然科学基金

0+阅读 · 2013年12月31日

薄势垒增强型AlGaN/GaN HEMT及可靠性研究

国家自然科学基金

0+阅读 · 2013年12月31日

功率变换器非线性不稳定行为的washout滤波器控制方法

国家自然科学基金

0+阅读 · 2012年12月31日

对象模型上交互式修复生成技术研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于list-mode数据的快速SART真3D PET断层重建算法的研究

国家自然科学基金

0+阅读 · 2011年12月31日

de novo预测蛋白质结构的并行元启发方法研究

国家自然科学基金

0+阅读 · 2009年12月31日

基于Sparse-Land模型的SAR图像噪声抑制与分割

国家自然科学基金

0+阅读 · 2009年12月31日

改进FFT动态信号分析方法及在电力谐波检测中的应用

国家自然科学基金

0+阅读 · 2008年12月31日

Federated Multi-organ Segmentation with Inconsistent Labels

Arxiv

0+阅读 · 2023年5月25日

DDDM-VC: Decoupled Denoising Diffusion Models with Disentangled Representation and Prior Mixup for Verified Robust Voice Conversion

Arxiv

0+阅读 · 2023年5月25日

Robust Classification via a Single Diffusion Model

Arxiv

0+阅读 · 2023年5月24日

DuDGAN: Improving Class-Conditional GANs via Dual-Diffusion

Arxiv

0+阅读 · 2023年5月24日

SKED: Sketch-guided Text-based 3D Editing

Arxiv

0+阅读 · 2023年5月23日

NCHO: Unsupervised Learning for Neural 3D Composition of Humans and Objects

Arxiv

0+阅读 · 2023年5月23日

Generating Data for Symbolic Language with Large Language Models

Arxiv

0+阅读 · 2023年5月23日

Understanding Diffusion Models: A Unified Perspective

Arxiv

14+阅读 · 2022年8月25日

Model-Contrastive Federated Learning

Arxiv

10+阅读 · 2021年3月30日

Feature Denoising for Improving Adversarial Robustness

Feature Denoising for Improving Adversarial Robustness

Arxiv

15+阅读 · 2018年12月9日

VIP会员

文章信息

相关主题

相关VIP内容

【干货书】数据分析优化，Optimization for Modern Data Analysis，117页pdf

【干货书】数据分析优化，Optimization for Modern Data Analysis，117页pdf

专知会员服务

63+阅读 · 2023年2月15日

【NeurIPS2022】隐空间变换解决GAN生成分布的非连续性问题

【NeurIPS2022】隐空间变换解决GAN生成分布的非连续性问题

专知会员服务

26+阅读 · 2022年11月30日

【CVPR 2022-UCSD&英伟达】GroupViT:从文本监督中产生语义分割，Semantic Segmentation Emerges from Text Supervision

【CVPR 2022-UCSD&英伟达】GroupViT:从文本监督中产生语义分割，Semantic Segmentation Emerges from Text Supervision

专知会员服务

12+阅读 · 2022年3月9日

【KDD2020】基于知识图谱的语义融合改进会话推荐系统，Improving Conversational Recommender Systems via Knowledge Graph based Semantic Fusion

【KDD2020】基于知识图谱的语义融合改进会话推荐系统，Improving Conversational Recommender Systems via Knowledge Graph based Semantic Fusion

专知会员服务

90+阅读 · 2020年7月9日

【AAAI 2020】InteractE: 通过增加特征交互来改进基于卷积的知识图谱嵌入， InteractE: Improving Convolution-based Knowledge Graph Embeddings by Increasing Feature Interactions

【AAAI 2020】InteractE: 通过增加特征交互来改进基于卷积的知识图谱嵌入， InteractE: Improving Convolution-based Knowledge Graph Embeddings by Increasing Feature Interactions

专知会员服务

53+阅读 · 2020年6月7日

【SIGIR2020】一个统一的双视图模型，用于具有不一致性损失的评论总结和情绪分类，A Unified Dual-view Model for Review Summarization and Sentiment Classification with Inconsistency Loss

【SIGIR2020】一个统一的双视图模型，用于具有不一致性损失的评论总结和情绪分类，A Unified Dual-view Model for Review Summarization and Sentiment Classification with Inconsistency Loss

专知会员服务

22+阅读 · 2020年6月3日

【ACL2020】对抗性文本生成，Improving Adversarial Text Generation

专知会员服务

52+阅读 · 2020年5月5日

【MIT】时间序列GAN，Subadditivity of Probability Divergences

专知会员服务

63+阅读 · 2020年3月4日

【清华大学】诊断和增强VAE模型，Diagnosing and Enhancing VAE Models

【清华大学】诊断和增强VAE模型，Diagnosing and Enhancing VAE Models

专知会员服务

37+阅读 · 2020年2月27日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

热门VIP内容

开通专知VIP会员享更多权益服务

【博士论文】低维与高维空间中潜在表征的分析、建模与变换

《生态建模密码破译：建模与编程实践》美陆军最新报告

大模型解决方案白皮书：社交陪伴场景全流程落地指南

面向具身操作的视觉-语言-动作模型综述

相关资讯

【NeurIPS2022】隐空间变换解决GAN生成分布的非连续性问题

【NeurIPS2022】隐空间变换解决GAN生成分布的非连续性问题

专知

0+阅读 · 2022年11月30日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

vae 相关论文表示学习 1

vae 相关论文表示学习 1

CreateAMind

12+阅读 · 2018年9月6日

【论文推荐】最新八篇生成对抗网络相关论文—BRE、图像合成、多模态图像生成、非配对多域图、注意力、对抗特征增强、深度对抗性训练

【论文推荐】最新八篇生成对抗网络相关论文—BRE、图像合成、多模态图像生成、非配对多域图、注意力、对抗特征增强、深度对抗性训练

专知

16+阅读 · 2018年5月14日

【论文推荐】最新5篇图像分割（Image Segmentation）相关论文—多重假设、超像素分割、自监督、图、生成对抗网络

【论文推荐】最新5篇图像分割（Image Segmentation）相关论文—多重假设、超像素分割、自监督、图、生成对抗网络

专知

27+阅读 · 2018年2月7日

相关论文

Federated Multi-organ Segmentation with Inconsistent Labels

Arxiv

0+阅读 · 2023年5月25日

DDDM-VC: Decoupled Denoising Diffusion Models with Disentangled Representation and Prior Mixup for Verified Robust Voice Conversion

Arxiv

0+阅读 · 2023年5月25日

Robust Classification via a Single Diffusion Model

Arxiv

0+阅读 · 2023年5月24日

DuDGAN: Improving Class-Conditional GANs via Dual-Diffusion

Arxiv

0+阅读 · 2023年5月24日

SKED: Sketch-guided Text-based 3D Editing

Arxiv

0+阅读 · 2023年5月23日

NCHO: Unsupervised Learning for Neural 3D Composition of Humans and Objects

Arxiv

0+阅读 · 2023年5月23日

Generating Data for Symbolic Language with Large Language Models

Arxiv

0+阅读 · 2023年5月23日

Understanding Diffusion Models: A Unified Perspective

Arxiv

14+阅读 · 2022年8月25日

Model-Contrastive Federated Learning

Arxiv

10+阅读 · 2021年3月30日

Feature Denoising for Improving Adversarial Robustness

Feature Denoising for Improving Adversarial Robustness

Arxiv

15+阅读 · 2018年12月9日

相关基金

Poisson流形上的修正Hamilton方法

国家自然科学基金

0+阅读 · 2014年12月31日

基于连续循环平移理论的Shearlet域稀疏表示SAR图像去噪算法研究

国家自然科学基金

0+阅读 · 2013年12月31日

超大三维模型的快速离散测地线算法研究

国家自然科学基金

0+阅读 · 2013年12月31日

薄势垒增强型AlGaN/GaN HEMT及可靠性研究

国家自然科学基金

0+阅读 · 2013年12月31日

功率变换器非线性不稳定行为的washout滤波器控制方法

国家自然科学基金

0+阅读 · 2012年12月31日

对象模型上交互式修复生成技术研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于list-mode数据的快速SART真3D PET断层重建算法的研究

国家自然科学基金

0+阅读 · 2011年12月31日

de novo预测蛋白质结构的并行元启发方法研究

国家自然科学基金

0+阅读 · 2009年12月31日

基于Sparse-Land模型的SAR图像噪声抑制与分割

国家自然科学基金

0+阅读 · 2009年12月31日

改进FFT动态信号分析方法及在电力谐波检测中的应用

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员