Boosting Synthetic Data Generation with Effective Nonlinear Causal Discovery - Zhuanzhi Papers

18 January 2023

Martina Cinquini,Fosca Giannotti,Riccardo Guidotti

Synthetic data generation has been widely adopted in software testing, data privacy, imbalanced learning, and artificial intelligence explanation. In all such contexts, it is crucial to generate plausible data samples. A common assumption of widely used data generation approaches is that the features are independent. Typically, however, the variables of a dataset depend on one another, and ignoring these dependencies during generation leads to implausible records. The main problem is that dependencies among variables are typically unknown. In this paper, we design a synthetic dataset generator for tabular data that can discover nonlinear causalities among the variables and use them at generation time. State-of-the-art methods for nonlinear causal discovery are typically inefficient. We boost them by restricting causal discovery to the features appearing in the frequent patterns efficiently retrieved by a pattern mining algorithm. We design a framework for generating synthetic datasets with known causalities to validate our proposal. Broad experimentation on many synthetic and real datasets with known causalities shows the effectiveness of the proposed method.
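The abstract's central idea, running causal discovery only on features that co-occur in frequent patterns, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the pattern mining algorithm or the causal discovery method, so the code below assumes equal-width binning and brute-force counting of frequent 2-itemsets, and it stops at producing candidate feature pairs. The function names `discretize` and `candidate_feature_pairs`, the `n_bins` and `min_support` parameters, and the toy data are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def discretize(rows, n_bins=2):
    """Turn each numeric row into a set of (feature_index, bin) items.

    Assumption: equal-width bins per feature; the paper may discretize differently.
    """
    n_features = len(rows[0])
    lows = [min(r[j] for r in rows) for j in range(n_features)]
    highs = [max(r[j] for r in rows) for j in range(n_features)]
    transactions = []
    for r in rows:
        items = set()
        for j, v in enumerate(r):
            # Fall back to width 1.0 for constant features to avoid dividing by zero.
            width = (highs[j] - lows[j]) / n_bins or 1.0
            b = min(int((v - lows[j]) / width), n_bins - 1)
            items.add((j, b))
        transactions.append(items)
    return transactions


def candidate_feature_pairs(transactions, min_support):
    """Return feature pairs that appear together in at least one frequent 2-itemset.

    Only these pairs would be handed to a (costly) nonlinear causal
    discovery routine, which is deliberately left out of this sketch.
    """
    counts = Counter()
    for t in transactions:
        for a, b in combinations(sorted(t), 2):
            counts[(a, b)] += 1
    n = len(transactions)
    return {(a[0], b[0]) for (a, b), c in counts.items() if c / n >= min_support}


# Toy data: feature 1 is a nonlinear function of feature 0; feature 2 is unrelated.
rows = [(i / 19, (i / 19) ** 2, float(i % 2)) for i in range(20)]
pairs = candidate_feature_pairs(discretize(rows), min_support=0.4)
print(pairs)
```

On this toy data only the pair (0, 1) survives the support threshold, so a downstream causal discovery step would test one pair instead of three. The threshold and bin count strongly affect which pairs are kept and would need tuning on real data.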

