MoCoDA: Model-based Counterfactual Data Augmentation - Zhuanzhi Papers

October 20, 2022

MoCoDA: Model-based Counterfactual Data Augmentation

Silviu Pitis, Elliot Creager, Ajay Mandlekar, Animesh Garg

From arXiv. In Proceedings of NeurIPS 2022. 10 pages (+3 references, +10 appendix). Code available at https://github.com/spitis/mocoda

The number of states in a dynamic process is exponential in the number of objects, making reinforcement learning (RL) difficult in complex, multi-object domains. For agents to scale to the real world, they will need to react to and reason about unseen combinations of objects. We argue that the ability to recognize and use local factorization in transition dynamics is a key element in unlocking the power of multi-object reasoning. To this end, we show that (1) known local structure in the environment transitions is sufficient for an exponential reduction in the sample complexity of training a dynamics model, and (2) a locally factored dynamics model provably generalizes out-of-distribution to unseen states and actions. Knowing the local structure also allows us to predict which unseen states and actions this dynamics model will generalize to. We propose to leverage these observations in a novel Model-based Counterfactual Data Augmentation (MoCoDA) framework. MoCoDA applies a learned locally factored dynamics model to an augmented distribution of states and actions to generate counterfactual transitions for RL. MoCoDA works with a broader set of local structures than prior work and allows for direct control over the augmented training distribution. We show that MoCoDA enables RL agents to learn policies that generalize to unseen states and actions. We use MoCoDA to train an offline RL agent to solve an out-of-distribution robotics manipulation task on which standard offline RL algorithms fail.
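As a rough illustration of the idea described in the abstract, the toy sketch below fits a factored dynamics model on data where two factors only ever appear in certain combinations, then labels recombined (counterfactual) state-action pairs with that model. It is not the authors' implementation (see their repository for that). The two-factor environment, the linear least-squares models, the assumption that the factorization is known and holds globally, and every function name here are hypothetical choices made for the example.

```python
import numpy as np


def true_step(states, actions):
    """Toy ground-truth dynamics with two independent factors (an assumption).

    Factor 0: agent position, shifted by the action.
    Factor 1: object position, which decays toward the origin on its own.
    """
    next_agent = states[:, 0] + 0.5 * actions
    next_obj = 0.9 * states[:, 1]
    return np.column_stack([next_agent, next_obj])


def fit_linear(inputs, targets):
    """Least-squares fit of targets ~ [inputs, 1] @ weights."""
    design = np.column_stack([inputs, np.ones(len(inputs))])
    weights, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return weights


def predict_linear(weights, inputs):
    design = np.column_stack([inputs, np.ones(len(inputs))])
    return design @ weights


def fit_factored_model(states, actions, next_states):
    """Fit one small model per factor, each seeing only its assumed parents."""
    agent_parents = np.column_stack([states[:, 0], actions])
    w_agent = fit_linear(agent_parents, next_states[:, 0])
    w_obj = fit_linear(states[:, 1:2], next_states[:, 1])
    return w_agent, w_obj


def factored_predict(model, states, actions):
    w_agent, w_obj = model
    agent_parents = np.column_stack([states[:, 0], actions])
    next_agent = predict_linear(w_agent, agent_parents)
    next_obj = predict_linear(w_obj, states[:, 1:2])
    return np.column_stack([next_agent, next_obj])


def sample_counterfactual_parents(states, actions, n, rng):
    """Augmented distribution: resample each factor's parents independently."""
    i = rng.integers(len(states), size=n)  # rows supplying agent and action
    j = rng.integers(len(states), size=n)  # rows supplying the object
    cf_states = np.column_stack([states[i, 0], states[j, 1]])
    return cf_states, actions[i]


rng = np.random.default_rng(0)

# Empirical data: agent and object always lie on the same side of the origin.
side = rng.choice([-1.0, 1.0], size=200)
states = np.column_stack([
    side * rng.uniform(0.5, 1.5, size=200),
    side * rng.uniform(0.5, 1.5, size=200),
])
actions = rng.uniform(-1.0, 1.0, size=200)
next_states = true_step(states, actions)

model = fit_factored_model(states, actions, next_states)
cf_states, cf_actions = sample_counterfactual_parents(states, actions, 500, rng)
cf_next = factored_predict(model, cf_states, cf_actions)

# Combinations with agent and object on opposite sides never occur in the data.
unseen = np.sign(cf_states[:, 0]) != np.sign(cf_states[:, 1])
max_error = np.abs(cf_next - true_step(cf_states, cf_actions)).max()
print(f"unseen combinations: {unseen.sum()} of {len(unseen)}, max error: {max_error:.2e}")
```

Because each per-factor model only ever sees inputs within its own training marginal, the recombined samples are in-distribution for every factor even when the joint state is new. In this toy setting the augmented tuples `(cf_states, cf_actions, cf_next)` could then be added to an offline RL dataset; reward labeling is omitted here.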
