Transformation invariances are present in many real-world problems. For example, image classification is usually invariant to rotation and color transformation: a rotated car in a different color is still identified as a car. Data augmentation, which adds the transformed data to the training set and trains a model on the augmented data, is one commonly used technique to build these invariances into the learning process. However, it is unclear how data augmentation performs theoretically and what the optimal algorithm is in the presence of transformation invariances. In this paper, we study PAC learnability under transformation invariances in three settings according to different levels of realizability: (i) a hypothesis fits the augmented data; (ii) a hypothesis fits only the original data and the transformed data lying in the support of the data distribution; (iii) the agnostic case. One interesting observation is that distinguishing between the original data and the transformed data is necessary to achieve optimal accuracy in settings (ii) and (iii), which implies that any algorithm not differentiating between the original and transformed data (including data augmentation) is not optimal. Furthermore, this type of algorithm can even "harm" the accuracy. In setting (i), although it is unnecessary to distinguish between the two data sets, data augmentation still does not perform optimally. Due to this difference, we propose two combinatorial measures characterizing the optimal sample complexity in setting (i) and in settings (ii)/(iii), and provide the optimal algorithms.
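The data augmentation procedure the abstract analyzes can be sketched as follows. This is a minimal illustrative example, not the paper's algorithm: the data points, the `rotate` transformation group (the four 90-degree planar rotations), and the function names are all hypothetical choices made for concreteness.

```python
# Minimal sketch of data augmentation under a transformation group.
# Hypothetical setup: each example is a 2D point, and the invariance
# group is the four 90-degree rotations. Augmentation adds every
# transformed copy of each training example, with the original label,
# to the training set.

def rotate(point, k):
    """Rotate a 2D point by k * 90 degrees counterclockwise."""
    x, y = point
    for _ in range(k % 4):
        x, y = -y, x
    return (x, y)

def augment(dataset, group_size=4):
    """Return the dataset augmented with all group transformations.

    `dataset` is a list of (point, label) pairs. A learner trained on
    the result cannot tell original examples from transformed ones --
    exactly the property the abstract argues is suboptimal in
    settings (ii) and (iii).
    """
    return [(rotate(x, k), y)
            for (x, y) in dataset
            for k in range(group_size)]

data = [((1, 0), "car"), ((0, 2), "cat")]
augmented = augment(data)
# Each original example yields 4 copies, so 2 examples become 8.
```

Note that `augment` discards the original/transformed distinction by construction: every pair in its output looks identical to the learner, which is what makes augmentation-style algorithms provably non-optimal in the paper's settings (ii) and (iii).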