Human-designed data augmentation strategies have been replaced by automatically learned augmentation policies in the past two years. Specifically, recent work has empirically shown that the superior performance of automated data augmentation methods stems from increasing the diversity of the augmented data \cite{autoaug, randaug}. However, two aspects of this diversity remain missing: 1) an explicit definition (and thus a measurement) of diversity and 2) a quantifiable relationship between diversity and its regularization effect. To bridge this gap, we propose a diversity measure called Variance Diversity and theoretically show that the regularization effect of data augmentation is guaranteed by Variance Diversity. We validate in experiments that the relative gain in test accuracy from automated data augmentation is highly correlated with Variance Diversity. We then design an unsupervised, sampling-based framework, \textbf{DivAug}, to directly maximize Variance Diversity and hence strengthen the regularization effect. Without requiring a separate search process, DivAug achieves a performance gain comparable to that of state-of-the-art methods, with better efficiency. Moreover, under the semi-supervised setting, our framework further improves the performance of semi-supervised learning algorithms compared to RandAugment, making it highly applicable to real-world problems where labeled data is scarce. The code is available at \texttt{\url{https://github.com/warai-0toko/DivAug}}.