MixRL: Data Mixing Augmentation for Regression using Reinforcement Learning

Data augmentation is becoming essential for improving regression accuracy in critical applications including manufacturing and finance. Existing data augmentation techniques largely focus on classification tasks and do not readily apply to regression tasks. In particular, the recent Mixup techniques for classification rely on the key assumption that linearity holds among training examples, which is reasonable if the label space is discrete, but has limitations when the label space is continuous, as in regression. We show that mixing examples that have either a large data distance or a large label distance may have an increasingly negative effect on model performance. Hence, for regression we use the stricter assumption that linearity holds only within certain data or label distances, where the extent may vary from example to example. We then propose MixRL, a data augmentation meta-learning framework for regression that uses a small validation set to learn, for each example, how many nearest neighbors it should be mixed with for the best model performance. MixRL achieves these objectives using Monte Carlo policy gradient reinforcement learning. Our experiments on both synthetic and real datasets show that MixRL significantly outperforms state-of-the-art data augmentation baselines. MixRL can also be integrated with other classification Mixup techniques for better results.
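
The idea of mixing each example only with its nearest neighbors can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `knn_mixup`, the `alpha` parameter of the Beta distribution, and the use of Euclidean distance on the inputs only (rather than data or label distance) are assumptions made for illustration. The per-example neighbor counts `k` are passed in by hand here, whereas MixRL learns them with policy gradient reinforcement learning.

```python
import numpy as np

def knn_mixup(X, y, k, alpha=2.0, seed=None):
    """Mix each example with one of its k nearest neighbors.

    Hypothetical helper, not the MixRL code. `k` is an int or a
    per-example array of neighbor counts (0 means "do not mix").
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)   # shape (n, d)
    y = np.asarray(y, dtype=float)   # shape (n,)
    n = X.shape[0]
    k = np.broadcast_to(np.asarray(k, dtype=int), (n,))

    # Pairwise squared Euclidean distances in input space.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)      # never pick the example itself
    order = np.argsort(d2, axis=1)    # neighbors sorted by distance

    X_mix, y_mix = [], []
    for i in range(n):
        kk = min(int(k[i]), n - 1)
        if kk <= 0:
            continue                  # this example is not mixed
        j = order[i, rng.integers(0, kk)]  # random one of the kk nearest
        lam = rng.beta(alpha, alpha)       # Mixup interpolation weight
        X_mix.append(lam * X[i] + (1 - lam) * X[j])
        y_mix.append(lam * y[i] + (1 - lam) * y[j])
    return np.array(X_mix), np.array(y_mix)

# Usage: a distant outlier (x = 10) is only ever mixed with its nearest neighbor.
X = np.array([[0.0], [1.0], [2.0], [10.0]])
y = 2.0 * X[:, 0]
X_aug, y_aug = knn_mixup(X, y, k=1, seed=0)
```

Setting a small `k` for an example restricts mixing to nearby points, where the linearity assumption is more plausible; a large `k` approaches standard Mixup over the whole training set.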
