Imbalanced learning is a fundamental challenge in data mining, where the numbers of training samples across classes are highly disproportionate. Over-sampling is an effective technique for tackling imbalanced learning by generating synthetic samples for the minority class. While numerous over-sampling algorithms have been proposed, they rely heavily on heuristics, which can be sub-optimal because different datasets and base classifiers may call for different sampling strategies, and the heuristics cannot directly optimize the performance metric. Motivated by this, we investigate a learning-based over-sampling algorithm that optimizes classification performance, which is a challenging task because of the huge and hierarchical decision space. At the high level, we need to decide how many synthetic samples to generate. At the low level, we need to determine where the synthetic samples should be located, which depends on the high-level decision since the optimal locations may differ for different numbers of samples. To address these challenges, we propose AutoSMOTE, an automated over-sampling algorithm that jointly optimizes the different levels of decisions. Motivated by the success of SMOTE~\cite{chawla2002smote} and its extensions, we formulate the generation process as a Markov decision process (MDP) consisting of three levels of policies that generate synthetic samples within the SMOTE search space. We then leverage deep hierarchical reinforcement learning to optimize the performance metric on the validation data. Extensive experiments on six real-world datasets demonstrate that AutoSMOTE significantly outperforms state-of-the-art resampling algorithms. The code is available at https://github.com/daochenzha/autosmote
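To make the "SMOTE search space" mentioned above concrete, the sketch below shows the classic SMOTE-style generation step: a synthetic sample is an interpolation between a minority instance and one of its nearest minority neighbors. This is only an illustrative sketch of the search space, not AutoSMOTE's implementation; the function name, parameters, and random choices here stand in for the decisions (how many samples, which instance, which neighbor, which interpolation weight) that AutoSMOTE's hierarchical policies learn to make.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def smote_style_samples(X_min, n_new, k=5, random_state=0):
    """Generate n_new synthetic samples by interpolating minority points.

    Illustrative sketch only: the sampling choices made randomly here are
    the ones a learned over-sampling policy would decide.
    """
    rng = np.random.default_rng(random_state)
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    _, idx = nn.kneighbors(X_min)           # idx[:, 0] is the point itself
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))        # pick a minority instance
        j = idx[i, rng.integers(1, k + 1)]  # pick one of its k neighbors
        lam = rng.random()                  # interpolation weight in [0, 1]
        synthetic.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.asarray(synthetic)
```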