With the rapid growth of memory and computing power, datasets are becoming increasingly complex and imbalanced. The problem is especially severe in clinical data, where a single rare event may occur among many majority-class cases. We introduce a classification framework, based on reinforcement learning, for training on extremely imbalanced datasets, and extend it to multi-class settings. We combine dueling and double deep Q-learning architectures, and formulate a custom reward function and episode-training procedure designed to handle multi-class imbalanced training. Using real-world clinical case studies, we demonstrate that the proposed framework outperforms current state-of-the-art imbalanced learning methods, achieving fairer and more balanced classification while significantly improving the prediction of minority classes.
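As a rough illustration of the ingredients named above, the following minimal sketch shows the standard dueling aggregation of Q-values, the standard double deep Q-learning target, and one possible class-weighted reward for multi-class imbalance. The function names, the discount factor, and in particular the reward rule (weights inversely proportional to class frequency) are assumptions made for illustration; the abstract does not specify the authors' actual reward function or training procedure.

```python
import numpy as np

def dueling_q(value, advantage):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).

    value has shape [batch, 1]; advantage has shape [batch, n_classes].
    """
    return value + advantage - advantage.mean(axis=1, keepdims=True)

def double_dqn_target(reward, done, q_online_next, q_target_next, gamma=0.9):
    """Double DQN target: the online network selects the next action,
    the target network evaluates it. gamma is an assumed discount factor."""
    best = q_online_next.argmax(axis=1)
    next_q = q_target_next[np.arange(len(best)), best]
    return reward + gamma * (1.0 - done) * next_q

def class_weighted_reward(y_true, y_pred, class_counts):
    """Hypothetical reward (not taken from the paper): +w for a correct
    prediction and -w for a wrong one, where w = min_count / count[y_true],
    so mistakes on rare classes cost more than mistakes on common ones."""
    weights = class_counts.min() / class_counts
    w = weights[y_true]
    return np.where(y_true == y_pred, w, -w)

# Example: three classes with 900, 90, and 10 training samples.
counts = np.array([900, 90, 10])
rewards = class_weighted_reward(np.array([0, 2, 2]), np.array([0, 2, 1]), counts)
```

In this sketch each sample is treated as a state and each predicted class as an action, so a misclassified minority sample (class 2 here) receives the largest penalty.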