Feature Selection Using Reinforcement Learning

With the decreasing cost of data collection, the space of variables or features that can be used to characterize a particular predictor of interest continues to grow exponentially. Identifying the most characteristic features, those that minimize the variance without jeopardizing the bias of our models, is therefore critical to successfully training a machine learning model. Identifying such features is also critical for interpretability, prediction accuracy, and computation cost. While statistical methods such as subset selection, shrinkage, and dimensionality reduction have been applied to select the best set of features, other approaches in the literature treat feature selection as a search problem in which each state in the search space is a possible feature subset. In this paper, we solve the feature selection problem using Reinforcement Learning. Formulating the state space as a Markov Decision Process (MDP), we use a Temporal Difference (TD) algorithm to select the best subset of features. Each state is evaluated using a robust, low-cost classifier that can handle any non-linearities in the dataset.
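The formulation described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not name the classifier, the reward, or the TD variant, so everything specific here is an assumption. States are feature subsets, an action adds one feature, the reward is the change in hold-out accuracy, the state values are learned with tabular TD(0), and a nearest-centroid classifier stands in for the "robust and low cost classifier". The names `make_data`, `score`, and `td_feature_selection`, the toy dataset, and all hyperparameters are hypothetical.

```python
import random

def make_data(n=200, seed=0):
    """Toy data (assumption): features 0 and 1 carry the label, features 2-4 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        shift = 2.0 if label else -2.0
        row = [shift + rng.gauss(0, 1), shift + rng.gauss(0, 1)]
        row += [rng.gauss(0, 3) for _ in range(3)]
        X.append(row)
        y.append(label)
    return X, y

def score(subset, X, y):
    """Hold-out accuracy of a nearest-centroid classifier restricted to `subset`."""
    if not subset:
        return 0.5  # assumed chance level for roughly balanced binary labels
    cols = sorted(subset)
    half = len(X) // 2  # first half trains the centroids, second half is held out
    centroids = {}
    for label in (0, 1):
        rows = [X[i] for i in range(half) if y[i] == label]
        centroids[label] = [sum(r[c] for r in rows) / len(rows) for c in cols]
    correct = 0
    for i in range(half, len(X)):
        dist = {lab: sum((X[i][c] - m) ** 2 for c, m in zip(cols, cen))
                for lab, cen in centroids.items()}
        correct += min(dist, key=dist.get) == y[i]
    return correct / (len(X) - half)

def td_feature_selection(X, y, n_features, max_size=2, episodes=300,
                         alpha=0.2, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular TD(0) over feature subsets; an episode ends at `max_size` features."""
    rng = random.Random(seed)
    V = {}      # state-value table, keyed by frozenset of feature indices
    cache = {}  # memoised classifier scores, one per visited subset

    def evaluate(s):
        if s not in cache:
            cache[s] = score(s, X, y)
        return cache[s]

    def successors(s):
        return [s | {f} for f in range(n_features) if f not in s]

    def lookahead(s, nxt):
        # One-step return: immediate accuracy gain plus discounted value estimate.
        return evaluate(nxt) - evaluate(s) + gamma * V.get(nxt, 0.0)

    for _ in range(episodes):
        state = frozenset()
        while len(state) < max_size:
            options = successors(state)
            if rng.random() < epsilon:
                nxt = rng.choice(options)  # explore
            else:
                nxt = max(options, key=lambda s: lookahead(state, s))  # exploit
            # TD(0) update: V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s))
            target = lookahead(state, nxt)
            V[state] = V.get(state, 0.0) + alpha * (target - V.get(state, 0.0))
            state = nxt

    # Greedy rollout with the learned values to read off a subset.
    state = frozenset()
    while len(state) < max_size:
        state = max(successors(state), key=lambda s: lookahead(state, s))
    return state, evaluate(state)

X, y = make_data()
best, acc = td_feature_selection(X, y, n_features=5)
print(sorted(best), acc)
```

Caching the classifier score per subset keeps the search cheap, since each distinct subset is scored only once no matter how many episodes visit it. A real application would replace the toy data and the stand-in classifier, and would likely need a stopping rule other than a fixed subset size.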


Related content

Feature selection


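Feature selection, also called feature subset selection (FSS) or attribute selection, means choosing N of the M available features so that a specific metric of the system is optimized. It is the process of selecting the most effective features from the original ones in order to reduce the dimensionality of the dataset. It is an important means of improving the performance of learning algorithms and a key data preprocessing step in pattern recognition. For a learning algorithm, good training samples are the key to training a model.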
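The definition above can be written as an optimization over subsets. The symbols $F$ (the full feature set), $S$ (a candidate subset), and $J$ (the metric being optimized) are notation introduced here for illustration, not taken from the source.

```latex
S^{*} = \operatorname*{arg\,max}_{S \subseteq F,\; |S| = N} J(S),
\qquad |F| = M,
\qquad \#\{\text{candidate subsets}\} = \binom{M}{N}
```

The binomial count shows why exhaustive evaluation quickly becomes impractical: for M = 50 and N = 10 there are already more than 10^10 candidate subsets.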
