Automated Feature Engineering (AFE) refers to automatically generating and selecting optimal feature sets for downstream tasks, and it has achieved great success in real-world applications. Current AFE methods mainly focus on improving the effectiveness of the produced features but ignore the low efficiency that hinders large-scale deployment. In this work, we therefore propose a generic framework to improve the efficiency of AFE. Specifically, we construct the AFE pipeline in a reinforcement learning setting, where each feature is assigned an agent that performs feature transformation and selection, and the evaluation score of the produced features on downstream tasks serves as the reward for updating the policy. We improve the efficiency of AFE from two perspectives. On the one hand, we develop a Feature Pre-Evaluation (FPE) model to reduce the sample size and feature size, the two main factors that undermine the efficiency of feature evaluation. On the other hand, we devise a two-stage policy training strategy that runs FPE on the pre-evaluation task to initialize the policy, which avoids training the policy from scratch. We conduct comprehensive experiments on 36 datasets covering both classification and regression tasks. The results show $2.9\%$ higher performance on average and 2x higher computational efficiency compared to state-of-the-art AFE methods.
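The per-feature agent loop described in the abstract can be illustrated with a minimal sketch. It is not the paper's method. The action set, the epsilon-greedy value update, and the correlation-based `proxy_score` are all assumptions chosen to keep the example dependency-free. Scoring on a random row subsample only gestures at the idea of reducing sample size, since the abstract does not specify the FPE model itself.

```python
import math
import random

# Hypothetical action set; the abstract does not list the actual transformations.
ACTIONS = {
    "keep": lambda x: x,
    "square": lambda x: x * x,
    "log1p_abs": lambda x: math.log1p(abs(x)),
    "drop": None,  # feature selection: discard the feature
}


def pearson(a, b):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / math.sqrt(va * vb)


def proxy_score(columns, target):
    """Assumed stand-in for a downstream score: mean |correlation| with the target."""
    if not columns:
        return 0.0
    return sum(abs(pearson(c, target)) for c in columns) / len(columns)


def apply_actions(features, actions, rows):
    """Transform or drop each feature column, restricted to the given row indices."""
    out = []
    for col, name in zip(features, actions):
        fn = ACTIONS[name]
        if fn is not None:
            out.append([fn(col[i]) for i in rows])
    return out


def train(features, target, episodes=200, sample_size=50, eps=0.2, lr=0.1, seed=0):
    """Return the greedy action per feature after epsilon-greedy training."""
    rng = random.Random(seed)
    names = list(ACTIONS)
    # One agent per feature: a table of estimated values, one per action.
    q = [{a: 0.0 for a in names} for _ in features]
    n_rows = len(target)
    for _ in range(episodes):
        # Score on a row subsample to keep each evaluation cheap.
        rows = rng.sample(range(n_rows), min(sample_size, n_rows))
        chosen = [
            rng.choice(names) if rng.random() < eps else max(agent, key=agent.get)
            for agent in q
        ]
        reward = proxy_score(
            apply_actions(features, chosen, rows), [target[i] for i in rows]
        )
        # All agents share the same reward, as the feature set is scored jointly.
        for agent, a in zip(q, chosen):
            agent[a] += lr * (reward - agent[a])
    return [max(agent, key=agent.get) for agent in q]
```

Calling `train(features, target)` with `features` as a list of columns returns one action name per feature. A real pipeline would replace `proxy_score` with the cross-validated score of a downstream model.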