Toward Efficient Automated Feature Engineering

Automated Feature Engineering (AFE) refers to automatically generating and selecting optimal feature sets for downstream tasks, and it has achieved great success in real-world applications. Current AFE methods mainly focus on improving the effectiveness of the produced features but ignore the low efficiency that hinders large-scale deployment. Therefore, in this work, we propose a generic framework to improve the efficiency of AFE. Specifically, we construct the AFE pipeline in a reinforcement learning setting, where each feature is assigned an agent that performs feature transformation and selection, and the evaluation score of the produced features in downstream tasks serves as the reward to update the policy. We improve the efficiency of AFE from two perspectives. On the one hand, we develop a Feature Pre-Evaluation (FPE) model to reduce the sample size and feature size, the two main factors that undermine the efficiency of feature evaluation. On the other hand, we devise a two-stage policy training strategy that runs FPE on the pre-evaluation task to initialize the policy, which avoids training the policy from scratch. We conduct comprehensive experiments on 36 datasets covering both classification and regression tasks. The results show 2.9% higher performance on average and 2x higher computational efficiency compared with state-of-the-art AFE methods.
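The reward loop described in the abstract can be illustrated with a toy sketch. It is not the paper's implementation: a simple epsilon-greedy bandit per feature stands in for the learned policy, the mean absolute Pearson correlation with the target stands in for a downstream model score, and only row subsampling is shown (the FPE model and two-stage training are not reproduced). All names here (`ACTIONS`, `evaluate`, `run_afe`) and all parameter values are hypothetical.

```python
import math
import random

# Hypothetical action set: each per-feature agent picks one transformation,
# or "drop" to deselect the feature.
ACTIONS = {
    "identity": lambda v: v,
    "square": lambda v: v * v,
    "log1p_abs": lambda v: math.log1p(abs(v)),
    "drop": None,
}


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def evaluate(columns, target, choice, rows):
    """Stand-in for downstream evaluation (an assumption, not the paper's
    scorer): mean |correlation| of the kept, transformed features with the
    target, computed only on the given row subsample."""
    y = [target[r] for r in rows]
    scores = []
    for col, name in zip(columns, choice):
        fn = ACTIONS[name]
        if fn is None:  # feature was dropped
            continue
        scores.append(abs(pearson([fn(col[r]) for r in rows], y)))
    return sum(scores) / len(scores) if scores else 0.0


def run_afe(columns, target, episodes=200, sample_size=50, epsilon=0.2, seed=0):
    """One epsilon-greedy agent per feature; the shared evaluation score is
    the reward used to update every agent's action-value estimate."""
    rng = random.Random(seed)
    names = list(ACTIONS)
    q = [{a: 0.0 for a in names} for _ in columns]  # value estimates
    n = [{a: 0 for a in names} for _ in columns]    # visit counts
    for _ in range(episodes):
        # Evaluate on a row subsample to keep each episode cheap.
        rows = rng.sample(range(len(target)), min(sample_size, len(target)))
        choice = [
            rng.choice(names) if rng.random() < epsilon else max(names, key=qi.get)
            for qi in q
        ]
        reward = evaluate(columns, target, choice, rows)
        for i, a in enumerate(choice):
            n[i][a] += 1
            q[i][a] += (reward - q[i][a]) / n[i][a]  # incremental mean
    return [max(names, key=qi.get) for qi in q]
```

Calling `run_afe(columns, target)` with `columns` as a list of per-feature value lists returns one chosen action name per feature. Because the reward is shared across agents, credit assignment in this sketch is noisy; it is meant only to make the agent, action, and reward roles concrete.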
