DIFER: Differentiable Automated Feature Engineering

Feature engineering, a crucial step of machine learning, aims to extract useful features from raw data to improve data quality. In recent years, great efforts have been devoted to Automated Feature Engineering (AutoFE) to replace expensive human labor. However, existing methods are computationally demanding because they treat AutoFE as a coarse-grained black-box optimization problem over a discrete space. In this work, we propose an efficient gradient-based method called DIFER to perform differentiable automated feature engineering in a continuous vector space. DIFER selects candidate features with an evolutionary algorithm and uses an encoder-predictor-decoder controller to optimize existing features. The encoder maps features into the continuous vector space, the embedding is then optimized along the gradient direction induced by the predicted score, and the decoder recovers better features from the optimized embedding. Extensive experiments on classification and regression datasets demonstrate that DIFER can significantly improve the performance of various machine learning algorithms and outperform current state-of-the-art AutoFE methods in both efficiency and performance.
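The gradient step on the embedding can be illustrated with a minimal sketch. It is not the DIFER implementation: the encoder and decoder are omitted, the predictor is a toy one-hidden-layer function with random weights, and every name here (`predict_score`, `score_gradient`, `ascend`, `step_size`) is a hypothetical choice made for illustration. It only shows the idea of moving an embedding along the gradient of a predicted score.

```python
import math
import random


def hidden_units(e, W):
    """Hidden activations tanh(W e) of the toy predictor (assumed form)."""
    return [math.tanh(sum(w * x for w, x in zip(row, e))) for row in W]


def predict_score(e, W, v):
    """Toy predictor: score = v . tanh(W e). A stand-in, not DIFER's predictor."""
    return sum(v_i * h_i for v_i, h_i in zip(v, hidden_units(e, W)))


def score_gradient(e, W, v):
    """Analytic gradient of predict_score with respect to the embedding e."""
    grad = [0.0] * len(e)
    for row, v_i, h_i in zip(W, v, hidden_units(e, W)):
        coeff = v_i * (1.0 - h_i * h_i)  # derivative of tanh is 1 - tanh^2
        for j, w_ij in enumerate(row):
            grad[j] += coeff * w_ij
    return grad


def ascend(e, W, v, step_size=0.02, steps=20):
    """Move the embedding along the gradient of the predicted score."""
    for _ in range(steps):
        g = score_gradient(e, W, v)
        e = [x + step_size * g_j for x, g_j in zip(e, g)]
    return e


# Hypothetical setup: random predictor weights and a random "encoded" feature.
rng = random.Random(0)
dim, hidden_dim = 4, 8
W = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(hidden_dim)]
v = [rng.uniform(-1, 1) for _ in range(hidden_dim)]
embedding = [rng.uniform(-1, 1) for _ in range(dim)]

before = predict_score(embedding, W, v)
optimized = ascend(embedding, W, v)
after = predict_score(optimized, W, v)
print(f"predicted score before: {before:.4f}, after: {after:.4f}")
```

In a full pipeline of the kind the abstract describes, `embedding` would come from the encoder and `optimized` would be passed to the decoder to recover a feature; both steps are left out of this sketch.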

