Feature engineering has become one of the most important steps for improving model prediction performance and producing high-quality datasets. However, it requires non-trivial domain knowledge and is time-consuming. Automating this process has therefore become an active area of research and is of growing interest in industrial applications. In this paper, we propose a novel method called Meta-learning and Causality Based Feature Engineering (MACFE), which builds on meta-learning, feature distribution encoding, and causality-based feature selection. In MACFE, meta-learning is used to find the best transformations, and the search is accelerated by pre-selecting "original" features according to their causal relevance. Experimental evaluations on popular classification datasets show that MACFE improves prediction performance across eight classifiers, outperforms current state-of-the-art methods by at least 6.54% on average, and obtains an improvement of 2.71% over the best previous works.
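To make the two stages described above concrete, the following is a minimal sketch, not the authors' implementation. It rests on several assumptions that do not come from the paper: absolute Pearson correlation stands in for the causal relevance score (correlation is not a causal criterion and is used here only as a placeholder), a normalized histogram stands in for the feature distribution encoding, and a tiny hand-written `META_BASE` queried by nearest neighbour stands in for the learned meta-model. All function names and the set of transformations are hypothetical.

```python
import math
from statistics import mean, pstdev


def pearson(xs, ys):
    """Population Pearson correlation; returns 0.0 for a constant input."""
    mx, my = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    cov = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    return cov / (sx * sy)


def preselect(features, target, k):
    """Keep the k features most related to the target.

    Placeholder: |Pearson correlation| replaces the causal relevance
    score mentioned in the abstract.
    """
    ranked = sorted(
        features,
        key=lambda name: abs(pearson(features[name], target)),
        reverse=True,
    )
    return ranked[:k]


def encode(values, bins=4):
    """Encode a feature as a fixed-length normalized histogram (assumed encoding)."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0
    counts = [0] * bins
    for v in values:
        idx = min(int((v - lo) / span * bins), bins - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]


# Hypothetical candidate transformations.
TRANSFORMS = {
    "identity": lambda v: v,
    "log": lambda v: math.log1p(abs(v)),
    "square": lambda v: v * v,
}

# Hypothetical meta-knowledge: (distribution encoding, transformation that
# worked well for features shaped like this on past datasets).
META_BASE = [
    ([0.25, 0.25, 0.25, 0.25], "identity"),
    ([0.70, 0.20, 0.05, 0.05], "log"),
    ([0.05, 0.05, 0.20, 0.70], "square"),
]


def recommend(values):
    """Return the transformation of the nearest encoding in META_BASE."""
    enc = encode(values)

    def squared_distance(entry):
        return sum((a - b) ** 2 for a, b in zip(enc, entry[0]))

    return min(META_BASE, key=squared_distance)[1]


def engineer(features, target, k):
    """Add recommended transformed features for the k pre-selected features."""
    out = dict(features)
    for name in preselect(features, target, k):
        choice = recommend(features[name])
        if choice != "identity":
            out[f"{choice}({name})"] = [TRANSFORMS[choice](v) for v in features[name]]
    return out


if __name__ == "__main__":
    features = {
        "skewed": [1, 2, 2, 3, 3, 4, 50, 100],
        "noise": [5, 1, 4, 2, 5, 1, 4, 2],
    }
    target = [0, 0, 0, 0, 0, 0, 1, 1]
    print(sorted(engineer(features, target, k=1)))
```

Pre-selecting before recommending means transformations are only generated for features that pass the relevance filter, which is how the sketch mirrors the search-space reduction the abstract describes.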