Feature transformation for AI is an essential task for boosting the effectiveness and interpretability of machine learning (ML). It aims to transform the original data into an optimal feature space that improves the performance of a downstream ML model. Existing studies either combine preprocessing, feature selection, and feature generation techniques to transform data empirically, or automate feature transformation with machine intelligence such as reinforcement learning. However, existing studies suffer from: 1) a high-dimensional, non-discriminative feature space; 2) an inability to represent complex situational states; 3) inefficiency in integrating local and global feature information. To fill this research gap, we formulate feature transformation as an iterative, nested process of feature generation and selection, where feature generation creates and adds new features based on the original features, and feature selection removes redundant features to control the size of the feature space. Finally, we present extensive experiments and case studies that demonstrate a 24.7\% improvement in F1 score over state-of-the-art (SOTA) methods, as well as robustness on high-dimensional data.
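To make the nested generate-then-select loop concrete, the following is a minimal sketch and not the method proposed here. It assumes a fixed set of binary operations (`+` and `*`) for generation and a greedy top-k filter on absolute Pearson correlation with the target for selection, whereas the formulation above leaves both choices to a learned policy. All names (`generate`, `select`, `transform`, `max_features`) are hypothetical and chosen for illustration only.

```python
from itertools import combinations
from math import sqrt

# Assumed operation set for illustration; the abstract does not specify one.
OPS = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}


def abs_corr(xs, ys):
    """Absolute Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) * (x - mx) for x in xs)
    vy = sum((y - my) * (y - my) for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return abs(cov) / sqrt(vx * vy)


def generate(features):
    """Feature generation: combine every pair of current features with each op."""
    new = {}
    for (name_a, a), (name_b, b) in combinations(features.items(), 2):
        for symbol, op in OPS.items():
            new[f"({name_a}{symbol}{name_b})"] = [op(x, y) for x, y in zip(a, b)]
    return new


def select(features, target, max_features):
    """Feature selection: keep the max_features columns most correlated with target."""
    ranked = sorted(features, key=lambda k: abs_corr(features[k], target), reverse=True)
    return {name: features[name] for name in ranked[:max_features]}


def transform(features, target, rounds=2, max_features=4):
    """Iterate generation and selection so the feature space stays bounded."""
    current = dict(features)
    for _ in range(rounds):
        current.update(generate(current))                 # grow the feature space
        current = select(current, target, max_features)   # prune it back
    return current


if __name__ == "__main__":
    # Toy data: the target is the product of the two original features.
    x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
    x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
    y = [a * b for a, b in zip(x1, x2)]
    print(sorted(transform({"x1": x1, "x2": x2}, y)))
```

In this toy setting the selection step bounds the feature space at `max_features` columns after every round, which is the role the abstract assigns to feature selection. A correlation filter is only a stand-in: it ignores redundancy among the kept features and any interaction with the downstream model.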