AEFE: Automatic Embedded Feature Engineering for Categorical Features


October 19, 2021


Zhenyuan Zhong, Jie Yang, Yacong Ma, Shoubin Dong, Jinlong Hu

From arXiv: 24 pages, 6 figures, 13 tables

A central challenge in e-commerce data mining tasks such as recommender systems (RS) and click-through rate (CTR) prediction is how to make inferences by constructing combinatorial features from a large number of categorical features while preserving the interpretability of the method. In this paper, we propose Automatic Embedded Feature Engineering (AEFE), an automatic feature engineering framework for representing categorical features. It consists of several components, including custom paradigm feature construction and multiple feature selection. By intelligently selecting potential field pairs and generating a series of interpretable combinatorial features, our framework provides a set of previously unseen generated features that enhance model performance and help data analysts discover feature importance for particular data mining tasks. Furthermore, AEFE is implemented in a distributed manner using task parallelism, data sampling, and a search schema based on Matrix Factorization field combination, which improves the efficiency and scalability of the framework. Experiments on several typical e-commerce datasets indicate that our method outperforms classical machine learning models and state-of-the-art deep learning models.
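To make the idea of a combinatorial feature built from a pair of categorical fields concrete, here is a minimal Python sketch. It is not the AEFE implementation: the function names (`cross_features`, `count_encode`), the toy click log, and the choice of frequency encoding are all assumptions made for illustration, and the abstract does not specify which construction paradigms the framework actually uses.

```python
from collections import Counter
from itertools import combinations


def cross_features(rows, fields):
    """Add one combined feature per pair of categorical fields.

    Hypothetical helper: the new column "AxB" holds the string "a|b",
    so each generated value stays human-readable.
    """
    crossed = []
    for row in rows:
        new_row = dict(row)
        for a, b in combinations(fields, 2):
            new_row[f"{a}x{b}"] = f"{row[a]}|{row[b]}"
        crossed.append(new_row)
    return crossed


def count_encode(rows, column):
    """Encode each category by how often it occurs in the data.

    Frequency encoding is an assumption here, chosen because it is
    simple and easy to interpret.
    """
    counts = Counter(row[column] for row in rows)
    return [counts[row[column]] for row in rows]


# Toy click log with three categorical fields (made-up data).
rows = [
    {"user": "u1", "item": "i1", "device": "ios"},
    {"user": "u1", "item": "i2", "device": "ios"},
    {"user": "u2", "item": "i1", "device": "android"},
    {"user": "u1", "item": "i1", "device": "ios"},
]

crossed = cross_features(rows, ["user", "item", "device"])
user_item_counts = count_encode(crossed, "userxitem")
print(crossed[0]["userxitem"], user_item_counts)
```

With many fields the number of pairs grows quadratically, which is presumably why the abstract describes selecting field pairs and pruning generated features rather than enumerating every combination as this sketch does.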


