Although neural networks have achieved enormous breakthroughs in various supervised-learning domains (e.g., computer vision), they have thus far trailed the performance of GBDTs on tabular data. Delving into this issue, we identify proper handling of feature interactions and feature embedding as crucial to the success of neural networks on tabular data. We develop a novel neural network, ExcelFormer, which alternates two attention modules that respectively perform careful feature interactions and feature-embedding updates. A bespoke training methodology is jointly introduced to boost model performance. By initializing parameters with minuscule values, these attention modules are attenuated at the start of training; the effects of feature interactions and embedding updates then progressively grow to optimal levels as training proceeds, guided by the proposed regularization approaches Swap-Mix and Hidden-Mix. Experiments on 25 public tabular datasets show that ExcelFormer outperforms extensively tuned GBDTs, an unprecedented achievement for neural networks in supervised tabular learning. The code is available at https://github.com/WhatAShot/ExcelFormer.
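To make the Hidden-Mix idea concrete, here is a minimal NumPy sketch of an element-wise embedding-mixing regularizer of this kind. The function name `hidden_mix`, the `mix_prob` parameter, and the exact label-mixing rule are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def hidden_mix(h, y, mix_prob=0.5, rng=None):
    """Sketch of a Hidden-Mix-style regularizer: swap individual
    hidden-embedding entries between each sample and a randomly chosen
    partner in the batch, mixing labels by the fraction of entries kept."""
    rng = np.random.default_rng(rng)
    perm = rng.permutation(len(h))           # partner sample for each row
    mask = rng.random(h.shape) < mix_prob    # entries kept from the sample itself
    h_mixed = np.where(mask, h, h[perm])     # remaining entries come from partner
    lam = mask.mean(axis=1, keepdims=True)   # per-sample fraction kept
    y_mixed = lam * y + (1 - lam) * y[perm]  # soft labels mixed by the same ratio
    return h_mixed, y_mixed

# usage: mix a batch of 4 samples with 3-dimensional hidden embeddings
h = np.arange(12, dtype=float).reshape(4, 3)
y = np.array([[0.0], [1.0], [0.0], [1.0]])
h_mixed, y_mixed = hidden_mix(h, y, mix_prob=0.5, rng=0)
```

Swap-Mix would apply the analogous swap at the raw-feature level rather than on hidden embeddings; in both cases the mixing strength can be annealed over training so the regularization effect grows gradually.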