Tabular data are ubiquitous in real-world applications. Although the machine learning community has developed many widely used neural components (e.g., convolution) and extensible neural networks (e.g., ResNet), few of them are effective on tabular data, and few designs are adequately tailored to tabular data structures. In this paper, we propose a novel and flexible neural component for tabular data, called the Abstract Layer (AbstLay), which learns to explicitly group correlated input features and generate higher-level features for semantic abstraction. We also design a structure re-parameterization method to compress AbstLay, reducing the computational complexity by a clear margin in the inference phase. We build a special basic block from AbstLays and construct a family of Deep Abstract Networks (DANets) for tabular data classification and regression by stacking such blocks. In DANets, a special shortcut path is introduced to fetch information from the raw tabular features, assisting feature interactions across different levels. Comprehensive experiments on seven real-world tabular datasets show that AbstLay and DANets are effective for tabular data classification and regression, and that their computational complexity is lower than that of competing methods. In addition, we evaluate the performance gains of DANets as they go deeper, verifying the extensibility of our method. Our code is available at https://github.com/WhatAShot/DANet.
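To make the idea of structure re-parameterization concrete, here is a minimal NumPy sketch of one common form of it: folding a batch-normalization step into the preceding linear layer, so that inference needs a single affine map. This illustrates the general technique only and is not taken from the DANet code; the abstract does not specify how AbstLay is compressed, and the function name `fold_batchnorm_into_linear` and all shapes below are assumptions made for the example.

```python
import numpy as np

EPS = 1e-5  # numerical-stability constant, as used by typical batch-norm layers


def fold_batchnorm_into_linear(W, b, gamma, beta, mean, var, eps=EPS):
    """Merge y = BN(x @ W.T + b) into one affine map y = x @ W_f.T + b_f.

    Hypothetical helper for illustration; not the authors' implementation.
    W has shape (n_out, n_in); all other arguments have shape (n_out,).
    """
    scale = gamma / np.sqrt(var + eps)   # per-output-feature BN scale
    W_f = W * scale[:, None]             # scale each output row of W
    b_f = (b - mean) * scale + beta      # fold the BN shift into the bias
    return W_f, b_f


rng = np.random.default_rng(0)
n_in, n_out = 8, 4

# Randomly generated "trained" parameters (assumed values, for the demo only).
W = rng.normal(size=(n_out, n_in))
b = rng.normal(size=n_out)
gamma = rng.normal(size=n_out)
beta = rng.normal(size=n_out)
mean = rng.normal(size=n_out)
var = rng.uniform(0.5, 2.0, size=n_out)

x = rng.normal(size=(16, n_in))  # a batch of 16 tabular rows

# Training-time structure: linear layer followed by batch norm (running stats).
z = x @ W.T + b
y_ref = gamma * (z - mean) / np.sqrt(var + EPS) + beta

# Inference-time structure: a single fused linear layer.
W_f, b_f = fold_batchnorm_into_linear(W, b, gamma, beta, mean, var)
y_fused = x @ W_f.T + b_f

max_abs_diff = float(np.max(np.abs(y_ref - y_fused)))
```

The fused layer gives the same outputs as the two-step version up to floating-point error, while skipping the separate normalization pass at inference time. Savings of this kind are what re-parameterization methods generally aim for.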