Relational databases are the de facto standard for storing and querying structured data, and extracting insights from structured data requires advanced analytics. Deep neural networks (DNNs) have achieved super-human prediction performance on particular data types, e.g., images. However, existing DNNs may not produce meaningful results when applied to structured data. The reason is that there are correlations and dependencies across combinations of attribute values in a table, and these do not follow simple additive patterns that a DNN can easily mimic. The number of possible cross features is combinatorial, which makes modeling all of them computationally prohibitive. Furthermore, the deployment of learning models in real-world applications has highlighted the need for interpretability, especially in high-stakes applications, and this remains another concern for DNNs. In this paper, we present ARM-Net, an adaptive relation modeling network tailored for structured data, and ARMOR, a lightweight framework based on ARM-Net for relational data analytics. The key idea is to model feature interactions with cross features selectively and dynamically, by first transforming the input features into exponential space, and then determining the interaction order and interaction weights adaptively for each cross feature. We propose a novel sparse attention mechanism that dynamically generates the interaction weights given the input tuple, so that cross features of arbitrary orders can be modeled explicitly while noisy features are filtered out selectively. During model inference, ARM-Net can then identify the cross features used for each prediction, which yields higher accuracy and better interpretability. Our extensive experiments on real-world datasets demonstrate that ARM-Net consistently outperforms existing models and provides more interpretable predictions for data-driven decision making.
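The key idea above can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes scalar, strictly positive feature values (so the logarithm is defined), uses sparsemax as a stand-in for the sparse attention mechanism (the abstract does not specify which one is used), and scores each field with a plain dot product. The names `sparsemax`, `cross_feature`, `adaptive_cross_feature`, `query`, `keys`, and `orders` are all hypothetical.

```python
import math


def sparsemax(scores):
    """Project scores onto the probability simplex.

    Unlike softmax, low-scoring entries get a weight of exactly 0,
    which is what filters out noisy features in this sketch.
    """
    ranked = sorted(scores, reverse=True)
    cumsum = 0.0
    tau = 0.0
    for k, z_k in enumerate(ranked, start=1):
        cumsum += z_k
        # The support condition holds for a prefix of the ranked scores;
        # the last k that satisfies it determines the threshold tau.
        if 1 + k * z_k > cumsum:
            tau = (cumsum - 1) / k
    return [max(s - tau, 0.0) for s in scores]


def cross_feature(features, weights):
    """Exponential-space cross feature: exp(sum_i w_i * log x_i) = prod_i x_i ** w_i."""
    return math.exp(sum(w * math.log(x) for x, w in zip(features, weights)))


def adaptive_cross_feature(features, query, keys, orders):
    """Build one cross feature whose weights depend on the input tuple.

    query  : hypothetical learned vector for this cross feature
    keys   : one vector per input field, derived from the input tuple
    orders : hypothetical learned per-field scaling of the interaction order
    """
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    attention = sparsemax(scores)
    weights = [a * o for a, o in zip(attention, orders)]
    return cross_feature(features, weights), weights


value, weights = adaptive_cross_feature(
    features=[2.0, 3.0, 5.0],
    query=[1.0, 0.0],
    keys=[[0.5, 0.0], [0.4, 0.0], [-1.0, 0.0]],
    orders=[2.0, 2.0, 2.0],
)
print(value, weights)
```

In this sketch, a field whose attention weight is zero contributes a factor of `x ** 0 = 1` and drops out of the cross feature, so the non-zero entries of `weights` show which fields, and at what order, were used for a given input. That is one way to read the interpretability claim, under the assumptions stated above.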