Developing neural architectures capable of logical reasoning has become increasingly important for a wide range of applications (e.g., natural language processing). Toward this grand objective, we first propose a symbolic reasoning architecture that chains FOET, which is particularly useful for modeling natural language. To endow it with a differentiable learning capability, we closely examine various neural operators for approximating the symbolic join-chains. Interestingly, we find that the widely used multi-head self-attention module in the transformer can be understood as a special neural operator that implements the union bound of the join operator in a probabilistic predicate space. Our analysis not only provides a new perspective on the mechanism of pretrained models such as BERT for natural language understanding, but also suggests several important directions for future improvement.
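The claim above concerns the standard multi-head self-attention module. As a point of reference, the following is a minimal NumPy sketch of that module (all variable names and dimensions are illustrative, not taken from the paper). The detail relevant to the probabilistic reading is that each head's attention matrix is row-stochastic, i.e., every row is a probability distribution over tokens:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(X, Wq, Wk, Wv, Wo, n_heads):
    """Standard multi-head self-attention.

    X: (seq_len, d_model); Wq, Wk, Wv, Wo: (d_model, d_model).
    Returns the output and the per-head attention matrices.
    """
    seq_len, d_model = X.shape
    d_head = d_model // n_heads

    # Project the input and split it into heads: (n_heads, seq_len, d_head).
    def project(W):
        return (X @ W).reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)

    Q, K, V = project(Wq), project(Wk), project(Wv)

    # Scaled dot-product attention per head; A[h] is row-stochastic,
    # so each row defines a probability distribution over tokens.
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_head)  # (h, s, s)
    A = softmax(scores, axis=-1)
    heads = A @ V                                        # (h, s, d_head)

    # Concatenate heads and apply the output projection.
    concat = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
    return concat @ Wo, A

# Illustrative usage with random weights.
rng = np.random.default_rng(0)
s, d, h = 5, 8, 2
X = rng.normal(size=(s, d))
Ws = [rng.normal(size=(d, d)) for _ in range(4)]
out, A = multi_head_self_attention(X, *Ws, n_heads=h)
```

Under the paper's view, each head's row-stochastic matrix `A[h]` can be read as a (soft) predicate over token pairs, and summing head outputs before the output projection acts like a union-bound-style aggregation across joins.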