We introduce an architecture, the Tensor Product Recurrent Network (TPRN). In our application of TPRN, internal representations learned by end-to-end optimization in a deep neural network performing a textual question-answering (QA) task can be interpreted using basic concepts from linguistic theory. No performance penalty need be paid for this increased interpretability: the proposed model performs comparably to a state-of-the-art system on the SQuAD QA task. The internal representation which is interpreted is a Tensor Product Representation: for each input word, the model selects a symbol to encode the word, and a role in which to place the symbol, and binds the two together. The selection is via soft attention. The overall interpretation is built from interpretations of the symbols, as recruited by the trained model, and interpretations of the roles as used by the model. We find support for our initial hypothesis that symbols can be interpreted as lexical-semantic word meanings, while roles can be interpreted as approximations of grammatical roles (or categories) such as subject, wh-word, determiner, etc. Fine-grained analysis reveals specific correspondences between the learned roles and parts of speech as assigned by a standard tagger (Toutanova et al. 2003), and finds several discrepancies in the model's favor. In this sense, the model learns significant aspects of grammar, after having been exposed solely to linguistically unannotated text, questions, and answers: no prior linguistic knowledge is given to the model. What is given is the means to build representations using symbols and roles, with an inductive bias favoring use of these in an approximately discrete manner.
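The binding mechanism described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the dimensions, embedding matrices, and attention logits are all hypothetical placeholders. For each word, soft attention over a symbol inventory and a role inventory yields a (soft) symbol vector and role vector, which are bound by an outer product; the sequence's Tensor Product Representation is the sum of the per-word bindings.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax: attention weights over an inventory
    e = np.exp(x - x.max())
    return e / e.sum()

# Hypothetical sizes: nS symbols of dim dS, nR roles of dim dR
rng = np.random.default_rng(0)
nS, dS, nR, dR = 10, 8, 5, 6
S = rng.standard_normal((nS, dS))   # symbol embedding matrix (learned in the real model)
R = rng.standard_normal((nR, dR))   # role embedding matrix (learned in the real model)

def bind(symbol_logits, role_logits):
    """Select a symbol and a role via soft attention, then bind them."""
    a_s = softmax(symbol_logits)    # attention over symbols
    a_r = softmax(role_logits)      # attention over roles
    s = a_s @ S                     # soft-selected symbol vector
    r = a_r @ R                     # soft-selected role vector
    return np.outer(s, r)           # binding: a dS x dR tensor

# One binding per input word; the TPR of the sequence is their sum.
sym_logits = rng.standard_normal((3, nS))   # placeholder per-word logits
role_logits = rng.standard_normal((3, nR))
tpr = sum(bind(ls, lr) for ls, lr in zip(sym_logits, role_logits))
assert tpr.shape == (dS, dR)
```

An inductive bias toward near-discrete use of symbols and roles, as the abstract notes, corresponds to pushing each attention distribution toward a one-hot vector, so that each binding approximates a single symbol placed in a single role.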