In this work, we describe a full-stack pipeline for natural language processing on near-term quantum computers, also known as QNLP. The language-modelling framework we employ is that of compositional distributional semantics (DisCoCat), which extends and complements the compositional structure of pregroup grammars. Within this model, the grammatical reduction of a sentence is interpreted as a diagram that encodes a specific interaction of words according to the grammar. It is this interaction which, together with a specific choice of word embedding, realises the meaning (or "semantics") of a sentence. Building on the formal quantum-like nature of such interactions, we present a method for mapping DisCoCat diagrams to quantum circuits. Our methodology is compatible both with NISQ devices and with established Quantum Machine Learning techniques, paving the way to near-term applications of quantum technology to natural language processing.
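To make the notion of a grammatical reduction concrete, the following is a minimal sketch of a pregroup type reduction in plain Python. It is an illustration under stated assumptions, not the pipeline described in this work: the toy lexicon, the names `reduce_types` and `sentence_type`, and the greedy left-to-right strategy are all hypothetical choices made for the example. Each cancellation performed here corresponds to one wire connecting two words in the sentence diagram.

```python
from typing import List, Tuple

# A simple type is (base, z): a base symbol plus an adjoint order z.
# z = 0 is the plain type, z = -1 a left adjoint (x^l), z = +1 a right adjoint (x^r).
SimpleType = Tuple[str, int]

N: SimpleType = ("n", 0)  # noun
S: SimpleType = ("s", 0)  # sentence

# Hypothetical toy lexicon: a transitive verb has type n^r . s . n^l.
lexicon = {
    "Alice": [N],
    "loves": [("n", 1), S, ("n", -1)],
    "Bob": [N],
}


def sentence_type(words: List[str]) -> List[SimpleType]:
    """Concatenate the types of the words, in order."""
    return [t for w in words for t in lexicon[w]]


def reduce_types(types: List[SimpleType]) -> List[SimpleType]:
    """Greedy stack-based contraction (an assumption of this sketch).

    Adjacent pairs (x, z), (x, z + 1) cancel, which covers both
    x . x^r -> 1 and x^l . x -> 1. A greedy pass is enough for this toy
    example but is not a complete pregroup parser in general.
    """
    stack: List[SimpleType] = []
    for base, z in types:
        if stack and stack[-1][0] == base and stack[-1][1] + 1 == z:
            stack.pop()  # contraction: one wire of the diagram
        else:
            stack.append((base, z))
    return stack


result = reduce_types(sentence_type(["Alice", "loves", "Bob"]))
print(result)  # a grammatical sentence reduces to the single type s
```

In this sketch, "Alice loves Bob" has type n · n^r · s · n^l · n, and the two cancellations leave only s, which is what marks the word sequence as a grammatical sentence.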