This paper describes experiments showing that some tasks in natural language processing (NLP) can already be performed using quantum computers, though so far only with small datasets. We demonstrate several approaches to topic classification. The first is an explicit word-based approach, in which word-topic scoring weights are implemented as fractional rotations of individual qubits, and a new phrase is classified based on the accumulation of these weights in a scoring qubit using entangling controlled-NOT gates. This is compared with more scalable quantum encodings of word embedding vectors, which are used to compute kernel values in a quantum support vector machine: this approach achieved an average of 62% accuracy on classification tasks involving over 10,000 words, which is the largest such quantum computing experiment to date. We also describe a quantum probability approach to bigram modeling that can be applied to sequences of words and formal concepts, and investigate a generative approximation to these distributions using a quantum circuit Born machine. We further describe an approach to ambiguity resolution in verb-noun composition that uses single-qubit rotations for simple nouns and 2-qubit controlled-NOT gates for simple verbs. The smaller systems described have been run successfully on physical quantum computers, and the larger ones have been simulated. We show that statistically meaningful results can be obtained using real datasets, but that real data is much more difficult to predict than the easier artificial language examples used previously in developing quantum NLP systems. Other approaches to quantum NLP are compared, partly with respect to contemporary issues including informal language, fluency, and truthfulness.
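As an illustration of the word-based scoring idea, the following is a minimal sketch rather than the circuit used in the experiments. It assumes that each word gets its own qubit, that a hypothetical weight in [0, 1] is mapped to that fraction of a pi rotation about the Y axis, and that every word qubit controls a CNOT onto a single scoring qubit. The function name `score_probability` and the weight-to-angle mapping are assumptions made for this example. The state is simulated directly with NumPy, so no quantum SDK is needed.

```python
import numpy as np


def ry(theta):
    """Single-qubit rotation about the Y axis."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])


def score_probability(weights):
    """Probability of measuring 1 on the scoring qubit.

    weights: hypothetical word-topic weights in [0, 1]. Each weight w is
    mapped to an RY rotation by w * pi on its own word qubit (an assumption
    of this sketch, not a mapping taken from the paper).
    """
    n = len(weights)
    zero = np.array([1.0, 0.0])

    # Product state of the rotated word qubits, followed by the scoring
    # qubit in |0> as the last (least significant) qubit.
    state = np.array([1.0])
    for w in weights:
        state = np.kron(state, ry(w * np.pi) @ zero)
    state = np.kron(state, zero).reshape([2] * (n + 1))

    # CNOT from each word qubit (control) onto the scoring qubit (target):
    # where the control is 1, swap the two scoring-qubit amplitudes.
    for q in range(n):
        idx = [slice(None)] * (n + 1)
        idx[q] = 1
        flipped = state[tuple(idx)][..., ::-1].copy()
        state[tuple(idx)] = flipped

    # Total probability of all basis states with the scoring qubit set to 1.
    return float(np.sum(np.abs(state[..., 1]) ** 2))


if __name__ == "__main__":
    # Hypothetical weights for a three-word phrase.
    print(score_probability([0.2, 0.1, 0.05]))
```

In this sketch the scoring qubit records the parity of the word qubits, so its probability of reading 1 is (1 - ∏ cos θ_i) / 2 for rotation angles θ_i. For small rotations this is approximately the sum of the per-word contributions sin²(θ_i / 2), which is one way that weights can accumulate in a single qubit.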