This paper concerns the structure of meanings within natural language. Earlier, a framework named DisCoCirc was sketched that (1) is compositional and distributional (a.k.a. vectorial); (2) applies to general text; (3) captures linguistic `connections' between meanings (cf. grammar); (4) updates word meanings as text progresses; (5) structures sentence types; (6) accommodates ambiguity. Here, we realise DisCoCirc for a substantial fragment of English. When passing to DisCoCirc's text circuits, some `grammatical bureaucracy' is eliminated; that is, DisCoCirc displays a significant degree of (7) inter- and intra-language independence: for example, independence from word-order conventions that differ across languages, and independence from choices like many short sentences vs. few long sentences. This inter-language independence means our text circuits should carry over to other languages, unlike the language-specific typings of categorial grammars. Hence, text circuits are a lean structure for the `actual substance of text', that is, the inner workings of meanings within text across several layers of expressiveness (cf. words, sentences, text), and may capture that which is truly universal beneath grammar. The elimination of grammatical bureaucracy also explains why DisCoCirc (8) applies beyond language, e.g. to spatial, visual and other cognitive modes. While humans could not verbally communicate in terms of text circuits, machines can. We first define a `hybrid grammar' for a fragment of English, i.e. a purpose-built, minimal grammatical formalism needed to obtain text circuits. We then detail a translation process such that all text generated by this grammar yields a text circuit. Conversely, for any text circuit obtained by freely composing the generators, there exists a text (with hybrid grammar) that gives rise to it. Hence: (9) text circuits are generative for text.