We study the dynamical properties of a Hopf algebra Markov chain whose state space is the set of binary rooted forests with labelled leaves. This Markovian dynamical system describes the core computational process of structure formation and transformation in syntax via the Merge operation, according to Chomsky's Minimalism model of generative linguistics. The dynamics decomposes into two parts. The action of Internal Merge gives an ergodic dynamical system with uniform stationary distribution, while the contributions of External Merge and (a minimal form of) Sideward Merge reduce to a simpler Markov chain with combinatorial weights, whose state space is the set of partitions. The Sideward Merge part of the dynamics prevents convergence to fully formed connected structures (trees), unless the different forms of Merge are weighted by a cost function, as predicted by linguistic theory. In this weighted case, we obtain results on the asymptotic behavior of the Perron-Frobenius eigenvalue and eigenvector in terms of an associated Perron-Frobenius problem in the tropical semiring. These results show that the usual cost functions proposed in the linguistic literature (Minimal Search and Resource Restrictions) do not suffice to obtain convergence to the tree structures, while an additional optimization property based on the Shannon entropy achieves the expected result for the dynamics. We also comment on the introduction of continuous parameters related to semantic embedding and other computational models, on the filtering of the dynamics by coloring rules that model the linguistic filtering by theta roles and phase structure, and on parametric variation and the process of parameter setting in Externalization.
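The link between a weighted Perron-Frobenius problem and its tropical counterpart can be illustrated with a generic toy computation. The sketch below is not the construction used for the Merge dynamics: the 3x3 `cost` matrix is made up, the weighting `exp(-beta * cost)` is an assumed convention, and the helper names `min_cycle_mean` and `log_perron_root` are hypothetical. It relies only on the standard fact that, for an irreducible nonnegative matrix with entries `exp(-beta * c_ij)`, the quantity `-log(rho(beta)) / beta` lies between `mu - log(n) / beta` and `mu`, where `mu` is the minimum mean cost over cycles (the eigenvalue in the min-plus semiring), so it tends to `mu` as `beta` grows.

```python
import math
from itertools import permutations


def min_cycle_mean(cost):
    """Min-plus (tropical) eigenvalue: minimum mean cost over simple cycles.

    Brute force over all simple cycles, so only suitable for tiny matrices.
    """
    n = len(cost)
    best = math.inf
    for k in range(1, n + 1):
        for cyc in permutations(range(n), k):
            if cyc[0] != min(cyc):
                continue  # keep one representative per rotation of the cycle
            total = sum(cost[cyc[i]][cyc[(i + 1) % k]] for i in range(k))
            best = min(best, total / k)
    return best


def log_perron_root(a, squarings=40):
    """Log of the spectral radius via Gelfand's formula rho = lim ||A^k||^(1/k).

    Uses k = 2**squarings, reached by repeated squaring with rescaling so that
    entries neither overflow nor underflow.
    """
    n = len(a)
    m = [row[:] for row in a]
    log_scale = 0.0  # invariant: A^(2^j) = exp(log_scale) * m
    for _ in range(squarings):
        s = max(max(row) for row in m)
        m = [[x / s for x in row] for row in m]
        log_scale = 2.0 * (log_scale + math.log(s))
        m = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
    top = max(max(row) for row in m)
    return (log_scale + math.log(top)) / 2 ** squarings


# Hypothetical cost matrix, chosen only for illustration.
cost = [[2.0, 1.0, 3.0],
        [1.0, 4.0, 1.0],
        [3.0, 2.0, 2.0]]

mu = min_cycle_mean(cost)  # attained by the 2-cycle 0 -> 1 -> 0
for beta in (1.0, 5.0, 20.0):
    a = [[math.exp(-beta * c) for c in row] for row in cost]
    print(beta, -log_perron_root(a) / beta, mu)
```

Repeated squaring is used here instead of plain power iteration because, in this toy example, the cheapest cycle has length two, which makes the second eigenvalue nearly as large in modulus as the Perron root and slows power iteration considerably at large `beta`.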