The syntactic behaviour of texts can vary considerably depending on their context (e.g. author or genre). From the standpoint of stylometry, it can be helpful to measure this behaviour objectively. In this paper, we discuss how coalgebras can be used to formalise the notion of behaviour by embedding syntactic features of a given text into probabilistic transition systems. By introducing the behavioural distance, we are then able to quantitatively measure differences between points in these systems and thus compare features of different texts. Furthermore, the behavioural distance between points can be approximated by a polynomial-time algorithm.
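To make the idea of embedding syntactic features into a probabilistic transition system more concrete, the following is a minimal sketch, not the construction used in the paper. It assumes that a text has already been reduced to a sequence of part-of-speech tags and that a first-order model over those tags is adequate. The function names `transition_system` and `total_variation`, the tag names, and the two toy sequences are hypothetical. The total variation distance shown here compares only one-step successor distributions; it is a crude stand-in and is not the behavioural distance discussed above.

```python
from collections import Counter, defaultdict


def transition_system(tags):
    """Estimate first-order transition probabilities from a tag sequence.

    Returns a dict mapping each state (tag) to a probability
    distribution over its successor tags.
    """
    counts = defaultdict(Counter)
    for current, following in zip(tags, tags[1:]):
        counts[current][following] += 1
    return {
        state: {nxt: n / sum(successors.values()) for nxt, n in successors.items()}
        for state, successors in counts.items()
    }


def total_variation(p, q):
    """Total variation distance between two finite distributions.

    Assumption: used only as a simple one-step comparison, not as the
    behavioural distance from the paper.
    """
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)


# Hypothetical tag sequences standing in for two short texts.
text_a = ["DET", "NOUN", "VERB", "DET", "NOUN"]
text_b = ["DET", "ADJ", "NOUN", "VERB", "DET", "NOUN"]

system_a = transition_system(text_a)
system_b = transition_system(text_b)

# Compare how the state "DET" behaves in the two systems.
det_gap = total_variation(system_a["DET"], system_b["DET"])
```

In this toy setting, `system_a["DET"]` always moves to `NOUN`, whereas `system_b["DET"]` splits evenly between `ADJ` and `NOUN`. A behavioural distance would additionally account for how the successor states themselves differ, which this one-step comparison ignores.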