We propose a model for the evolution of a matrix along a phylogenetic tree, in which transformations affect either entire rows or entire columns of the matrix. The model captures change in both the lexical and the phonological aspects of linguistic data: new words can appear, and systematic phonological changes can affect the entire vocabulary. We implement a Sequential Monte Carlo method to sample from the posterior distribution and jointly infer the phylogeny, the model parameters, and the latent variables representing cognate births and phonological transformations. We successfully apply this method to synthetic and real data sets of moderate size.