In this short note we explore what is needed for the unsupervised training of graph language models based on link grammars. First, we introduce the termination tags formalism required to build a language model based on the link grammar formalism of Sleator and Temperley [21] and discuss the influence of context on the unsupervised learning of link grammars. Second, we propose a statistical link grammar formalism that allows for statistical language generation. Third, based on the above formalism, we show that the classical dissertation of Yuret [25] on the discovery of linguistic relations using lexical attraction ignores contextual properties of language, and thus an approach to unsupervised language learning that relies on bigrams alone is flawed. This correlates well with the unimpressive results obtained when graph language models are trained without supervision using Yuret's bigram approach.
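Yuret's lexical attraction is, in essence, the pointwise mutual information of a word pair estimated from bigram counts. The Python sketch below (illustrative names and the simplest unsmoothed estimator; not taken from the note itself) shows the statistic in its barest form and makes the limitation visible: the score of a pair depends only on the pair's own counts, with no reference to the surrounding context.

```python
from collections import Counter
from math import log2

def lexical_attraction(tokens):
    """Estimate lexical attraction (pointwise mutual information) for
    adjacent word pairs from a token stream.  A minimal sketch of the
    bigram statistic discussed above; smoothing and thresholds omitted."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    pmi = {}
    for (w1, w2), count in bigrams.items():
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        pmi[(w1, w2)] = log2(p_pair / (p_w1 * p_w2))
    return pmi

# The statistic sees only the word pair itself, not the wider sentence,
# which is the contextual blindness argued in this note.
print(lexical_attraction("the cat sat on the mat".split()))
```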