In this paper, we investigated two methods for parsing relative and noun complement clauses in English, assigning distinct tags to "that" as a relative pronoun and as a complementizer. In the first experiment, we used an algorithm to relabel the GUM Treebank, a corpus annotated with Universal Dependencies. In the second, we used TreeTagger, a probabilistic decision-tree tagger, to learn the distinction between the complement and relative uses of postnominal "that". We investigated how the size of the training set affects TreeTagger's accuracy and how representative the GUM Treebank files are of the two structures under scrutiny. Finally, we discussed some of the linguistic and structural underpinnings of the learnability of this distinction.