In this paper, we discuss the development of treebanks for two low-resource Indian languages, Magahi and Braj, based on the Universal Dependencies framework. The Magahi treebank contains 945 sentences and the Braj treebank around 500 sentences, each annotated with lemmas, part-of-speech tags, morphological features, and universal dependencies. We describe the dependency relations found in the two languages and report statistics for both treebanks. The dataset will be made publicly available in the Universal Dependencies (UD) repository (https://github.com/UniversalDependencies/UD_Magahi-MGTB/tree/master) in the next (v2.10) release.