The prevailing practice in academia is to evaluate model performance on in-domain evaluation data, typically set aside from the training corpus. However, in many real-world applications, the data to which the model is applied may differ very substantially in its characteristics from the training data. In this paper, we focus on Finnish out-of-domain parsing by introducing a novel out-of-domain treebank, UD Finnish-OOD, which covers five very distinct data sources (web documents, clinical, online discussions, tweets, and poetry) and comprises a total of 19,382 syntactic words in 2,122 sentences, released under the Universal Dependencies framework. Together with the new treebank, we present an extensive out-of-domain parsing evaluation that uses the section-level information available in three different Finnish UD treebanks (TDT, PUD, OOD). Compared to the previously existing treebanks, the new Finnish-OOD is shown to include sections that are more challenging for the general parser, creating an interesting evaluation setting and yielding valuable information for those applying the parser outside of its training domain.