We present RuSemShift, a large-scale manually annotated test set for semantic change modeling in Russian, covering two long-term time period pairs: from the pre-Soviet to the Soviet era, and from the Soviet to the post-Soviet era. Target words were annotated by multiple crowdsourced workers. The annotation process followed the DURel framework and was based on sentence contexts extracted from the Russian National Corpus. Additionally, we report the performance of several distributional approaches on RuSemShift; the results are promising but leave room for other researchers to improve on them.