Language documentation is a critical aspect of language preservation and often includes the creation of Interlinear Glossed Text (IGT). Creating IGT is time-consuming and tedious, so automating the process can save valuable annotator effort. This paper describes the baseline system for the SIGMORPHON 2023 Shared Task on Interlinear Glossing. Our system uses a transformer architecture and treats gloss generation as a sequence labelling task.
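To make the sequence labelling framing concrete, the sketch below pairs each morpheme of a segmented line with exactly one gloss label, which is the kind of aligned input/output a tagger would be trained on. It is only an illustration under stated assumptions: the function name `to_labelling_example`, the hyphen-delimited segmentation, and the Spanish example line are hypothetical and are not taken from the shared task data or the baseline code, and the transformer itself is not shown.

```python
from typing import List, Tuple


def to_labelling_example(segmented: str, glosses: str) -> List[Tuple[str, str]]:
    """Align each morpheme with its gloss: one label per input unit.

    Assumes words are separated by spaces and morphemes by hyphens,
    in both the segmented line and the gloss line (an assumption made
    for this sketch, not a claim about the task's data format).
    """
    src_words = segmented.split()
    gloss_words = glosses.split()
    if len(src_words) != len(gloss_words):
        raise ValueError("segmented line and gloss line differ in word count")

    pairs: List[Tuple[str, str]] = []
    for word, gloss in zip(src_words, gloss_words):
        morphemes = word.split("-")
        labels = gloss.split("-")
        if len(morphemes) != len(labels):
            raise ValueError(f"morpheme/gloss mismatch in {word!r} vs {gloss!r}")
        # Each morpheme receives exactly one label, as in any tagging task.
        pairs.extend(zip(morphemes, labels))
    return pairs


# Made-up example line, for illustration only.
example = to_labelling_example("los gato-s corr-en", "the cat-PL run-3PL")
for morpheme, label in example:
    print(f"{morpheme}\t{label}")
```

Under this framing the model never generates free text: it predicts one label per input position, so the output length always matches the input length.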